
Election Disinformation in Different Languages is a Big Problem in the U.S.

And it’s driving a wedge between voters in non-English communities

Mis- and disinformation about elections predate the endless scroll of modern social media services.[1] Yet easy access to online information channels and amplification tools enable false narratives to spread at a massive scale.

When false narratives are combined with data voids and unique cultural contexts, people with malicious intentions are able to expose voters to disinformation with precision and wide-ranging effects. Communities of color and those that don’t speak English as their primary language sit at this intersection.

The Census Bureau projects that by 2050, white Americans will no longer constitute a majority in the U.S. This demographic shift will likely bring shifts in speech, language, and culture as well. (These changes are already underway: in the past 40 years, the use of a primary language other than English in the U.S. has increased by 194 percent.) The priorities of the U.S. media and information environment will change with them. The same demographic shifts have also animated racist and nativist electoral strategies, along with disinformation intentionally designed to mislead.

This complicates the study of electoral disinformation while also highlighting its urgency. As the U.S. midterm elections near, many stakeholders are working to counter the disparate impacts of election disinformation, from Congressional committees holding hearings and roundtables to national advocacy groups calling on social media companies to increase transparency around content policy enforcement. Existing studies point to three main factors that make non-English online election disinformation particularly prevalent.

1. Data voids — a lack of easily available truthful information — create an opening for disinformation.

Local media and voting information are hard to find in the U.S. in languages other than English. As a result, data voids emerge both online and off: correct information is scarce, but demand for election information is high. In the absence of accessible, verified, and accurate information, people turn to popular social media platforms, where false information in their languages proliferates.

High rates of participation on social media platforms create openings for mis- and disinformation narratives to gain steam. Research conducted by Equis Labs found that Latinx adults use YouTube as a news source at twice the rate of non-Latinx adults. And WhatsApp, along with other private messaging platforms, is often the main venue for news and political discussion in Asian and other immigrant communities.

Presence on these platforms, coupled with design features that enable mass sharing (like the forward function on WhatsApp), compounds the network effects of false narratives. The Latino Anti-Disinformation Lab found that 66% of Latinx respondents whose primary language is Spanish said they received false and harmful information about the COVID-19 vaccine through messaging apps.

2. Platforms may not enforce their content policies equally across languages, allowing more non-English disinformation to circulate online.

Ahead of the U.S. midterm elections, social media companies have unveiled plans to counter election disinformation by investing in new technology to detect disinformation and “pre-bunk” false narratives in multiple languages. They are also introducing warning labels and other content-level interventions. Yet platforms fall short in enforcing these plans equitably. The lack of meaningful transparency into how companies enforce their policies, along with troubling examples of disinformation left up on platforms, continues to fuel skepticism about how they deploy resources across different language communities. In 2020, Avaaz found that Facebook failed to issue warning labels on 70% of misinformation in Spanish, compared to only 29% in English.

Recent research published by companies like Meta and Google offers a glimpse into their plans to tackle non-English disinformation using automated content analysis tools. Meta has recently deployed a “few-shot learning” tool to identify harmful content in more than 100 languages, and claims this tool can pick out new strains of disinformation with only a handful of examples to learn from, or even none at all. YouTube also uses machine learning to train its recommendation systems to detect harmful misinformation and reduce recommendations of potentially harmful content in non-English markets.
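
Neither company has published details of these systems, but the general technique is reproducible with open-source tools. The sketch below is a minimal, hypothetical analogue, not Meta’s or YouTube’s actual system: it uses the Hugging Face transformers library’s zero-shot classification pipeline with a publicly available multilingual NLI model to score posts in different languages against policy-style labels, with no task-specific training examples. The model choice, example posts, and labels are all illustrative assumptions.

```python
# Minimal sketch of zero-shot multilingual classification, assuming the
# open-source Hugging Face `transformers` library. This is NOT Meta's or
# YouTube's actual system; it only illustrates the general technique of
# classifying text without task-specific training examples.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",  # public multilingual NLI model
)

# Hypothetical posts: the same false "wrong election day" claim in two languages.
posts = [
    "La votación es el miércoles, no el martes.",   # Spanish
    "Ang botohan ay sa Miyerkules, hindi Martes.",  # Tagalog
]

# Candidate labels stand in for policy categories; the model has never seen
# labeled training examples of either, which is the "zero-shot" idea.
labels = ["election misinformation", "ordinary political speech"]

for post in posts:
    result = classifier(post, candidate_labels=labels)
    print(f"{result['labels'][0]:>28} ({result['scores'][0]:.2f}) | {post}")
```

In practice, scores like these would only rank posts for human review or down-ranking, and, as the next paragraphs note, they degrade quickly for slang, regional variation, and low-resource languages.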

Platforms operate at a scale that precludes human moderators from reviewing every post for disinformation, but their automated tools are not a panacea. CDT and many others have documented the limits of these automated tools in handling the complexities of human speech.

These tools may perform especially poorly in languages like Spanish and Arabic, which have significant geographic and regional variation, or in languages like Haitian Creole and Tagalog, which are less well-resourced: there are far fewer written, digitized documents in these languages than in English, making it much harder to assemble the data sets needed to train automated systems. Indigenous languages are even more sparsely represented online, in some cases with no digital presence at all.

Just as with English, non-English language disinformation can also be context-specific, which enables it to circumvent automated systems. For example, posts that contain false information and refer to the COVID vaccine as “la puya” (which means “jab” or “poke” in some contexts) rather than “la vacuna” (which means “vaccine”) are more likely to go undetected, because automated systems can’t always capture slang.
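
To make this failure mode concrete, here is a toy keyword filter in Python. Everything in it is hypothetical, and real moderation systems are far more sophisticated, but the underlying gap is the same: pattern matching against known terms misses an identical false claim phrased in slang.

```python
# Toy illustration of why slang evades keyword-based detection.
# The term list and posts are hypothetical examples, not real filter rules.
FLAGGED_TERMS = {"la vacuna"}  # the filter only knows the standard term

posts = [
    "La vacuna contiene un microchip",  # false claim using the standard term
    "La puya contiene un microchip",    # same false claim, phrased in slang
]

for post in posts:
    flagged = any(term in post.lower() for term in FLAGGED_TERMS)
    print(f"{'FLAGGED' if flagged else 'missed '} | {post}")
```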

Groups assert that companies’ lack of cultural competency and familiarity with non-English languages increases the likelihood that these posts will slip past human-review systems, and makes at-scale policy enforcement difficult. In a forthcoming report, CDT will dive into these questions further, examining the capabilities and limitations of automated content analysis systems deployed to analyze non-English content.

3. Micro-targeting on language, race, ethnicity, income, and other proxies enables disinformation to drive a wedge between voters.

A recent piece by Morgan Jerkins in Mother Jones points out the ways in which bad actors, both domestic and foreign, leverage existing fault lines of racism and division to target individuals with disinformation. Jerkins makes the critical point that false information targeting individuals along the lines of race, gender, class, and other identity markers is able to proliferate and gain a foothold because of the country’s history of racism and division.

Research conducted by groups like Stop Online Violence Against Women shows how influence operations targeting Black voters ahead of the 2020 election used African American English with the intent to split the vote and disenfranchise Black voters. African American English, a language distinct from ‘standard English’, gave bad actors a blatant way to masquerade as community insiders, take advantage of data voids, and circumvent content moderation efforts.

Similarly, the Asian American Disinformation Table highlights the ways bad actors target Asian communities with in-group language and references to drive wedges between Asian voters and other communities, sow division among these voters, and ultimately disenfranchise them. Understanding in-group language fluency and representation is therefore critical to countering these disenfranchisement efforts.

One solution to help slow the spread of targeted disinformation is to create friction that limits bad actors’ ability to leverage micro-targeting tools to reach particularly vulnerable audiences. Congress can do this by passing a federal privacy law to curb the mass data collection and retention at the core of micro-targeted messages online. In the absence of federal privacy protections, targeted disinformation can be bespoke, or what the Brennan Center calls “a lie just for you”: something tailored and targeted at audiences in the middle of data voids.

How do we address the problem of non-English language election disinformation?

A number of groups are employing key strategies and tactics to tackle the problem.

Monitoring and tracking examples of disinformation

Several groups in the election integrity ecosystem have begun monitoring projects to track the rampant spread of election disinformation in their communities, including the Election Integrity Partnership, the Disinfo Defense League, and the Latino Anti-Disinformation Lab. Groups like Common Cause have also opened hotlines at ReportDisinfo.org where individuals can report disinformation they encounter in Spanish, Haitian Creole, Arabic, and Asian and Pacific Islander languages. Asian Americans Advancing Justice (AAJC) and Equality Labs are among the groups that co-steer the Asian American Disinformation Table, and the National Association of Latino Elected and Appointed Officials (NALEO) has recently launched Defiende La Verdad (“Defend the Truth” in Spanish), a campaign to counter disinformation. Through these efforts, groups at the forefront of the fight against false narratives can move quickly to identify new and existing disinformation narratives and counter them with accurate information delivered through trusted messengers.

Pushing for meaningful transparency and equal policy enforcement

Groups like Common Cause, Equis Labs, and the Asian American Disinformation Table (among others) have also called on platforms to hire more staff with the cultural competency required to understand the nuances of non-English language disinformation. The second iteration of the Santa Clara Principles, which CDT helped draft, explicitly guides companies to take cultural competency and language capabilities into consideration when enforcing their policies. Platforms can and should include information about the rates of policy enforcement across different languages in their transparency reports in order to boost meaningful transparency and trust.

Conducting more research into the disparate impacts of election disinformation

Reports published by groups like Equis Labs, the Asian American Disinformation Table, and First Draft News provide a vivid portrayal of the disparate and discriminatory impacts of electoral disinformation. Yet research on the effects of disinformation remains predominantly Anglo-centric, and more work is needed to expand our collective understanding of the broader impacts of election disinformation.

Advocates have also called for the research community to diversify and more adequately represent the communities affected by particular strains of election disinformation across class, race, migration history, language spoken, and more. CDT has done work on this topic, including publishing a research agenda for understanding the impacts of disinformation along race and gender lines, and recommending concrete steps to give researchers, civil society organizations, and journalists greater access to data for studying the impacts of online disinformation on different communities.

CDT joins many of the above-mentioned groups in supporting comprehensive federal privacy legislation such as the bipartisan American Data Privacy and Protection Act (ADPPA). Stemming the flow of private data to bad actors is a crucial first step in limiting their ability to target false and dangerous information at specific individuals based on immutable characteristics.


[1] For the purposes of this blog post, we use the term “disinformation” as a catchall for the different types of false, misleading, and deceptive narratives that circulate around election periods and are targeted at individuals. Other groups have proposed the terms mis-, dis-, and mal-information to capture intent. You can read more about these terms here: https://www.mediadefence.org/ereader/publications/introductory-modules-on-digital-rights-and-freedom-of-expression-online/module-8-false-news-misinformation-and-propaganda/misinformation-disinformation-and-mal-information/