
Investigating Content Moderation Systems in the Global South

Every day, millions of people post content online on various platforms. Sometimes they share joyful life events; at other times, they share their saddest and most vulnerable moments. Some of this content can also be misleading, spammy, or illegal. For years, tech companies, which serve a diverse global audience speaking many languages and dialects, have struggled to moderate content at the scale of their services. In response to immense pressure to promptly remove content that violates laws or their own policies, tech companies have expanded their human-led and automated moderation systems. However, the vast majority of these efforts have been directed toward English-language content, driven primarily by cost considerations, technical constraints, and a lack of motivation. As a result, non-English content is disproportionately subject to “inconsistent moderation.”

In previous work, CDT has examined the ways tech companies design models to analyze text across languages. Building on this, CDT is launching a new project focused on content moderation policies and practices in the Global South. This research will critically examine how content moderation systems operate in non-English contexts, particularly in indigenous and other languages of the Majority World (i.e., the Global South). Using a mixed-methods approach that combines qualitative and quantitative methods, the project will address the question: How do content moderation systems across various platforms operate in indigenous and other language contexts of the Global South? To answer this question, we will examine content moderation in languages from three regions: (1) South America, (2) Africa, and (3) South Asia.

Over the years, evidence has shown that the policies and practices tech companies use to moderate non-English content can impede individuals from speaking freely or accessing information in their native language. For example, according to one whistleblower, Facebook allocates 87% of its spending on misinformation countermeasures to English-language content, even though only 9% of its users are English speakers. That disparity is even more pronounced for low-resource languages — those characterized by limited availability and poor quality of datasets for training language models. This lack of resources, combined with inadequate moderation practices, has inadvertently resulted in the suspension of user accounts and/or content, the supercharging of hate speech, and the proliferation of misleading content in the Global South. It is therefore critically important to explore the ways tech companies moderate non-English languages.

Content moderation encompasses multiple interrelated processes: definition, detection, evaluation, enforcement, appeal, and education. In the definition stage, companies set policies that determine what user-generated content is and is not permitted on the service. A combination of proactive and reactive methods is then used to detect impermissible content, which is evaluated by human moderators, automated systems, or a combination of both. Enforcement actions are then taken against violating content, and users are given an opportunity to appeal those decisions. Lastly, tech companies educate their users about their content moderation policies and the ways in which those policies are enforced. To implement these processes effectively, tech companies must invest in the requisite capabilities and capacity across their human-led and automated systems.
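
To make these stages concrete, here is a minimal sketch, in Python, of how the definition, detection, evaluation, enforcement, and appeal steps might fit together. It is purely illustrative: the policy keywords, confidence threshold, and function names are hypothetical and do not describe any particular platform's system.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REMOVE = "remove"
    ESCALATE = "escalate"  # route to a human moderator for review


@dataclass
class Post:
    post_id: str
    text: str
    language: str


# Definition stage: hypothetical policy keywords per language (real policies
# are far more detailed and context-dependent than a keyword list).
POLICY_KEYWORDS = {
    "en": {"buy followers now"},   # e.g., a spam phrase
    "sw": {"nunua wafuasi sasa"},  # rough Kiswahili equivalent, for illustration
}


def detect(post: Post) -> bool:
    """Detection stage: a keyword match stands in for classifiers and user reports."""
    keywords = POLICY_KEYWORDS.get(post.language, set())
    return any(k in post.text.lower() for k in keywords)


def evaluate(post: Post, automated_confidence: float) -> Decision:
    """Evaluation stage: remove high-confidence violations, escalate uncertain ones."""
    if not detect(post):
        return Decision.ALLOW
    return Decision.REMOVE if automated_confidence >= 0.9 else Decision.ESCALATE


def enforce(post: Post, decision: Decision) -> str:
    """Enforcement, appeal, and education: act, then tell the user why and how to appeal."""
    if decision is Decision.REMOVE:
        return f"Post {post.post_id} removed under policy; you may appeal this decision."
    if decision is Decision.ESCALATE:
        return f"Post {post.post_id} sent to a human moderator for review."
    return f"Post {post.post_id} allowed."


if __name__ == "__main__":
    post = Post("123", "Nunua wafuasi sasa!", "sw")
    print(enforce(post, evaluate(post, automated_confidence=0.95)))
```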

In the case of human moderation, tech companies tend to outsource the work to third parties in order to manage different dialects within a language. Dozens of reports have shown that, in many cases, human moderators are inexperienced, untrained, underpaid, and working in inhumane conditions. In Kenya, for example, OpenAI worked with “Sama” — a third-party vendor in Nairobi that provides content moderation services for several tech companies across Africa — which paid human moderators less than two dollars (USD) per hour to evaluate datasets that were later used to train ChatGPT. The decisions made by human moderators, regardless of their accuracy, are used to train automated content moderation systems.

“Automated” content moderation is the primary method of moderation for a number of social media companies. Automated tools use techniques such as keyword filters, spam detection tools, and hash-matching algorithms. In addition to the data labeled by human moderators, these automated tools are trained on enormous amounts of text data available online. Tech companies have invested millions of dollars in building large language models (LLMs) that can be used to detect violating content. However, many languages have a lower volume and poorer quality of text data, creating a “resourcedness gap” that complicates LLM training. Previous research by CDT found that tech companies have attempted to address this issue by building multilingual language models, scaling their AI systems to support multiple languages. These multilingual models are trained simultaneously on text data from dozens or hundreds of languages. Some of the data for low-resource languages (such as Kiswahili and Burmese) is translated (or mistranslated) or scraped from low-quality sources, leading once again to ineffective automated moderation.
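
As a rough illustration of two of the simpler techniques named above, the sketch below implements a keyword filter and exact hash matching against a set of previously identified items. The blocklist and hash set are invented for illustration, and real systems typically rely on trained classifiers and perceptual hashing (so that near-duplicates also match) rather than the exact matching shown here.

```python
import hashlib
import re

# Hypothetical blocklist of spam phrases; real keyword lists are maintained
# per language and per policy area, which is part of why coverage is uneven
# across languages.
SPAM_PHRASES = {"free crypto giveaway", "click here to win"}

# Hypothetical set of SHA-256 hashes of previously removed files. Industry
# systems typically use perceptual hashes (e.g., PhotoDNA or PDQ) so that
# near-duplicates also match; exact hashing is shown here only for simplicity.
KNOWN_VIOLATING_HASHES = {
    "5f6fd0b3cba3c0d0ec7a6e1a2b3c4d5e6f708192a3b4c5d6e7f8091a2b3c4d5e",
}


def keyword_filter(text: str) -> bool:
    """Return True if the text contains any blocklisted phrase."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return any(phrase in normalized for phrase in SPAM_PHRASES)


def hash_match(file_bytes: bytes) -> bool:
    """Return True if the file's hash matches a previously identified item."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return digest in KNOWN_VIOLATING_HASHES


if __name__ == "__main__":
    print(keyword_filter("FREE crypto   giveaway, click the link!"))  # True
    print(hash_match(b"example uploaded file contents"))              # False
```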

This project contributes to the existing body of literature on platform governance and accountability, shedding light on under-explored non-English languages in the Global South. The outputs of this project will be relevant to multiple audiences: policy-makers; companies providing online services that use content moderation and the third parties that develop those systems; researchers working on LLMs and related technologies that aim to support analysis of user-generated content; and end-users of select online services.