Context Before Code: Meta’s Oversight Board Policy Advisory Opinion on the Word “Shaheed” Calls for Language and Cultural Nuance in Content Moderation

“Shaheed” (شهيد) is one of the names of Allah in Islam and a term used across languages including Arabic, Urdu, Hindi, Farsi, and Turkish. “Shaheed” carries multiple meanings, including “witness” and “martyr,” and has many variations. It also appears in the Quran over thirty times and is used by the news media to refer to victims of political assassinations. According to the long-awaited Policy Advisory Opinion published by the Meta Oversight Board last week, it is also the most moderated word on Meta’s platforms.

The Board’s opinion tackles Meta’s moderation of “shaheed” in reference to Meta’s Dangerous Organizations and Individuals (DOI) list, which has previously been criticized for over-representing Arabs and Muslims. Meta removes content praising any of the entities or individuals on the list, and its policies reportedly result in the suppression of neutral conversations or informational discussions involving any of those listed. After nearly one year (a delay owing in part to the October 7 attacks in Israel and the ongoing war in Gaza), the Oversight Board concludes that a poor transliteration of “shaheed” from Arabic to English and a set of opaque internal processes have led to a “blanket ban” on the term across Meta’s services, without disclosure to the Arabic speakers who use the platform. That ban has resulted in an overbroad, substantial, and disproportionate restriction of an entire community’s freedom of expression, exacerbating distrust and reinforcing biased perceptions.

The Board makes a number of key recommendations for Meta, articulating the need for more clarity and transparency around the platform’s DOI policy, its enforcement, and the use of automated content moderation tools. The Opinion lays out steps for Meta to reverse the blanket ban on “shaheed”; to update the internal guidance provided to moderation teams, making clear that use of “shaheed” violates company policy only when it is accompanied by signals of violence, and that even then content may be important to keep up because the term is used in news reporting; and to periodically audit the company’s DOI policies and enforcement channels. The Opinion also calls for greater transparency to users, enabling them to know what is and is not permitted, to make more informed decisions about how to use the platform, and to assert their rights should they want to appeal a decision. The Board’s decision, however, is neither binding nor immediately enforced; Meta has three months to respond.
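
To make the recommended change concrete, the minimal Python sketch below contrasts the blanket ban the Board describes with the signal-gated rule it recommends. The function names, the list of violence signals, and the matching logic are illustrative assumptions, not Meta’s actual implementation.

```python
# Illustrative sketch: a blanket keyword ban versus the context-sensitive rule
# the Board recommends (remove "shaheed" only when it co-occurs with signals
# of violence). All names and signals here are hypothetical.

VIOLENCE_SIGNALS = {"attack", "bomb", "kill"}  # stand-in, not Meta's real list

def blanket_ban(text: str) -> bool:
    """The behavior the Opinion criticizes: any use of the term is removed."""
    return "shaheed" in text.lower().split()

def signal_gated(text: str) -> bool:
    """The recommended behavior: remove only if violent signals also appear."""
    tokens = set(text.lower().split())
    return "shaheed" in tokens and bool(tokens & VIOLENCE_SIGNALS)

posts = [
    "the journalist was mourned as a shaheed",        # neutral / reporting
    "praise the shaheed who carried out the attack",  # violent signal present
]
for post in posts:
    print(f"blanket={blanket_ban(post)} gated={signal_gated(post)} | {post}")
```

Even under the gated rule, the Board notes that content may still merit staying up when it is shared to report on events, a judgment no keyword logic alone can capture.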

This decision is in line with the globally developed Santa Clara Principles, which have long called on companies to take cultural and linguistic context into consideration when developing and enforcing content moderation policies. Comments submitted as part of the development of this Policy Advisory Opinion by over 100 organizations and individuals, including many Arabic-language experts, make clear that there is no direct correlation between all uses of the term “shaheed” and real-world harm. The findings lay bare the company’s incomplete understanding of the ways in which the Arabic-speaking community uses the term. This Opinion will serve as an important milestone if it spurs greater engagement with language speakers and experts around the world.

The Opinion also lays out steps for Meta to explain how its classifiers and automated tools generate predictions of policy violations, and asks the company to assess those classifiers. Such tools are likely to have more shortcomings in languages other than English, particularly in low-resource languages: languages with fewer high-quality digitized examples of text with which to train and evaluate models. While Arabic as a whole is a medium-resource language, some dialects of this polyglossic language, such as the Arabic spoken in the Maghreb, are considered lower-resource; we are studying Maghrebi Arabic as part of a CDT research project on content moderation systems in the Global South.
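
For readers unfamiliar with how such classifiers reach a decision, the sketch below shows the basic shape of the pipeline: a model assigns a post an estimated probability of violation, and a fixed threshold converts that score into a keep-or-remove action. The `score_violation` stub and the threshold value are assumptions for illustration; real systems run trained neural models and more elaborate logic.

```python
# Hedged sketch of how an automated moderation classifier typically turns text
# into an enforcement decision. score_violation() is a placeholder for a
# trained model; the scores and threshold below are invented.

THRESHOLD = 0.5  # decision boundary; where it sits drives over- or under-removal

def score_violation(text: str) -> float:
    """Placeholder for a trained classifier returning P(policy violation)."""
    # A real system would run a neural model here; we fabricate a score so the
    # sketch runs. A model trained on skewed data behaves much like this line:
    return 0.9 if "shaheed" in text.lower() else 0.1

def moderate(text: str) -> str:
    return "remove" if score_violation(text) >= THRESHOLD else "keep"

print(moderate("He was remembered as a shaheed by his community."))  # remove
print(moderate("The weather in Tunis is lovely today."))             # keep
```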

Academics have also argued that the Arabic data used to train automated systems is not representative of how Arabic speakers actually use the language. For example, if the training data is machine-translated, or depicts the term “shaheed” only in contexts of violence, then the model will develop an incomplete understanding of a word used in expansive and multifaceted ways. That, in turn, leads to Arabic speakers being routinely and disproportionately subjected to inconsistent moderation.
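
The argument can be demonstrated with a toy experiment. In the sketch below, a simple bag-of-words classifier is trained on a handful of synthetic examples in which “shaheed” appears only alongside violent language; it then flags plainly benign uses of the word. The data, labels, and model choice are ours, for illustration only, and bear no relation to Meta’s actual training corpora.

```python
# Toy experiment with synthetic data: when "shaheed" appears only in violent
# training examples, a simple classifier learns the word itself as a violation
# signal and misfires on benign uses.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = [
    "the shaheed carried out the attack",   # labeled violating
    "praise the shaheed of the bombing",    # labeled violating
    "lovely weather in the city today",     # labeled benign
    "the match ended in a draw",            # labeled benign
]
train_labels = [1, 1, 0, 0]  # 1 = violation, 0 = benign

vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(train_texts), train_labels)

# Benign uses of the term, absent from training, get flagged anyway.
tests = [
    "the journalist was mourned as a shaheed",  # obituary / reporting
    "shaheed is one of the names of allah",     # religious usage
]
print(clf.predict(vec.transform(tests)))  # expect [1 1]: two false positives
```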

Further, without adequate benchmarks across languages, we will never know whether these automated content moderation tools work equitably. Currently, large tech companies tout the performance of their multilingual tools yet disclose in fine print that they are tested entirely on American English prompts or on machine-translated ones. (Meta’s XLM-R, for example, is a model the company claims can moderate content in more than 160 languages, but the model is only tested on English benchmarks and machine-translated benchmarks in a handful of languages.) Both are inadequate proxies for gauging multilingual capability, especially in context-specific tasks such as content moderation.
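
A minimal sketch of the evaluation practice this critique implies: score a moderation model separately for each language, so that gaps in, say, Arabic dialects stay visible instead of being averaged away by English-only tests. The model class, labels, and examples below are placeholders, not any company’s real benchmark.

```python
# Sketch (hypothetical model and data): report accuracy per language rather
# than a single English-only or machine-translated aggregate.

from collections import defaultdict

def per_language_accuracy(model, examples):
    """examples: iterable of (text, language_code, gold_label) tuples."""
    correct, total = defaultdict(int), defaultdict(int)
    for text, lang, gold in examples:
        total[lang] += 1
        correct[lang] += int(model.predict(text) == gold)
    # A single aggregate number would hide exactly these per-language gaps.
    return {lang: correct[lang] / total[lang] for lang in total}

class StubModel:
    """Stand-in for a real multilingual classifier (e.g., one built on XLM-R)."""
    def predict(self, text: str) -> str:
        return "keep"  # trivially keeps everything, just to make this runnable

examples = [
    ("a benign post in english", "en", "keep"),
    ("content praising violence", "en", "remove"),
    ("منشور عادي بالعربية", "ar", "keep"),  # "an ordinary post in Arabic"
]
print(per_language_accuracy(StubModel(), examples))  # {'en': 0.5, 'ar': 1.0}
```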

CDT has called on companies to invest in digitizing low-resource languages, to create evaluation benchmarks for languages other than English, and to work with NLP communities with expertise in local languages and cultural contexts to do so. Future research and guidance from the Board should examine how and whether Meta is taking these steps, including in Arabic across its dialects. The Advisory Opinion should also be a call to others in the ecosystem to promote a virtuous cycle of investment in curating datasets for low-resource languages, training language models, and building tools.