Moderating Kiswahili Content on Social Media
Introduction
Africa, a continent with over 2,000 languages and home to more than one-third of the world’s linguistic diversity, has many languages that remain beyond the reach of both automated and human content moderation (Shiundu, 2023). Social media platforms have a limited physical presence in Africa, operating only a few offices and employing minimal staff (De Gregorio & Stremlau, 2023). Despite this, these companies have invested heavily in outsourcing content moderation labor to the continent, hiring vendors to recruit moderators who review content from both Africa and beyond. One of the few African languages benefiting from human moderation is Kiswahili, which is spoken by over 100 million people in East Africa and some parts of Central Africa. In this report, we investigate how the content moderation systems of select online platforms deal with user-generated content in Kiswahili.
This report is part of a series that examines content moderation within low-resource and indigenous languages in the Global South. “Low-resource” describes languages that lack sufficient high-quality training data, making it difficult to develop automated content moderation systems (Nicholas & Bhatia, 2023). In our previous research, we found that content moderation in North Africa, especially in the Maghreb region, suffered from significant biases and systemic inadequacies (Elswah, 2024). Content moderation systems for Maghrebi Arabic dialects are impacted by inadequate training data, which fail to capture the rich linguistic diversity of the region. Additionally, content moderators, who work under challenging conditions and are tasked with overseeing content from across the Arab world, struggle to make accurate decisions about dialects they often do not understand. This results in inaccuracies and inconsistencies in moderation practices, highlighting the urgent need for more inclusive and representative approaches to the moderation of low-resource languages in the Global South.
This report focuses on Kiswahili (also known as Swahili), a language that exists in many varieties across East Africa, in Kenya, Tanzania, Uganda, parts of the Democratic Republic of Congo, Burundi, and Rwanda, as well as in some parts of Central Africa (Topan, 2008). This report specifically concentrates on Kenya and Tanzania. We chose Tanzania because it has the largest Kiswahili-speaking population and is the birthplace of Standard Swahili. We selected Kenya because it is home to a significant number of Kiswahili speakers, ranking second only to Tanzania (Dzahene-Quarshie, 2009). Additionally, Kenya is recognized as the “Silicon Savannah” of Africa, a name that refers to its advanced digital transformation, its rapidly increasing internet connectivity, and its role as host to many companies and institutions involved in the development of digital technologies (Mwaura, 2023; Wahome, 2023).
Using a mixed-method approach that combines an online survey of 143 people who frequently use social media in Kiswahili and 23 in-depth interviews with content moderators, creators, and digital rights advocates from Kenya and Tanzania, we found that:
- According to our survey, Instagram is the most popular social media platform in Kenya and Tanzania. Additionally, TikTok’s popularity is rapidly growing in East Africa, surpassing that of Facebook.
- The spread of misinformation and hate speech online is a significant issue within the Kiswahili online sphere. The majority of our survey participants expressed concerns about the proliferation of misleading and inciting content on social media platforms.
- Popular social media platforms take three general approaches to Kiswahili content moderation: global, local, and multi-country. The global approach, exemplified by Meta, involves applying uniform policies to all Kiswahili users indiscriminately. Meta also requires its Kiswahili moderators to review non-African English-language content from around the world. The local approach, employed by TikTok, tailors the enforcement of some of its policies to account for the diverse cultural contexts within East Africa. However, the variations in the Kiswahili language are overlooked because content moderation vendors hire primarily Kenyan moderators to review content from across East Africa. Many of these moderators may not be familiar with the specific contexts of other East African countries, which can lead to inadequate moderation. Lastly, the multi-country approach, utilized by the local Tanzanian platform “JamiiForums,” involves hiring native moderators from each Kiswahili-speaking country, who review content generated within their own nations. This ensures that the moderators understand the local context and cultural nuances, allowing them to provide more effective and relevant content moderation for users on JamiiForums.
- Content moderation vendors often downplay the harsh realities of the job by concealing the graphic content that moderators will encounter and avoiding any mention of it in job advertisements, interviews, and training sessions. Many moderators misunderstand the nature of the role, with some believing they will be content “creators.” Additionally, moderators are exposed to less graphic content during the short training period, which fails to prepare them for the often distressing content they will encounter in their daily work.
- Much of the content moderation is conducted by third-party vendors that are contracted by social media platforms and hire moderators on the platforms’ behalf. Companies in Nairobi, Kenya that provide Kiswahili content moderation services exclusively hire Kenyans to manage the diverse variations and contexts of Kiswahili content. This leads to many inaccuracies and inconsistencies in content evaluation.