Facebook’s Content Moderation Language Barrier

Article In The Thread
Sept. 8, 2021

Facebook’s content moderation has been under a magnifying glass as COVID-19 vaccine misinformation has spread throughout the United States. And to avoid triggering the platform’s algorithms, people are using coded language. Anti-vaccine Facebook groups are changing their names to “Dance Party” (40,000 members) or “Dinner Party” (20,000 members) and using the terms “danced” or “drank beer” to mean “vaccinated,” all in order to avoid being banned.

But in other parts of the world, people don’t have to invent codes; they can simply use their own language. Facebook’s human moderators and content-moderation algorithms are much less likely to detect misinformation in languages other than English. To understand the problem, look no further than Romania. There, the Roma people have been the main targets of online attacks, with many Romanians falsely accusing them of spreading COVID. Unless it fixes its language loophole, Facebook risks abetting the persecution of one of the world’s most marginalized ethnic groups.

With around 12 million people living across the continent, the Roma are the largest ethnic minority in Europe. They are also the most persecuted: they have been targets of extreme violence and social exclusion for centuries. Approximately 500,000 Roma people were murdered during the Holocaust, 80 percent of Roma currently live below the poverty line, and in June, a policeman in the Czech Republic kneeled on the neck of a Roma man who died shortly after. Activists dubbed the case “Czech Floyd.”

While the Roma have suffered offline oppression — segregation, extreme poverty, and lack of access to health services, work, and education — the presence of online discrimination and hate has continued to grow: “Until we are able to gas them like the Nazis, the Roma will infect the nation,” is a common sentiment expressed by Romanians on Facebook.

Roma activist Marian Mandache underlines that the migration of Roma people abroad from Romania has been an accelerator of offline and online hatred, especially because Romanians blame them for making the country “look bad.” “The waves of hate regarding the Roma migration in 2007–08 and 2010–11 worried us, because they have already made many victims,” Mandache explained. “Back then we saw programs implemented by the French and Italian governments for collective deportations and fingerprinting of Roma people.”

A recent event shines a light on the complicated space that Facebook occupies in Romania. Since the beginning of 2021, around 2,200 ethnic Roma from Romania have been apprehended crossing the border between Mexico and the United States as they sought political asylum. The news triggered a renewed wave of online abuse toward the Roma people.

Facebook users started posting pictures of crows (a racial slur associated with the Roma in Romania) in the comments of news stories, paired with outrageous racist and fascist speech. “I heard that Texans can have a gun without a permit. They have just started to receive targets for free. They are cheap and they move,” commented a user in Romanian.

“They need to be sent to the concentration camp until the 9th generation.” “Antonescu, I invoke you,” commented others, referencing former Romanian Prime Minister Marshal Ion Antonescu, who ordered the deportation of Roma people to Transnistria, where most of them died.

“The truth is that in the pandemic begging and pickpocketing weren’t possible anymore. You can’t really make profit from these activities when people stay home and you have strict rules of social distancing. Probably the government should compensate them for these losses,” jeered one user, to the delight of at least 125 readers.

Since the beginning of the pandemic, police violence towards Roma has increased and often gone viral online, with vocal supporters. “These guys need to be beaten, it’s the only thing they understand,” and “The cops should not be punished, they were doing their jobs,” wrote Facebook users.

Facebook is the most popular social media platform in Romania, where 79 percent of people use it weekly. And despite going against Facebook’s community standards, all of these hateful comments are still available on the social network.

Facebook has hate speech detection algorithms in more than 40 languages, such as English, Arabic, and Hindi, which are spoken by hundreds of millions of people. At the time this article was published, Facebook had not responded to my questions about whether it has hate speech detection algorithms and human moderators for Romanian, a language spoken by only some 20 million people.

Only some 20 million people — the relatively small number not only of Romanian speakers, but also Roma, might at least partly explain why anti-Roma sentiments remain so pervasive on Facebook in Romania.

A much smaller group like the Roma community has “far less power” in the international landscape, Spandana Singh, researcher of misinformation and disinformation at New America’s Open Technology Institute, told me. “Traditionally, groups that don’t have as much representation — especially in civil society — have fewer opportunities to engage with companies on a regular basis. As a result, companies are not always sure how to develop meaningful and impactful policies that address the issues that these communities face.”

But the fact that Romania is a small country with a small user base doesn’t mean that hateful content should be left up, adds Singh. “If companies want to operate in a global manner they need to have a responsibility to make sure that all of their users are safe, and not just people who speak the major languages of the world.”

Protecting societies’ most vulnerable groups is not impossible. Facebook hired 100 content moderators who speak Burmese after the outbreak of the Rohingya genocide in Myanmar and the escalation of online hate speech towards the Rohingya people. Facebook’s moderators developed a dataset which was later used by the company to train its algorithms to detect harmful speech. It’s time the company took the same approach to Romanian, and to other languages with “only” 20 million speakers.

Having just commemorated the European Holocaust Memorial Day for Sinti and Roma, we still see Facebook comments calling for a repeat of the half million Sinti and Roma deaths. We shouldn’t have to wait for another genocide for Facebook to take action.
