Case Study: Facebook

Out of the three platforms covered by case studies in this report, Facebook has by far the largest content moderation operation. It is also the platform that has come under the most scrutiny for its content moderation decision-making practices, both human and automated. One reason for this is that Facebook is one of the largest social media platforms in the world. It ranks third in global internet engagement after YouTube and Google.com,1 and the platform has over 2.38 billion monthly active users worldwide.2 As a result, Facebook’s content moderation practices affect a significant amount of user expression across the globe.

Facebook utilizes a centralized and hybrid approach to content moderation. In order to apply its rules consistently across the world, the company maintains a global set of Community Standards. These Community Standards are enforced by Facebook’s enormous global pool of human content moderators, who are part of the 30,000 people who work on safety and security for the platform.3 In an attempt to ensure that its policies can be localized and enforced appropriately in different regions and contexts, Facebook tries to hire moderators based on their language or regional expertise. All reviewers receive the same general training on the company’s Community Standards and how to enforce them, and some later develop specialties in sensitive content areas such as self-harm.4 Facebook has stated that its rules are structured to reduce bias and subjectivity so that reviewers can make consistent judgments on each case.5

In response to growing global pressure from governments and the public to take down violating content quickly, Facebook has invested heavily in automated tools for content moderation. These include image recognition and matching tools that identify and remove objectionable content such as terror-related content; natural language processing (NLP) and language matching tools that seek to recognize and learn from patterns in text related to topics such as propaganda and harm; and pattern identification tools, which seek to identify patterns of similar objectionable content across multiple Facebook pages or among individuals who post similar types of objectionable content. The platform has found that pattern detection is most effective for images, such as resized terror propaganda images, rather than text, because text can be more easily manipulated to evade detection and removal and requires greater contextual understanding to evaluate.6
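To make the match-and-action approach concrete, the sketch below shows how a newly uploaded image might be checked against a database of fingerprints of previously identified violating content. This is a simplified illustration rather than Facebook’s actual implementation: the hash function, database, and actions are assumptions made for the example, and production systems rely on perceptual hashes (rather than the cryptographic hash used here) so that resized or re-encoded copies still match.

```python
import hashlib

# Hypothetical database of fingerprints of previously identified violating images.
# Real systems use perceptual hashes (e.g., PhotoDNA-style fingerprints) rather than
# cryptographic hashes, so that resized or re-encoded copies still match.
KNOWN_VIOLATING_HASHES = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
}

def image_fingerprint(image_bytes: bytes) -> str:
    """Compute a fingerprint for an uploaded image (SHA-256 here for simplicity)."""
    return hashlib.sha256(image_bytes).hexdigest()

def screen_upload(image_bytes: bytes) -> str:
    """Ex-ante screening: block the upload if it matches known violating content."""
    if image_fingerprint(image_bytes) in KNOWN_VIOLATING_HASHES:
        return "block"      # matched a known hash: never published
    return "publish"        # no match: the post goes live, subject to later review

print(screen_upload(b"example image bytes"))  # -> "publish"
```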

As part of its hybrid approach to content moderation, Facebook engages in several phases of algorithmic and human review in order to identify, assess, and take action against content that potentially violates its Community Standards. Automated tools are typically the first layer of review when identifying violating content on the platform. Depending on the level of complexity and the degree of additional judgment needed, the content may then be relayed to human moderators.7

Facebook deploys automated tools during the ex-ante stage of content moderation. When a user submits content to Facebook, such as a photograph or video, it is immediately screened in an automated process. As described in the section on how automated tools are used in the content moderation process above, this algorithmic screening uses digital hashes to proactively identify and block content that matches existing hash databases for content such as CSAM and terrorism-related imagery.8 Facebook also uses proactive match and action tools to detect and remove content that matches some previously identified spam violations. However, it does not screen new posts against every single previously identified spam violation, as this would introduce a significant delay between when the user posts the content and when it appears on the site. Rather, this proactive screening process focuses on identifying CSAM and terrorism-related imagery.9

Once content has been posted to Facebook, the company engages in ex-post proactive moderation, employing a different set of algorithms to screen and identify objectionable content. These algorithms assess content for similarities to specific patterns, for example in images, words, and behaviors, that are commonly associated with different types of objectionable content. In its latest report, the Facebook Data Transparency Advisory Group (DTAG), an independent advisory board chartered by Facebook and composed of seven experts from various disciplines, stated that this process is challenging and limited because additional context is often required to evaluate whether a given indicator, such as a specific word, is being used in a violating manner. The algorithms involved in this process also consider other factors related to the post, such as the identity of the poster; the content of the comments, likes, and shares; and, for visual content, what is depicted in the rest of the image or video. These elements add context and are used to calculate the likelihood that a piece of content violates the platform’s Community Standards. According to the DTAG report, this list of classifiers is continuously updated, and algorithms are retrained to incorporate insights acquired as more violating content is identified or missed. The DTAG report also asserts that if these algorithms determine that a piece of content clearly violates a Community Standard, the system may remove it automatically without relaying it to a human moderator. However, the report notes that in cases where the algorithm is uncertain whether a piece of content violates the platform’s rules, the content is sent to a human moderator for review.10 The report does not clarify the circumstances in which the company believes an algorithm can make such a definitive determination.
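The routing logic described above can be sketched roughly as follows. This is an illustrative simplification, not Facebook’s actual system: the signal names, weights, and thresholds are assumptions made for the example, and real classifiers are learned models rather than hand-weighted rules.

```python
from dataclasses import dataclass

@dataclass
class Post:
    text_score: float    # hypothetical classifier score for the text (0-1)
    image_score: float   # hypothetical classifier score for attached media (0-1)
    poster_risk: float   # hypothetical signal based on the account's history (0-1)

# Illustrative thresholds; real systems would tune these per policy area.
REMOVE_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.60

def violation_likelihood(post: Post) -> float:
    """Combine contextual signals into one likelihood that the post violates policy."""
    return 0.5 * post.text_score + 0.3 * post.image_score + 0.2 * post.poster_risk

def route(post: Post) -> str:
    """Three-way split: confident violations are removed, uncertain cases go to humans."""
    score = violation_likelihood(post)
    if score >= REMOVE_THRESHOLD:
        return "remove_automatically"   # high-confidence violation
    if score >= REVIEW_THRESHOLD:
        return "send_to_human_review"   # uncertain: a moderator decides
    return "no_action"                  # likely benign

print(route(Post(text_score=0.99, image_score=0.98, poster_risk=0.9)))  # remove_automatically
print(route(Post(text_score=0.80, image_score=0.60, poster_risk=0.5)))  # send_to_human_review
```

The point the sketch captures is the split the DTAG report describes: clear violations may be actioned automatically, while uncertain cases are queued for human review.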
Automated tools are also used to triage and prioritize content that users flag during the ex-post reactive portion of content moderation. When a user flags content on the platform, it goes through an automated system that decides how the content should be reviewed. According to the DTAG report, if the system identifies that the content violates the Community Standards, it may be automatically removed. However, as with ex-post proactive moderation, if the algorithm is unsure, the content will be routed to a human moderator.11 If a user flags content before the company is able to identify it, this flag also informs the platform’s machine learning models.12

In order to audit the accuracy of automated decision-making in content moderation, Facebook calculates two primary metrics: precision and recall. Precision measures the percentage of posts that were correctly labeled as violations out of all the posts that were labeled as violations. Recall measures the percentage of posts that were correctly labeled as violations out of all the posts that were actually violations. Facebook calculates these two metrics separately for each classifier in each algorithm.13 However, the DTAG reported that it was unable to acquire details on topics such as specific classifiers, the accuracy of Facebook’s enforcement system, and error and reversal rates, limiting the amount of insight the group had into the platform’s algorithmic decision-making processes for content moderation.
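As a concrete illustration of the two metrics, the sketch below computes precision and recall over a small, invented sample of moderation decisions; the numbers have no connection to Facebook’s actual figures, which the DTAG was unable to obtain.

```python
# Hypothetical audit sample: each pair is (labeled_as_violation, actually_a_violation).
decisions = [
    (True, True), (True, True), (True, False),   # three posts labeled as violations
    (False, True),                                # a violation the system missed
    (False, False), (False, False),               # content correctly left up
]

true_positives = sum(1 for labeled, actual in decisions if labeled and actual)
labeled_positives = sum(1 for labeled, _ in decisions if labeled)
actual_positives = sum(1 for _, actual in decisions if actual)

precision = true_positives / labeled_positives   # 2/3: of posts labeled violations, how many were
recall = true_positives / actual_positives       # 2/3: of actual violations, how many were caught

print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```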

In its Community Standards Enforcement Report (CSER), Facebook does provide some degree of transparency around how it uses automated tools to proactively identify and remove content, and how much user speech this has impacted. In the CSER, Facebook reports how much of the objectionable content it removed was identified proactively by its automated tools (its “proactivity rate”), as opposed to being reported to Facebook by users first; a simple worked example of this rate appears below. Facebook provides this data for nine content categories, including adult nudity and sexual activity, hate speech, terrorist propaganda (focused on ISIS, al-Qaeda, and affiliated groups), and violence and graphic content. However, Facebook does not provide this data for all of the categories of content that the company has deemed impermissible, and that it moderates, under its Community Standards; among the omitted categories are suicide and self-injury.14 The proactivity rate that Facebook discloses in its CSER can shift for a number of reasons, including the fact that Facebook is continuously updating and refining its algorithmic models, as well as the fact that the degree to which content is deemed “likely” to violate the Community Standards varies over time.15

Although the platform has invested significantly in artificial intelligence and machine learning, its algorithmic decision-making capabilities for content moderation are still limited. For example, despite the fact that Facebook has technology that can detect images, audio, and text that potentially violate the company’s Community Standards in the livestream feature, the Christchurch terrorist was still able to livestream his attack in New Zealand.16 In addition, the company has faced significant criticism when its automated tools have resulted in the erroneous takedown of user expression. This has included content posted by human rights activists seeking to document atrocities in Syria, which was mislabeled and removed for violating Facebook’s policies on graphic violence.17
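Returning to the proactivity rate mentioned above, the sketch below shows how such a rate is computed from a hypothetical reporting period; the counts are invented for the example and do not reflect Facebook’s reported figures.

```python
# Hypothetical enforcement counts for one policy area over one reporting period.
content_actioned_total = 10_000   # pieces of content acted on
found_proactively = 9_600         # detected by automated tools before any user report
reported_by_users = content_actioned_total - found_proactively

# Proactivity rate: share of actioned content found before users reported it.
proactivity_rate = found_proactively / content_actioned_total
print(f"proactivity rate = {proactivity_rate:.1%}")   # -> 96.0%
```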

Facebook’s centralized and hybrid approach to content moderation enables the company to deploy a range of tools to moderate content at scale and around the world. However, as demonstrated, the effectiveness of automated tools in identifying and moderating content remains limited. As a result, although the platform is investing heavily in new artificial intelligence-driven content moderation tools, it is vital that the company continue to use a hybrid model of content moderation so that a human moderator is always in the loop to ensure decisions are fair and context-specific. In addition, given that the company is a gatekeeper of a significant amount of user expression, it needs to provide greater transparency and accountability around how it deploys automated tools in its content moderation practices and how much user expression this impacts. Although the platform already issues a CSER, it discloses limited information about the role and impact of automated tools in enforcing its Community Standards. This information should be expanded, and the platform should disclose how its algorithms are created, trained, tested, and improved.


Currently, Facebook provides relatively detailed notices to users when their content is removed, offers an appeals process to users who have had certain categories of content removed, and reports in its CSER on the number of content actions that were appealed and the amount of content restored as a result of appeals. However, these processes could be improved to provide greater transparency and accountability around the company’s use of automated tools.18 For example, in its notices to users, Facebook should specify whether the removed content was detected and flagged by an automated tool, by an entity such as an Internet Referral Unit, or by another user. In addition, the platform should enable users to provide more context and information during the appeals process, particularly in cases where content was erroneously flagged or removed by an automated tool.19 The platform should also work to expand its appeals process to cover the full range of objectionable content prohibited by its Community Standards.

Citations
  1. Alexa’s global internet engagement metric is based on the global internet traffic and engagement a platform receives over the past 90 days.
  2. Dan Noyes, "The Top 20 Valuable Facebook Statistics – Updated July 2019," Zephoria Digital Marketing, last modified July 2019, source.
  3. Casey Newton, "Bodies in Seats," The Verge, June 19, 2019, source.
  4. Alexis C. Madrigal, "Inside Facebook's Fast-Growing Content-Moderation Effort," The Atlantic, February 7, 2018, source.
  5. Madrigal, "Inside Facebook's Fast-Growing Content-Moderation Effort."
  6. Under the Hood Session.
  7. Under the Hood Session.
  8. Klonick, "The New Governors: The People, Rules, and Processes Governing Online Speech."
  9. Ben Bradford et al., Report Of The Facebook Data Transparency Advisory Group, April 2019, source.
  10. Bradford et al., Report Of The Facebook Data Transparency Advisory Group.
  11. Bradford et al., Report Of The Facebook Data Transparency Advisory Group.
  12. Under the Hood Session.
  13. Bradford et al., Report Of The Facebook Data Transparency Advisory Group.
  14. Facebook, Community Standards Enforcement Report, 2019, source.
  15. Bradford et al., Report Of The Facebook Data Transparency Advisory Group.
  16. Joseph Cox, "Machine Learning Identifies Weapons in the Christchurch Attack Video. We Know, We Tried It," Motherboard, April 17, 2019, source.
  17. Avi Asher-Schapiro, "YouTube and Facebook Are Removing Evidence of Atrocities, Jeopardizing Cases Against War Criminals," The Intercept, last modified November 2, 2017, source.
  18. Spandana Singh, Assessing YouTube, Facebook and Twitter's Content Takedown Policies: How Internet Platforms Have Adopted the 2018 Santa Clara Principles, May 7, 2019, source.
  19. "The Santa Clara Principles on Transparency and Accountability in Content Moderation," Santa Clara Principles, last modified May 7, 2018, source.
