New ADL Belfer Fellow analysis shows extremists can dodge basic social media filters by not using profanity
New York, NY, January 6, 2022 … Social media content moderation efforts regularly fall short when it comes to detecting white supremacist speech, including discussions of conspiracy theories related to white genocide, Jewish power and malicious grievances toward Jews and people of color, according to a new report from ADL (the Anti-Defamation League).
“This important research demonstrates that despite what they say are their best efforts, social media platforms continue to fail at detecting modern hate speech,” said ADL CEO Jonathan A. Greenblatt. “This is, in part, how the January 6th insurrection was organized in plain sight. There is no such thing as a polite white supremacist, and social media platforms must use their extensive resources to root out the extremist speech slipping past their current filters.”
Using computational methods, including machine learning, Dr. Libby Hemphill and her team at the University of Michigan School of Information evaluated language from extremists on the extremist platform Stormfront, from extremists in an alt-right network on Twitter and from general users on Reddit. They found six key ways that white supremacist speech is distinguishable from commonplace speech:
- White supremacists frequently referenced racial and ethnic groups using plural noun forms (e.g., Jews, whites). Pluralizing group nouns in conjunction with antisemitic content or conspiracy theories dehumanizes targeted groups, creates artificial distinctions and reinforces group thinking.
- White supremacists appended “white” to otherwise unmarked terms (e.g., power). In doing so, they racialized issues that are not explicitly about race and made whiteness seem at risk.
- White supremacists used less profanity than is common in social media. They claim they are being civil and respectable, and by avoiding profanity they can circumvent simplistic detection based on “offensive” language.
- White supremacists’ posts were congruent on extremist and mainstream platforms, indicating they don’t modify their speech for general audiences or platforms. Their linguistic strategies were similar in public (Reddit and Twitter) and internal (in-group) conversations on extremist sites (Stormfront). These consistent strategies should make white supremacist posts and language more readily identifiable.
- White supremacists’ complaints and messages stayed consistent from year to year. Their particular grievances and bugaboos changed, but their general refrains did not. The consistency of topics, such as conspiracy theories about Jews, “the Great-Replacement theory” and pro-Trump messaging, makes their posts readily identifiable.
- White supremacists racialized Jews; they described Jews in racial rather than religious terms. Their conversations about race and Jews overlapped, but their conversations about church, religion and Jews did not.
These findings further support the need for platforms to remove violent extremist groups and content, including conspiracy theories like QAnon that fueled the Jan. 6 insurrection. ADL experts recommend that platforms use the subtle but detectable differences in white supremacist speech to improve their automated identification methods. Specifically, the platforms should:
- Enforce their own rules. Platforms already prohibit hateful conversations, but they need to improve the enforcement of those policies.
- Use data from extremist sites to create automated detection models. Platforms have used general internet speech to train their detection models, but white supremacist speech is rare enough that current models cannot pick it out of the vast sea of internet speech.
- Look for specific linguistic markers (plural noun forms for racial and ethnic groups, “white” appended to otherwise unmarked terms). Platforms need to take specific steps when preparing language data to capture these differences.
- De-emphasize profanity in toxicity detection. White supremacists’ lack of profanity in their online conversations means platforms need to focus on the message rather than the words.
- Train platform moderators and algorithms to recognize that white supremacists’ conversations are dangerous and hateful. Tech companies need to seriously consider threats to incite violence, attacks on racial groups and attempts to radicalize individuals. Remediations include removing violative content and referring incidents to relevant authorities where appropriate.
“Extremists intentionally seed disinformation and build communities online to normalize their messages and incite violence,” said Hemphill, the report author and an ADL Belfer Fellow. “With all their resources and these revealing findings, platforms should do better. With all their power and influence, platforms must do better.”
The Belfer Fellowship program is possible due to the continued generosity of the Robert Belfer Family. ADL’s Center for Technology and Society works with the fellows as they pursue research in previously unexplored areas. The fellows also augment ADL’s ongoing research efforts to help quantify and qualify online hate in a variety of social media sites, gaming platforms and other fringe online communities. Read more about the Belfer Fellowship and the Center for Technology and Society at: adl.org/CTS.