Detecting Conspiracy Theories on Social Media: Improving Machine Learning to Detect and Understand Online Conspiracy Theories
This RAND report, prepared for Google Jigsaw, develops a hybrid machine-learning model that combines linguistic and rhetorical theory to detect online conspiracy theory language. The hybrid model substantially outperforms either approach used alone, and the authors argue that the architecture likely generalizes to other forms of harmful speech. The synthesis around the modeling work surfaces a more uncomfortable finding: conspiracy theories often hook into rhetorically legitimate concerns like health and safety, and many operate by constructing hate-based "us versus them" oppositions, which is precisely why direct contradiction or mockery tends to entrench rather than dislodge belief. The authors recommend engaging transparently and empathetically with conspiracists, correcting false news directly, working with moderate members of conspiracy communities rather than the hardcore, and addressing the underlying fears and existential threats that make the narratives sticky in the first place.