Detecting Misinformation with XAI


Combating fake news and the spread of misinformation is a challenging task in the post-truth era. News feed and search algorithms can unintentionally propagate false and fabricated information at scale, exposing users to algorithmically selected false content. We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms to study the effects of algorithmic transparency on end-users.

Intended Use

This contribution is intended to support an in-depth understanding of NLP models deployed for misinformation detection. It also makes it possible to further study how user engagement, mental models, trust, and performance measures interact when model predictions are explained.

This contribution applies generally to natural language processing domains, especially classification settings.


Model inputs: textual statement/claim for news stories

Model outputs: confidence score for the misinformation prediction + various interpretation outcomes (with XAI assistants)

Data for training: labelled news claims from Snopes + unlabelled related news articles from Google search
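
The input/output contract above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class name `MisinformationPrediction`, its fields, the `word_attribution` explainer key, and all numeric values are hypothetical and are not taken from the released models. It assumes the confidence score lies in [0, 1] and that one interpretation outcome is a list of per-token importance weights.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MisinformationPrediction:
    """Hypothetical container for one model output (all names are illustrative)."""
    claim: str          # model input: textual statement/claim
    confidence: float   # assumed to be a score in [0, 1] that the claim is misinformation
    # explainer name -> list of (token, importance weight) pairs (assumed format)
    explanations: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject scores outside the assumed [0, 1] range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")

    def top_tokens(self, explainer: str, k: int = 3) -> List[Tuple[str, float]]:
        """Return the k tokens with the largest absolute importance."""
        pairs = self.explanations.get(explainer, [])
        return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)[:k]


# Made-up values for illustration only; not produced by any real model.
example = MisinformationPrediction(
    claim="Scientists confirm chocolate cures the flu",
    confidence=0.87,
    explanations={
        "word_attribution": [
            ("Scientists", 0.05), ("confirm", 0.10), ("chocolate", 0.20),
            ("cures", 0.45), ("the", 0.00), ("flu", 0.15),
        ]
    },
)
print(example.top_tokens("word_attribution", k=2))
```

Keeping the score and the explanations in one record mirrors the way the outputs are described above: a prediction is always presented together with the interpretation that accompanies it.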


The quality of the collected news claims is limited because some news labels are time-sensitive. A more reliable XAI system requires trustworthy labelled datasets for training.

No significant failures observed.


Mohseni, S., Yang, F., Pentyala, S., Du, M., Liu, Y., Lupfer, N., Hu, X., Ji, S., & Ragan, E. (2021). Machine Learning Explanations to Prevent Overtrust in Fake News Detection. Proceedings of the International AAAI Conference on Web and Social Media, 15(1), 421-431.