Document Type : Original Article
Authors
Amir Seyed Danesh²
1 Department of Computer Engineering, University of Guilan, Rasht, Iran.
2 Faculty of Technology and Engineering, East of Guilan, University of Guilan, Rudsar-Vajargah, Iran.
Abstract
The rapid spread of fake news on digital platforms poses a significant threat to informed public discourse and societal trust. While Large Language Models (LLMs) such as BERT have achieved remarkable accuracy in automated fake news detection, their opaque nature hinders user trust and understanding. This paper presents a framework that combines the high predictive performance of BERT with post-hoc interpretability techniques to enhance both the effectiveness and the transparency of fake news detection systems. Specifically, we fine-tune BERT for binary fake news classification on the COVID-19 Fake News Dataset and employ Local Interpretable Model-agnostic Explanations (LIME) together with BERT attention visualization to elucidate the model's decision-making process. Our results demonstrate that the fine-tuned BERT model achieves excellent performance, with an accuracy of 97.66% and an F1-score of 97.49% on the test set. Furthermore, LIME explanations highlight the contribution of specific words to individual predictions, while attention visualizations reveal which token relationships the model deems important. This integrated approach underscores that "truth" in machine prediction encompasses not only high accuracy but also explainability, thereby fostering greater confidence in automated fake news detection systems.
Keywords