Developing a fake news identification model with advanced deep language transformers for Turkish COVID-19 misinformation data

Authors: MEHMET BOZUYLA, AKIN ÖZÇİFT

Abstract: The massive use of social media accelerates information dissemination and amplifies harmful messages such as fake news. Fake news is misleading information presented as factual news, generally used to manipulate public opinion. In particular, fake news related to COVID-19 has been termed an 'infodemic' by the World Health Organization. An infodemic is misleading information that causes confusion and may harm health. The high volume of misinformation about COVID-19 causes panic and high stress. Developing a COVID-19 fake news identification model is therefore clearly important, particularly for the Turkish language. In this article, we propose an advanced deep language transformer model to assess the truthfulness of Turkish COVID-19 news from social media. To this end, we first compiled Turkish COVID-19 news from various sources as a benchmark dataset. We then applied five conventional machine learning algorithms (Naive Bayes, Random Forest, K-Nearest Neighbor, Support Vector Machine, and Logistic Regression) on top of several language preprocessing tasks. Next, we used deep learning algorithms: Long Short-Term Memory, Bi-directional Long Short-Term Memory, Convolutional Neural Networks, Gated Recurrent Unit, and Bi-directional Gated Recurrent Unit. For further evaluation, we used deep learning based language transformers, i.e. Bi-directional Encoder Representations from Transformers and its variations, to improve the efficiency of the proposed approach. The results show that neural transformers, in particular the Turkish-dedicated transformer BerTURK, are able to identify COVID-19 fake news with 98.5% accuracy.

Keywords: Infodemic, fake news, BerTURK, language transformers, machine learning, COVID-19
