Abstract
This investigation presents a model for distinguishing fake news from truthful news in textual data. For this purpose, using intelligent methods based on the principles of text analysis, a binary classification model was developed that divides textual data into deceptive and truthful classes. The baseline artificial intelligence (AI) algorithms used for modeling were AdaBoost (Ada), Support Vector Classifier (SVC), Random Forest (RF), Neural Network (NN), BERT, and Convolutional Neural Network (ConvNet). The methods used in this study include TF-IDF for vectorizing the textual data; Principal Component Analysis (PCA) for feature transformation; word2index and word embedding models for converting words into numbers; and the N-gram technique for creating sequences of words. Finally, through a case study and an examination of different evaluation indices, the aforementioned models were compared. The outcomes of this investigation showed that, despite the high similarity between the two classes (86.6% similarity in the training data and 79.8% in the test data), the BERT model achieved the best results. This model has high complexity and can better extract relationships between data. In the baseline article, the best value of the Accuracy index was 0.90, which was improved to 0.93 in this study.
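The classical (non-BERT) part of the pipeline described above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: the toy texts, labels, and hyperparameters are assumptions, and TruncatedSVD is used as the standard sparse-friendly analogue of PCA for TF-IDF matrices.

```python
# Hedged sketch: binary fake-news classification combining TF-IDF
# vectorization with N-grams, PCA-style dimensionality reduction,
# and a Random Forest classifier. All data and parameters are
# illustrative assumptions, not the paper's actual setup.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier

# Toy corpus (hypothetical): 1 = truthful, 0 = deceptive
texts = [
    "Officials certify local election results after audit",
    "Miracle pill cures all diseases overnight, doctors stunned",
    "New study published in peer-reviewed journal on climate",
    "Secret society controls world weather with hidden machines",
]
labels = [1, 0, 1, 0]

pipeline = Pipeline([
    # Unigrams and bigrams, weighted by TF-IDF
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    # TruncatedSVD stands in for PCA on the sparse TF-IDF matrix
    ("svd", TruncatedSVD(n_components=2, random_state=0)),
    ("clf", RandomForestClassifier(random_state=0)),
])

pipeline.fit(texts, labels)
preds = pipeline.predict(texts)
print(preds)
```

In practice the same `Pipeline` object would be fit on a labeled training split and evaluated on held-out test data with metrics such as Accuracy, which is how the models in the study were compared.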
