Detecting and Mitigating the Dissemination of Fake News: Challenges and Future Research Opportunities

Fake news is a major threat to democracy (e.g., by influencing public opinion), and its impact cannot be overstated, particularly in our socially and digitally connected society. Researchers from different disciplines (e.g., computer science, political science, information science, and linguistics) have studied the dissemination, detection, and mitigation of fake news; however, detecting and preventing its dissemination remains challenging in practice. Hence, in this article, we systematically survey existing state-of-the-art approaches designed to detect and mitigate the dissemination of fake news and, based on this analysis, discuss several key challenges and present a potential future research agenda. In particular, we emphasize the importance of designing artificial intelligence (AI)-powered systems that can provide detailed, yet user-friendly, explanations of their fake news classification decisions.


I. INTRODUCTION
Numerous solutions [1]-[4] have been proposed to address security- and privacy-related issues, whether related to the Internet of Things (IoT) [5]-[8], user authentication [9]-[11], road traffic safety [12], or other cyber threats [13]-[15]. However, as individuals and organizations become more connected and accustomed to receiving information (e.g., news) in real time and from different sources (e.g., user-generated content), there is a new risk that such sources and dissemination channels (e.g., social media platforms) can be abused to spread fake news [16]. Research in the domain of fake news is still in its infancy. We broadly define fake news as content created to mislead users [17], for example, to misguide, cheat, or defame individuals, groups of individuals, organizations, and/or governments.
Fake news can impact our society in different ways [18]-[20]. To estimate the annual monetary damage caused by websites that propagate false information, CHEQ, an AI-driven cybersecurity company, and a University of Baltimore economist analyzed economic data across different sectors. The findings are summarized in Fig. 1. For example, the study found that the economic impact of fake news at the global level is approximately $78 billion, including a direct economic loss of around $39 billion a year in the stock market [21]. Recent high-profile real-world events affected by fake news include the COVID-19 vaccine rollout [22]-[29] and the 2016 U.S. Presidential Election [30], [31]. These events reinforce the importance of designing techniques and approaches to detect and mitigate the dissemination of fake news [17], [32], [33].

TABLE I CONTRIBUTION OF EXISTING REVIEW PAPERS
There are, however, a number of challenges in classifying whether content is genuine or fabricated, as a news article may contain both truths and untruths. Consider, for example, an article alleging that a particular candidate is corrupt because he/she has committed activities A, B, and C. This candidate may indeed be corrupt, but because he/she has committed activities B, D, and E, rather than A, B, and C. Should we then classify this news article as authentic or fake?
An example life cycle of fake news is presented in Fig. 2, which explains how fake news can be generated and propagated through various platforms. Motivations for such generation and dissemination vary, ranging from financial (e.g., monetary gain) to political (e.g., left/right wing) to terrorism (e.g., seeking to create societal panic/unrest) to defamation, and so on. Fake news can also exist in different formats, such as text, image, audio, and video.
There have been extensive attempts to design solutions to detect and mitigate the dissemination of fake news, as well as several popular fact-checking websites (e.g., PolitiFact and TruthOrFiction). Building on existing literature review and/or survey articles (see Table I), we survey recently published articles on this topic.
Based on our analysis of the articles, we then present a new taxonomy of fake news detection approaches to guide the categorization of existing approaches. Finally, we identify and discuss existing and emerging challenges, as well as potential research agendas.
The remainder of this article is organized as follows. In Section II, we explain our survey methodology and discuss existing approaches. In Section III, the challenges and issues in detecting fake news are discussed. Section IV discusses potential future directions, prior to concluding the article in Section V.

II. EXISTING FAKE NEWS DETECTION APPROACHES
Due to the extensive volume of literature on this topic, we focused only on SCI-indexed technical journal articles published from 2019 onward (in particular, we excluded conference papers and book chapters). Here, we briefly summarize the most recent works on fake news detection into seven detection categories, as shown in Fig. 3. The datasets used in these studies are described in Tables III-IX.
The classification is done in four layers. In the first layer, the studies are sorted based on the focus of the research. Each color code represents one detection-based research focus; on that basis, we further divide the works by the type of fake news content in layer 2, fake news features in layer 3, and dataset categories in layer 4. Our study shows that researchers have placed considerable emphasis on feature identification for detecting fake news.
Features play a particularly important role in fake news detection models because the real and fake classes share very similar characteristics. Table II summarizes the various aspects of the features studied in this survey.
In the following, the summaries of the existing works are briefly highlighted.

A. Automatic Detection
Studies proposing models that automatically capture discriminatory features of fake news and classify the news via the hidden layers of a deep learning model come under the category of automatic detection.
Ozbay and Alatas [16] used a two-step method for fake news detection: first converting unstructured data into a structured dataset using text mining methods, and then applying 23 supervised AI algorithms to the structured dataset.
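The unstructured-to-structured conversion step can be illustrated with a minimal TF-IDF vectorizer. This is only a sketch of the general text-mining idea, not the authors' actual pipeline; the toy documents below are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn raw documents into TF-IDF feature vectors (bag-of-words)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(tokenized)
    # document frequency of each term across the corpus
    df = {w: sum(1 for toks in tokenized if w in toks) for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([(tf[w] / len(toks)) * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

docs = ["shocking claim spreads online", "official report confirms claim"]
vocab, vecs = tfidf_vectors(docs)
```

Terms appearing in every document (here, "claim") receive zero weight, so downstream classifiers see only the discriminative vocabulary.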
To detect fake news, Kaliyar et al. [39] used a deep convolutional neural network (FNDNet). The proposed model is designed to automatically detect the features of fake news and differentiate them from those of real news. Moreover, the study uses binary-class datasets instead of a hybrid approach. A hybrid approach detects fake or real news based on a combination of the news' content, context, and temporal-level information, and can have more impact when multilabel datasets are used. The use of binary-class datasets thus limits the analysis and the generalizability of the results.

B. Language-Specific Detection
This category comprises papers that address languages other than (or in addition to) English and incorporate multilingual datasets to test their detection models.
Faustini and Covões [40] proposed a text-feature-based, language-independent methodology for fake news detection, independent of the source platform and language. The methodology was evaluated on five datasets spanning three language groups, yielding satisfactory results compared with the benchmark. Four algorithms were applied to each dataset: k-nearest neighbors (KNN), random forest, Gaussian Naïve Bayes, and support vector machines (SVMs). Of these, the best results were obtained with random forest and SVM; random forest attained the highest F1 score and the highest accuracy, both at 95%.
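As a rough illustration of one of the four algorithms, a minimal k-nearest neighbors classifier over toy feature vectors might look as follows. The two-dimensional features (e.g., capitalization ratio, exclamation rate) and labels are invented for illustration and are unrelated to the paper's datasets.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted((math.dist(x, p), label) for p, label in zip(train_X, train_y))
    top = [label for _, label in dists[:k]]
    return Counter(top).most_common(1)[0][0]

# toy 2-D text-feature vectors: [capitalization ratio, exclamation rate]
X = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.1), (0.2, 0.05)]
y = ["fake", "fake", "real", "real"]
print(knn_predict(X, y, (0.85, 0.85)))  # -> fake
```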

C. Dataset-Based Detection
Studies whose main contribution is presenting datasets that can be widely used by other researchers to test their proposed detection models come under the category of dataset-based detection. While other researchers use only publicly available datasets, these studies have constructed their own datasets for fake news detection as part of their contribution.
Neves et al. [41] introduced a novel approach, a GAN-fingerprint removal autoencoder (GANprintR), to spoof facial manipulation detection systems. The approach removes generative adversarial network (GAN) fingerprints without compromising image quality; thus, machines, like humans, are unable to distinguish fake images from real ones. GANprintR is trained with face images of real persons instead of synthetic face images carrying GAN fingerprints. To carry out this study, three state-of-the-art manipulation detection approaches were used: XceptionNet [42], Steganalysis [43], and Local Artifacts [44]. Three different scenarios were also designed: 1) controlled scenarios; 2) in-the-wild scenarios; and 3) GAN-fingerprint removal. From this study, it can be concluded that as long as facial manipulation detection systems perform poorly, it is infeasible to distinguish fake news from real news by examining the attached images.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Content with biased messages, intentional or unintentional, is defined as propaganda [45]. In [37], a framework for Propaganda Spotting in Online Urdu Language (ProSOUL) is proposed. Due to the lack of a dataset and a LIWC dictionary in the Urdu language, the authors developed a labeled dataset by translating QCRI's English propaganda dataset (Qprop), along with an English LIWC dictionary, into Urdu. The main contribution of the article is the analysis of several different feature sets; after preprocessing, the authors extract NEws LAndscape (NELA) features to analyze news articles from stylistic and psycholinguistic perspectives.
To address fake news detection in languages other than English, Silva et al. [46] built a semiautomated corpus named "Fake.Br" to overcome the language barrier. The study also addresses open questions in fake news detection, such as which feature sets should be included for automatic detection and which classification strategies are best suited for the task.

D. Early Detection
The early detection category contains studies that focus primarily on detecting fake news early in its propagation, training classifiers to perform efficiently as soon as the news is first posted on the Internet.
Wang et al. [47] presented SemSeq4FD, a graph-based neural network model for fake news detection. The model is designed to detect fake news early and is based on enhanced text representation.
Zhou et al. [48] leveraged established social and psychological theories and supervised classification for the automatic detection of fake news. The model was developed for the early detection of fake news to prevent its propagation on social media platforms. However, as in many previous studies, the experimental analysis was limited to text-based news articles.
Liu and Wu [49] used deep learning for the early detection of fake news. The major components of the model include a feature extractor, a convolutional neural network (CNN)-based news classifier, and a positive unlabeled (PU) learning framework. The datasets used in this study, described in [50] and [51], were generated from Twitter and Weibo.
Zubiaga and Jiang [52] proposed a method for hoax detection in social media using a free collaborative knowledge base and class-specific word embeddings. A logistic regression classifier was used to measure the effectiveness of the class-specific word representations. Zhao et al. [38] applied a network analysis method to understand the features distinguishing real news from fake news based on their propagation networks. The main research goal was to study the evolution and topology of network propagation in different settings. The experimental analysis focused on the network with the largest component and on topological features such as the shape of a network.

E. Stance Detection
Under stance detection, we categorize studies that identify fake news by checking the stances of the different outlets reporting on the same incident and that use those stances to build a stance detector.
To effectively detect, describe, and model fake news spread on online social media [53], Xu et al. [54] proposed a framework that characterizes the websites and reputations of fake and real news publishers. The framework analyzes the similarity between real and fake news through TF-IDF and latent Dirichlet allocation (LDA) topic modeling, and uses Jaccard similarity measures to assess document similarity between real, fake, and hybrid news articles. The experiments were conducted on the BuzzFeed News dataset [55].
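The Jaccard similarity used for comparing documents is straightforward to sketch over word sets; the two example sentences below are invented.

```python
def jaccard(doc_a, doc_b):
    """Jaccard similarity between the word sets of two documents."""
    a, b = set(doc_a.lower().split()), set(doc_b.lower().split())
    return len(a & b) / len(a | b)

real = "the senator voted for the bill on tuesday"
fake = "the senator secretly voted against the bill"
print(round(jaccard(real, fake), 2))  # -> 0.44
```

A high overlap between a suspect article and a verified real one (or, conversely, a known fake one) is one cheap signal such a framework can combine with TF-IDF and LDA features.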
Umer et al. [56] proposed a stance detection-based approach to detect fake news, focusing on determining the relative relevance between the headline and the news body using four stance labels: agree, disagree, discuss, or unrelated.

F. Feature-Based Detection
Almost all reviewed studies extract features of fake news and then use them to train a classifier; however, some researchers have made novel feature extraction, such as topological and semantic features, a major contribution of their research. This has helped them build robust detection models, and such studies are sorted under the category of feature-based detection, where the authors present various features for detection as a major part of their contribution.
De Oliveira et al. [57] conducted a stylistic-computational analysis based on natural language processing (NLP) and used a one-class SVM technique for detecting fake news on social media. The dataset used in this study contains 33,000 tweets. The methods discussed in that article can be beneficial for early detection, as the required features are available as soon as a news item is released. The authors of [58] developed a model that labels textual content as "real" or "fake" according to its truthfulness score. The study uses Grover, a model for neural fake news generation and detection, in a setup similar to that described in [59]. The goal of the setup is to generate a large fake corpus using a language model so that a classifier can be trained to separate it from real news.
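Stylistic features of the kind used in such analyses can be sketched as simple lexical statistics. The specific features below (average word length, exclamation rate, all-caps ratio) are illustrative assumptions, not the authors' exact feature set.

```python
import string

def stylistic_features(text):
    """Extract simple stylistic cues of the kind used in fake news studies."""
    words = text.split()
    return {
        # mean length of words with surrounding punctuation stripped
        "avg_word_len": sum(len(w.strip(string.punctuation)) for w in words) / len(words),
        # exclamation marks per character
        "exclamation_rate": text.count("!") / len(text),
        # share of fully capitalized words (shouting)
        "upper_ratio": sum(w.isupper() for w in words) / len(words),
    }

features = stylistic_features("SHOCKING! You WON'T believe this miracle cure!")
```

Because these cues depend only on the text itself, they are available the moment a news item is released, which is what makes them attractive for early detection.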
Li et al. [60] leveraged deep learning by proposing a multilevel convolutional neural network (MCNN) and a method for calculating the weights of sensitive words for fake news detection in textual news articles. The method provides a deeper semantic analysis and understanding of a news article's text and veracity through the relationship between the article content and the corresponding weights of the sensitive words it invokes. The study utilizes the following benchmark fake news datasets: 1) LIAR, compiled by Wang [61]; 2) microblog datasets from Twitter and Sina Weibo, compiled by Ma et al. [50], [51], respectively; 3) NewsFN; and 4) KaggleFN.

G. Ensemble-Based Detection
Ensemble learning-based detection contains all papers that adopted ensemble learning models in their detection approaches to achieve better results.
Elhadad et al. [62] developed a model for detecting misleading information related to the COVID-19 pandemic in the English language by applying an ensemble learning approach that combines different machine learning classifiers. The study also introduces a technique for collecting ground truth from credible and unbiased information sources.
Huang and Chen [63] combined four deep learning techniques, namely, LSTM, depth LSTM, LIWC CNN, and N-gram CNN, into an ensemble learning model for fake news detection. Furthermore, they applied the self-adaptive harmony search (SAHS) algorithm to optimize the model weights, attaining accuracy as high as 99.4%. Similarly, Hakak et al. [64] proposed another ensemble-based approach to detect fake news.
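The general idea of a weighted voting ensemble can be sketched with toy base classifiers standing in for the deep models. The rule-based classifiers and fixed weights below are invented for illustration; in [63], the weights would be tuned by SAHS rather than set by hand.

```python
from collections import Counter

def ensemble_predict(classifiers, weights, x):
    """Weighted hard-voting ensemble: each base model casts a weighted vote."""
    votes = Counter()
    for clf, w in zip(classifiers, weights):
        votes[clf(x)] += w
    return votes.most_common(1)[0][0]

# toy base models standing in for the LSTM / CNN components
length_clf = lambda text: "fake" if len(text.split()) < 6 else "real"
shout_clf = lambda text: "fake" if "!" in text else "real"
digit_clf = lambda text: "fake" if any(c.isdigit() for c in text) else "real"

label = ensemble_predict([length_clf, shout_clf, digit_clf],
                         [0.5, 0.3, 0.2], "Miracle cure found!")
print(label)  # -> fake
```

The appeal of ensembles in this setting is that weak, complementary signals can be combined so that no single model's failure mode dominates the final verdict.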
Table X comprehensively lists the machine/deep learning-based methods built on ensemble approaches that achieved scores above 90% on their evaluation metrics. The highlighted cells show the highest results achieved by the researchers.
Fig. 4 shows all the methods used by the studies chosen for this survey, categorized into three main learning techniques: deep learning, machine learning, and NLP. It can be seen that researchers have invested significant effort in machine learning models and have now started to explore deep learning models to build robust fake news detectors.

III. OPEN RESEARCH CHALLENGES
During the literature review, we found that researchers encountered many challenges, from which we selected the most common and critical ones that must be addressed to build an efficient classifier. Table XI summarizes the important research challenges found during the literature study, along with their causes.

A. Variable Length of News Content
A recent challenge for fake news detection concerns the length of news content. Generally, true news is longer than fake news. To analyze the effect of text length on model performance, the studies reviewed here examined both full and truncated text, and the results showed that full text achieved comparatively lower results; a major research challenge lies in this regard. Truncated text proves to be an effective basis for finding fake news in real-world scenarios, as length-based cues can easily be gamed by padding fake news with additional text.

B. Long-Distance Document Dependence
Sentence- and document-level methods perform better for the task of fake news detection, but they cannot handle long-distance document dependence. Semantically related sentences may not be positioned close together in a document; when a document contains long-distance-dependent structures, a model can hardly capture all the semantic content available in it. Besides, the algorithms used for this purpose are prone to overfitting and lack satisfactory generalization ability. Since fake news spreads on various platforms and through different network sources, with wide variation in languages and domains, models must be robust enough to perform efficiently without depending on a particular domain or source. The overfitting problem makes existing algorithms unfit for news from diverse sources and domains.

C. Manipulation of Deep Fake Algorithms in Images and Videos
Most previous fake news research focused on news in text formats using traditional supervised learning algorithms. However, fake news is also widely spread in image and video formats [65]. Deep fake algorithms can create near-real fake images and videos that the human brain cannot recognize as inauthentic, and can thus take the Internet by storm when deep fakes are put to the service of propaganda [66]. This is mainly due to the proliferation of free and easy-to-use tools for manipulating images and videos [67]. Therefore, relying solely on text, or on features based on user networks and activity, may not apply to many real-life scenarios. Images or videos embedded in fake news posts can be either maliciously tampered images or real images used in a misleading context. Interestingly, fake news images and videos often have characteristics that distinguish them from real-news images at both the physical and semantic levels. A major research challenge in this direction is to develop effective methods for feature representation and extraction in multimedia fake news datasets [68].

D. Encrypted Messages
The automatic detection of fake news on encrypted social media platforms, such as WhatsApp, Instagram, WeChat, and Telegram, is very challenging [65]. For instance, WhatsApp is organized such that its contents circulate in specific communication circles, making them difficult to track due to end-to-end encryption. This ensures that only the users involved in a conversation have access to the shared content, shielding abusive content from being detected or removed [69]. While this feature protects data from being read or secretly modified by a third party, it becomes an obstacle to fake news moderation. An interesting research opportunity relevant to this challenge is to develop effective methods for fake news detection while ensuring that encrypted messages remain secure.

E. Training Data Limitations
A large-scale dataset containing the ground truth of both fake and real news is vital for understanding the relationships among different types of fake news; however, such datasets are rarely available. The datasets used in many previous fake news detection models are limited with regard to features and data size. Some models, such as the one in [54], rely on the publisher's domain or user account reputation score. Others, such as the one in [60], were built on relatively small datasets that can hardly generalize to real-world scenarios. The effect of dataset size is even more pronounced in deep learning models. Also, with the rise of bots and cloned accounts that closely mimic real users on social media, domain reputation may no longer be a good feature for detecting fake news; even highly reputable websites may publish or share fake news for a variety of reasons.

F. Drawbacks of the Manual Techniques of Dataset Construction
In developing a real-time model, the most significant challenge is the scarcity of sufficient and proper data. Previous studies have shown that each domain needs its own relevant dataset and that the same one cannot be applied to all domains; for example, a political news dataset cannot be used for COVID-19-related fake news detection. The same holds for each language-based dataset [37], [70], [71]. On the other hand, the manual approach of creating datasets verified by experts is time-consuming, expensive, biased, probably subjective, and tends to be impractical given the huge volume of data available on the Internet [72]-[74]. Furthermore, even the best datasets need to be updated regularly to remain effective for training an accurate model.

G. Shortage of Discriminating Features
Another challenge in fake news detection is the shortage of discriminatory features. Fake news producers can imitate real-news writers, which makes it difficult to extract enough suitable features to distinguish the two classes. In addition, most of the characteristics that effectively differentiate fake news from real news are not available at the early stages of a news item's release; hence, early detection often fails owing to the lack of features that can reliably separate fake news from real news. Also, some features are available only on certain platforms and cannot be used on others, such as user-based features that are specific to Facebook and other social media.

H. Identification of News Bots (Cyborgs)
There are several types of automated accounts with slightly different characteristics, applied for various purposes. Real users may share something only occasionally, whereas bots storm the Internet with hundreds or thousands of tweets that are often linked to a single subject, and their content is typically reposted rather than fresh and authentic. A hybrid account combining a bot's tirelessness with human subtlety is known as a cyborg [75]. Cyborg accounts are those in which a human occasionally and periodically takes over the bot account to respond to queries and comments from other users and to post some new authentic content. They are not only more expensive and time-consuming to operate, but also better at hiding than pure bots; operators of such accounts are called "cyborgs" because of this mix of automated and "human" posts. The operators often use the social media management platform Hootsuite to control many accounts simultaneously. The challenge applies across all sorts of platforms, including but not limited to dating apps, Twitter, online therapy services, and Facebook. Many studies have investigated the fake account detection problem. For instance, Cresci et al. [76] proposed a simple and effective approach in which digital DNA sequences are extracted from users' online actions to model their behavior. Kosmajac and Keselj [77] used a similar technique and improved it to cover many more features of a tweet through character labels. In these studies, fingerprints of user accounts are produced from their behavioral patterns. With this simple method, the authors showed that automated accounts exhibit less diverse behavior than genuine user accounts. Even though cyborgs are a sophisticated kind of bot with behavior very similar to that of real users, utilizing such methods along with other feature sets can help detect them.
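The digital DNA idea can be sketched as encoding a user's timeline into a character string and measuring the diversity of its substrings. The action alphabet and the k-gram diversity measure below are simplified assumptions for illustration, not the exact formulation of [76].

```python
ACTION_CODES = {"tweet": "A", "retweet": "C", "reply": "T"}  # illustrative alphabet

def dna_sequence(actions):
    """Encode a timeline of user actions as a character string ('digital DNA')."""
    return "".join(ACTION_CODES[a] for a in actions)

def behavioural_diversity(seq, k=3):
    """Fraction of distinct k-grams in the sequence: bots repeat patterns."""
    grams = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(grams) / max(len(seq) - k + 1, 1)

bot = dna_sequence(["retweet"] * 12)
human = dna_sequence(["tweet", "reply", "retweet", "tweet", "retweet",
                      "reply", "tweet", "tweet", "reply", "retweet"])
print(behavioural_diversity(bot), behavioural_diversity(human))  # -> 0.1 0.875
```

The bot's monotone timeline collapses to a single repeated pattern, while the human's mixed activity yields far higher substring diversity, which is the behavioral gap such detectors exploit.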

IV. FUTURE RESEARCH DIRECTIONS
From our survey and the abovementioned challenges, we conclude that fake news detection can be further improved by working on the following areas.

A. Blockchain Fake News Detection
Blockchain technology, with its transparency and traceability, can be a game changer in fake news detection, since it permanently records every transaction and cannot be manipulated to portray a wrong transaction over time [78]-[80]. Leveraging the advantages of blockchain's peer-to-peer network concepts can help find and detect fake news on social media: it becomes possible not only to verify that information is genuine but also to check its sources, building the much-needed trust in the news displayed on the Internet [81]. One proposed scenario of blockchain for fake news detection is based on proof of authority (PoA) with a high transaction rate, where the publication of news depends on a credibility score. In this scheme, some nodes become validators that check the truth of the news and validate the transaction; the validators pinpoint whether the news is fake or real and to what degree. The transaction hash gives the blockchain the flexibility to publish the news; if the validation condition is not met, the published news is considered fake [82].
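A toy version of such a PoA-style ledger, with hypothetical validator names and a simple majority-vote credibility rule, might look as follows. This is a sketch of the idea only, not the protocol of [82].

```python
import hashlib
import json

class NewsChain:
    """Toy proof-of-authority ledger: trusted validators vote on each story."""

    def __init__(self, validators, threshold=0.5):
        self.validators, self.threshold = validators, threshold
        self.blocks = []

    def publish(self, story, votes):
        """Record a story with validator verdicts; votes maps validator -> bool."""
        approvals = sum(votes.get(v, False) for v in self.validators)
        verdict = "real" if approvals / len(self.validators) > self.threshold else "fake"
        # each block commits to the previous block's hash, making the log tamper-evident
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = json.dumps({"story": story, "verdict": verdict, "prev": prev},
                             sort_keys=True)
        self.blocks.append({"story": story, "verdict": verdict, "prev": prev,
                            "hash": hashlib.sha256(payload.encode()).hexdigest()})
        return verdict

chain = NewsChain(validators=["v1", "v2", "v3"])
print(chain.publish("Vaccine ships Monday", {"v1": True, "v2": True, "v3": False}))  # -> real
```

Because every block embeds the hash of its predecessor, rewriting an earlier verdict would change every later hash, which is what makes the recorded credibility history auditable.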

B. Deep Fake Detection
The detection of deep fakes is an interesting area for future research [83], [84]. Visual forensics features are normally used for manipulation detection in images and videos. However, most forensics features are handcrafted for detecting specific manipulation traces and are not applicable to real images attached to fake news. In addition, these handcrafted features are labor-intensive and limited in learning complicated patterns, leading to poor generalization performance on the task of fake news detection [68]. Deep learning methods for computer vision are a promising approach for detecting manipulated and misleading content in multimedia data, given the recent successes and growing popularity of transfer learning and the use of open-source pretrained models to boost detection performance even on small datasets [65].

C. Synthetic Training Data Generation for Fake News Detection
When working with machine learning, especially deep learning, it is important to have a high-quality dataset to train the algorithm. The data should not only be sufficiently large, to cover as many cases as possible, but also be a good representation of reality. The creation of synthetic data can be useful for several reasons, including avoiding the use of the original data for privacy reasons and oversampling minority classes when learning from imbalanced data, a common problem in cybersecurity. Synthetic datasets can also be generated to augment existing datasets and improve prediction results. An open research area is to develop easy-to-use techniques for generating synthetic datasets in various formats, including images and videos, that closely resemble the target data. GANs are an innovative technology that can be leveraged to generate synthetic data in various formats for fake news detection.
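A much simpler, non-GAN stand-in for the oversampling idea is to duplicate minority-class texts with a light random perturbation. The word-dropout trick below is an invented illustration of class balancing, not a substitute for proper synthetic data generation.

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Balance classes by duplicating (with light word-dropout) minority texts."""
    rng = random.Random(seed)
    by_label = {}
    for s, l in zip(samples, labels):
        by_label.setdefault(l, []).append(s)
    target = max(len(v) for v in by_label.values())
    out = []
    for label, texts in by_label.items():
        out.extend((t, label) for t in texts)
        while sum(1 for _, l in out if l == label) < target:
            words = rng.choice(texts).split()
            if len(words) > 3:  # drop one word as a cheap perturbation
                words.pop(rng.randrange(len(words)))
            out.append((" ".join(words), label))
    return out

data = oversample_minority(
    ["a b c d", "e f g h", "i j k l", "fake one two three"],
    ["real", "real", "real", "fake"])
```

After balancing, both classes contribute equally to training, which is the property the more sophisticated GAN-based generators aim for with far more realistic samples.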

D. User Profile-Based Feature
One of the most important features to be considered in fake news detection is the user profile-based feature [85]. It provides prior knowledge about the profiles spreading fake news across the medium [86] (e.g., number of posts, account age, and number of followers), so that any of their posts can be flagged as suspicious. When considering user-based features, data accessibility is the most important factor: due to privacy constraints, user data are not readily available and user interactions are protected [87], [88]. Previous studies have shown that bots are involved in the propagation of fake news over the Internet, so a bot detection technique that distinguishes a bot account from a normal user's account can help better exploit user profile features in fake news detection [89].
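A few of the profile-based signals mentioned above can be sketched as a simple feature extractor. The field names, example account, and the 30-day "new account" threshold are hypothetical.

```python
def profile_features(account):
    """Derive simple user-profile signals discussed for fake news detection."""
    age_days = max(account["account_age_days"], 1)
    return {
        # posting frequency: extreme rates are a classic bot signal
        "posts_per_day": account["num_posts"] / age_days,
        # follower/following imbalance: spam accounts follow far more than they attract
        "followers_per_following": account["followers"] / max(account["following"], 1),
        "is_new_account": age_days < 30,
    }

suspect = profile_features(
    {"num_posts": 900, "account_age_days": 10, "followers": 15, "following": 3000})
```

A brand-new account posting 90 times a day while following 200 users for every follower is exactly the kind of profile such features are meant to surface.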

E. Multimodal Approach
A serious shortcoming of previous research on fake news identification is the inability to learn a joint feature representation of multimodal (textual + visual) information [90]. Some methods focus on multimodal detection, but they employ an additional subtask to discriminate events and find correlations across modalities. The outcomes are heavily dependent on these subtasks, and their absence can reduce detector performance by up to 10% [90]. Future research could therefore pursue a multimodal approach that combines different learning techniques, leveraging the strengths of each, to form an efficient and robust fake news detector for textual-plus-visual information. Such an approach would also aid in solving the problem of multiclass fake news detection.

F. New and Efficient Methods for Dataset Construction
Developing semiautomated techniques for constructing datasets and gathering data from reliable sources and credible knowledge bases can be considered a productive and exciting field of study. For instance, in [62], a model was developed for COVID-19 using ground truth obtained from credible and unbiased information sources; as the authors suggest, it could be applied to other topics, domains, platforms, and languages to evaluate its efficiency thoroughly.

G. Extracting New Discriminatory Features
As we know, developing accurate models requires, in addition to clean and exact data and strong classifiers, discriminating features; as discussed in [46], there is not yet a consensus on which combination of feature sets is optimal for the early detection of fake news, even under specific conditions. Even though one benefit of deep learning techniques is eliminating the need for manual feature extraction, training such models with proper features, as opposed to all unprocessed data, can have a positive impact on performance, as seen in [56]. Thus, extracting and analyzing new features capable of revealing different aspects of news deserves further study. For instance, in [63], only the sentence depths of news articles are considered as a new discriminating feature, while more helpful characteristics could be extracted from the sentence patterns of an article through grammatical analysis and the full set of 130 NELA features.

H. Detection of Online Religious Content
Social media platforms are often bombarded with fake religious content [91]-[93] intended to create panic among people. There have been numerous attempts to alter religious content [94]-[97] and spread fake information [98] by sharing tampered verses through different media outlets. Hence, detecting fake religious content, which can appear in different languages such as Arabic [99], Portuguese [100], Persian, and Chinese, is a future research direction that needs to be addressed.

I. AI Explainable Multimodal Credibility Analysis Systems
With recent advances in social media systems, the line between fake news and fact has blurred. A promising research direction is the field of AI explainable credibility analysis systems for fake news on social media. A manual censor board is neither efficient enough nor capable of handling news at large scale to detect fake news and explain why it is fake [101]. One such automated framework exists for health blogs: it gives credibility scores for the author, text, and images of a blog and also provides user-friendly explanations. Such frameworks can be further extended to other domains, such as political agendas, education, and health misinformation, and should also take into account the credibility of the author's claims when detecting misinformation [102].

V. CONCLUSION
With the development of the digital world, online fake content is increasing drastically. Being readily accessible to the general public, such widespread fake content can set back journalism and democracy by misleading people. The decisions and opinions of the public are greatly influenced by false content, which spreads much faster and leaves a greater impact. Such content gains popularity because many social media users are unaware of certain topics and are easily deluded by fake content. Other reasons include people's reliance on online media platforms and the catchy framing of fake news. With ever-increasing fake content, research on fake news detection is also progressing, and various approaches from many domains have been implemented, ranging from AI to linguistics and knowledge engineering, but no ideal methodology has yet been devised that can accurately separate real news from fake news. The major challenge in this task is the exponential daily growth of social media content. Social sites and apps, such as Facebook, Twitter, Instagram, and blogs, have enabled people to post any unchecked content, including their personal opinions and thoughts, which makes devising a fake news detection algorithm quite challenging. In this article, a thorough study of existing fake news detection techniques, along with a new taxonomy, is presented, and the major challenges of fake news detection are discussed with future recommendations to further improve this area.

TABLE II DIFFERENT ASPECTS OF FEATURES

TABLE VI QUALITATIVE ANALYSIS OF DATASETS IN EARLY DETECTION TECHNIQUES

TABLE VII QUALITATIVE ANALYSIS OF DATASETS IN STANCE DETECTION STUDIES

TABLE VIII QUALITATIVE ANALYSIS OF DATASETS IN FEATURE-BASED DETECTION TECHNIQUES

TABLE IX QUALITATIVE ANALYSIS OF DATASETS IN ENSEMBLE LEARNING-BASED DETECTION TECHNIQUES

TABLE X EVALUATION METRICS FOR BEST PERFORMING METHOD IN THE AFOREMENTIONED STUDIES