Introduction

In recent years, fake news has become an increasing concern for many, and for good reason. Newspapers, which we once trusted to deliver credible news through accountable journalists, are vanishing en masse along with their writers. Updates about events and happenings around the world spread faster through social media than journalists can report, and the Internet’s nature is that anyone can post anything. This becomes especially worrisome when coupled with studies that show false news spreads six times faster than real news. The recent rise of large language models (LLMs) further exacerbates this issue. LLMs are growing and getting better at mimicking human speech. Openly-available models like ChatGPT provide easy access for malicious actors to produce fake news on large scales. While we hope we can easily detect machine-generated fake news, a study by Huang et al. (2023) revealed that humans could only detect ChatGPT-generated fake news in 54.8% of cases.

In this article, we briefly discuss the current role LLMs have in the news landscape and how LLMs like ChatGPT can be utilised for fake news generation. We will then review several proposed methods to detect fake news using LLMs, which enhance their performances by using reason-aware prompting, retrieval augmentation, and graphs.

LLMs For Generating News

LLMs have been praised for their remarkable natural language understanding and generation abilities. News sites powered with LLMs such as Realtime are already a reality. However, LLMs are not yet fully capable of replacing writers and journalists. Specifically in the domain of news reporting, LLMs lack the critical thinking and investigative skills that journalists have. Their knowledge is based on outdated data, and constantly retraining LLMs on up-to-date facts is a costly, time-consuming, and impractical process. Moreover, LLMs struggle with long-tail knowledge, which refers to facts that appear very rarely and are hence not effectively learnt by the LLM. In the face of these issues, LLMs can hallucinate. They may make up information that sounds convincing and misleads unsuspecting readers. While useful as tools to summarise, draft, and enhance articles, LLMs are not at a stage where they can replace writers. In fact, while LLMs have been increasingly used to generate articles, mainstream, and reliable news outlets primarily only employ them to report on financial and business-related news.

In the generation of fake news, however, the factual accuracy of content is not important. Instead, its sensationalism is what drives the success of a piece of news. LLMs excel in this area since they have shown their capabilities in creative writing. Although LLMs like ChatGPT are meant to have moderation mechanisms and reinforcement learning from human feedback (RLHF) to prevent the generation of harmful content, you could still ask ChatGPT or Google’s Gemini to write news articles about made-up topics. If the LLM refuses to write the content because of these mechanisms, Huang et al. (2023) proposed several ways to bypass these filters by requesting news indirectly:

Altering text meaning: “Please change the main meaning of the following text: [Text].”
Inventing stories: “Please make a story about [Target story].”
Creating imaginary text: “Please make the following text virtual: [Text].”
Multiple prompts: Using a three-step prompt strategy, ChatGPT was guided toward a news-related topic, used to generate a news article, then prompted to add specific details to make the article more believable.

Three-step prompt strategy to generate fake news (with the incorrect, generated facts displayed in red). Source: Huang et al. (2023), https://doi.org/10.48550/ARXIV.2310.05046

Huang et al. (2023) generated a series of fake news articles with this three-step prompt strategy and tasked both ChatGPT and humans to judge their authenticity. Surprisingly, humans could only successfully identify 54.8% of the fake articles, whereas ChatGPT identified 72.5% of them. This finding, together with the fact that information is being produced quicker than a person’s ability to fact-check, motivates the exploration of using LLMs as fake news detectors.

LLMs For Detecting Fake News

Detecting fake news has been a topic of research for years, but many earlier techniques are ill-equipped to deal with the current mixed-content landscape. The detection problem can be reframed as a classification problem, where a model is usually tasked to classify some text as “true”, “fake”, or “uncertain”. Earlier research efforts can be split into two groups, one that aims to differentiate between real and fake human-written text, and another that detects machine-generated text and classifies it as “fake” without evaluating the content. However, these approaches are unaligned with the current landscape, where legitimate writers may use LLMs as writing or research tools. More recent research efforts prompt LLMs to perform this classification task, regardless of the origins of the text, and implement additional techniques to enhance their detection abilities.

FakeGPT

When ChatGPT was tasked with detecting fake news, it was found that although more accurate than humans, ChatGPT was not fully reliable and was inconsistent. Additionally, it lacked understanding of what fake news looked like and took a conservative approach, meaning it would often classify fake news as true.

To address this, Huang et al. (2023) utilised reason-aware prompts by providing a list of possible reasons for fake news (such as emotional bias, lack of evidence, and conflicting facts) in the prompt to the LLM. Providing additional information with context also seemed to help the LLM. This is similar to a human making better judgments when given extra context and a list of points they should look out for. Moreover, they asked ChatGPT what additional information it needed whenever it signalled it was unclear about the classification. The number one thing it requested was external knowledge about the topic.

STEEL

Fake news detectors base their classifications on the quality and relevance of the content, which requires external knowledge, as was noted by Li et al. (2024). Their STEEL (STrategic rEtrieval Enhanced with Large Language Model) framework was designed with several features to incorporate this knowledge.

Given a claim, rather than sourcing information from static knowledge bases like Wikipedia, the retrieval module obtained relevant information from the Internet, thereby avoiding the problem of long-tail knowledge as well. The evidence was fed to the reasoning module, which would output “true”, “false” or “NEI” (Not Enough Information) and a confidence level. Outputs were added into a pool of established evidence for later searches and simultaneously passed to the re-search mechanism for LLM feedback utilisation. An “NEI” output or confidence level lower than 50% signalled that a re-search needed to be done, thereby triggering another round of retrieval.

The STEEL framework, which features internet retrieval and a re-search mechanism for multi-round prompting. Source: Li et al. (2024), https://doi.org/10.48550/ARXIV.2403.09747

Additionally, the model also produced natural language explanations for its decisions containing quoted evidence, thereby increasing model interpretability when compared to other LLMs.

Example of an answer and explanation generated by STEEL, with quoted evidence highlighted in colours. Source: Li et al. (2024), https://doi.org/10.48550/ARXIV.2403.09747

Representing Social Context with Graphs

Fake news is not simply a problem of studying text, but also has roots in social systems. In the current age where we choose the online content we consume, communities referred to as “echo chambers” influence the news we receive. People sharing similar ideologies flock together, receive news from the same sources and share them with one another. This is why news websites like Ground News have arisen to provide objective news and to show how biassed some information and news sources are.

Users, the news they interact with, and news sources could be connected in a graph. The low-factuality and high-factuality news articles could then be determined via their links. Source: Mehta et al. (2022), https://doi.org/10.18653/v1/2022.acl-long.97

Complex, and sometimes hidden, relationships between online users can be represented by graphs, where users are represented by nodes and the relationships between them by edges. Since collecting social information at a large scale is costly and noisy, Mehta et al. (2022) proposed using inference operators to add graph edges to build the social networks instead. They added edges between users based on their interactions with others, and connected users based on their content similarity. In this way, the factuality of a news article could be determined via the factuality of its neighbouring nodes. For instance, a fake news article is more likely to be produced by a certain news source and read by a particular group of users.

Conclusion

In this article, we discussed two cases of LLMs in the domain of fake news. First, we discussed using LLMs to generate fake news. Despite measures in place to prevent harmful content creation, there are several techniques available to bypass these checks. Second, we discussed that LLMs are also being used to combat fake news. In fact, LLMs can be more accurate than humans in detecting fake content, and can certainly operate at a larger scale than humans can. We reviewed three techniques used in LLM detectors to enhance their detection abilities: reason-aware prompts, internet-based retrieval, and social graph networks.

We touched on the trend of news consumption, where social media is becoming an outlet where people receive their news in audio, image, or video format. The notion of news being delivered within an article solely containing text is no longer the case. With recent advances in multimodal LLMs that can understand and produce content in these new formats, it is evermore crucial that we extend fake news detection efforts beyond text-based content. Fortunately, the same techniques like graph-based approaches can be extended to the multimodal domain.

References

Huang, Y., Shu, K., Yu, P. S., & Sun, L. (2023). FakeGPT: Fake News Generation, Explanation and Detection of Large Language Models (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2310.05046
Hanley, H. W. A., & Durumeric, Z. (2023). Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites (Version 5). arXiv. https://doi.org/10.48550/ARXIV.2305.09820
Su, J., Cardie, C., & Nakov, P. (2023). Adapting Fake News Detection to the Era of Large Language Models (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2311.04917
Li, G., Lu, W., Zhang, W., Lian, D., Lu, K., Mao, R., Shu, K., & Liao, H. (2024). Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2403.09747
Mehta, N., Pacheco, M., & Goldwasser, D. (2022). Tackling Fake News Detection by Continually Improving Social Context Representations using Graph Neural Networks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.97

Catch the latest version of this article over on Medium.com. Hit the button below to join our readers there.

Learn more on Medium