Introduction

Large language models (LLMs) like GPT-4 possess remarkable language abilities, allowing them to function as chatbots, translators, and much more. More recent multimodal models, like Google’s Gemini, extend these capabilities to vision, allowing us to generate or analyse images. However, despite their increasing capabilities, LLMs are still not fully trusted by the public. Anyone who has chatted with ChatGPT knows that it can sometimes hallucinate and generate incorrect information. LLMs are also typically treated as black-box models, meaning they are used without any understanding of their inner workings, which deepens this distrust. Since we cannot tell how or why an LLM arrived at an answer, we cannot trust it enough to utilise it in sectors like healthcare and education, even as it becomes more accurate.

In this article, we will discuss why LLMs are difficult to interpret and present existing methods for explaining LLMs. We introduce knowledge graphs (KGs) as an alternative approach to explaining LLMs, inspired by some ways they have been used to explain non-LLM models. Finally, we will review some approaches that use KGs to explain reasoning processes in LLM-based models.

Interpreting Large Language Models

Most artificial intelligence models, including LLMs, aim to be both explainable AI (XAI) models and interpretable models. These two characteristics differ subtly. To quote an article which clearly and intuitively explains this difference, “Explainable AI tells you why it made the decision it did, but not how it arrived at that decision. Interpretable AI tells you how it made the decision, but not why the criteria it used is sensible.” Both explainability and interpretability emphasise model trust and transparency. In this article, we focus on this aspect of transparency by enhancing model interpretability, giving models the ability to explain how and why they make certain decisions.

LLM interpretation, as defined by Singh et al. (2024), is the “extraction of relevant knowledge from an LLM concerning relationships either contained in data or learned by the model”. This definition covers both using LLMs to generate explanations and understanding LLMs themselves, which are the two main research directions in LLM explainability.

An overview of LLM interpretation, which comprises both explaining LLMs themselves and using LLMs to generate explanations for datasets or other resources. Source: Singh et al., 2024 https://doi.org/10.48550/ARXIV.2402.01761

Techniques that employ LLMs to generate explanations for other resources, such as datasets, are depicted in panel (C) of the image above. LLMs can provide natural-language explanations that are easy to understand and interactive, since users can converse with the model to seek further clarification. The second path of explaining LLMs themselves, which we focus on in this article, encompasses many techniques detailed in panel (B). For example, in chain-of-thought prompting, the LLM is asked to explain its reasoning process step by step before arriving at its answer.
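To make this concrete, here is a minimal sketch of a chain-of-thought prompt. The `call_llm` helper and the question are illustrative placeholders, not part of any specific library or paper.

```python
# A minimal sketch of chain-of-thought prompting. `call_llm` is a hypothetical
# placeholder for whatever LLM client you use; it is not a real API.

def call_llm(prompt: str) -> str:
    """Hypothetical helper that sends a prompt to an LLM and returns its reply."""
    raise NotImplementedError("Plug in your own LLM client here.")

question = "A library has 120 books and lends out 45. How many remain?"

# Plain prompt: the model answers directly, so its reasoning stays hidden.
direct_prompt = f"{question}\nAnswer with a single number."

# Chain-of-thought prompt: the model is asked to show intermediate steps,
# which gives us a (self-reported) trace of how it reached the answer.
cot_prompt = (
    f"{question}\n"
    "Let's think step by step. Explain each step of your reasoning, "
    "then state the final answer on its own line."
)

# answer = call_llm(cot_prompt)
```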

Explaining LLMs faces three key challenges.

  1. Hallucination: When LLMs are prompted to explain their reasoning processes, they can provide incorrect and groundless explanations.
  2. Size: LLMs are massive, with billions of parameters, which makes it impossible to inspect them manually.
  3. Opacity and cost: LLMs are typically too large to run locally and incur substantial computational costs. Hence, they are accessed through APIs, meaning users do not have access to the models themselves, only their outputs. Additionally, some LLMs like GPT-4 or Google’s Gemini are proprietary, so technical details about their architecture or training process are hidden from the public.

These factors limit some of the panel (B) methods for explaining LLMs. We therefore explore KGs as an alternative, motivated by how KGs have improved the interpretability of earlier, non-LLM models.

Taking Inspiration from Knowledge Graphs in Non-LLM Models

Knowledge graphs have been utilised in various applications to enhance explainability, thanks to their inherent structure, which captures relationships between entities and thereby their semantics. For instance, Google’s Knowledge Graph, which powers Google Search, lets it distinguish between the company “Apple” and the fruit “apple”; the sketch below shows how a few triples are enough to separate the two senses. By tracing paths through a KG, we can also follow the step-by-step reasoning a model used to arrive at its answer. The following sections walk through some example applications.
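The snippet below is a toy illustration of this disambiguation: it stores a handful of made-up triples (not taken from Google’s Knowledge Graph) and shows that the two senses of “apple” end up as separate nodes with their own relations.

```python
# A toy knowledge graph stored as (subject, relation, object) triples.
# The entities and relations are illustrative only.
triples = [
    ("Apple Inc.", "is_a", "company"),
    ("Apple Inc.", "founded_by", "Steve Jobs"),
    ("apple (fruit)", "is_a", "fruit"),
    ("apple (fruit)", "grows_on", "apple tree"),
]

def describe(entity: str) -> dict:
    """Collect everything the graph says about one entity."""
    return {rel: obj for subj, rel, obj in triples if subj == entity}

# Because "Apple Inc." and "apple (fruit)" are separate nodes with their own
# relations, a system backed by the graph can tell the two senses apart.
print(describe("Apple Inc."))     # {'is_a': 'company', 'founded_by': 'Steve Jobs'}
print(describe("apple (fruit)"))  # {'is_a': 'fruit', 'grows_on': 'apple tree'}
```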

Knowledge-Aware Open-domain Conversation Generation

Having a conversation is like moving from one node to another in a KG, a process also referred to as multi-hop graph reasoning. Building on this idea, the Augmented Knowledge Graph based open-domain Chatting Machine (AKGCM) was introduced in 2019, combining KGs with text data. The KG allowed the machine to identify relevant topics to respond with, while the text data gave the machine the words it needed to express itself. An example conversation is shown in the image below, where a user discusses the movie “I Am Legend” and the model identifies “The Omega Man” as a related movie and comments on it in response. The paths the model took through the KG could then be studied to understand how it functioned and what it had learned.

Example conversation using KGs with the AKGCM model. Source: Liu et al., 2019 https://doi.org/10.48550/ARXIV.1903.10245
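To give a feel for multi-hop graph reasoning, here is a minimal sketch over a hand-made toy movie graph; the entities and edges are illustrative and are not AKGCM’s data. Starting from the movie the user mentioned, a two-hop walk reaches a related movie, and the relation path doubles as an explanation of why it is relevant.

```python
# A minimal sketch of multi-hop reasoning over a toy movie knowledge graph.
from collections import deque

edges = {
    "I Am Legend (2007 film)": [("based_on", "I Am Legend (novel)")],
    "I Am Legend (novel)": [("written_by", "Richard Matheson"),
                            ("adapted_as", "The Omega Man (1971 film)")],
    "The Omega Man (1971 film)": [("starring", "Charlton Heston")],
}

def hops_from(start: str, max_hops: int = 2):
    """Breadth-first walk that records the relation path to each reached node."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        node, path = frontier.popleft()
        if path:
            yield node, path
        if len(path) == max_hops:
            continue
        for relation, neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, path + [relation]))

# Starting from the movie the user mentioned, two hops reach "The Omega Man",
# and the relation path (based_on -> adapted_as) explains why it is relevant.
for node, path in hops_from("I Am Legend (2007 film)"):
    print(node, "via", " -> ".join(path))
```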

Sequential Recommendation

KGs have also been employed to model the evolution of users’ interests and recommend products such as movies. In the Explainable Interaction-driven User Modeling (EIUM) algorithm, the system learned a user’s short-term and long-term interests from their movie history. Integrating KGs into the algorithm also allowed it to discover different paths between the user and a candidate movie, as shown by the six paths in the image below, each assigned a weight (the numbers in red). In this case, the system recommended “Star Wars: Episode VI — Return of the Jedi” because the highest-weighted path, which reflects the model’s reasoning, showed that it is the sequel to “Star Wars: Episode V — The Empire Strikes Back”.

An example of the EIUM algorithm explaining why it chose specific recommendations through path weights. Source: Huang et al., 2019 https://doi.org/10.1145/3343031.3350893
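The path-weighting idea can be sketched in a few lines. The candidate paths and weights below are invented for illustration and are not EIUM’s actual output; the point is that the highest-weighted path both selects the recommendation and serves as its explanation.

```python
# A hand-rolled sketch of how path weights can explain a recommendation,
# loosely modelled on the EIUM example above. Paths and weights are made up.
candidate_paths = [
    # (candidate movie, path through the KG, learned weight)
    ("Return of the Jedi",
     ["user", "watched", "The Empire Strikes Back", "followed_by", "Return of the Jedi"],
     0.42),
    ("Return of the Jedi",
     ["user", "watched", "A New Hope", "same_series", "Return of the Jedi"],
     0.21),
    ("Blade Runner",
     ["user", "watched", "A New Hope", "same_genre", "Blade Runner"],
     0.15),
]

# Score each candidate by its strongest path; keep that path as the explanation.
best = {}
for movie, path, weight in candidate_paths:
    if weight > best.get(movie, (None, -1.0))[1]:
        best[movie] = (path, weight)

recommendation, (reason_path, weight) = max(best.items(), key=lambda kv: kv[1][1])
print(f"Recommend {recommendation} (weight {weight}) because:")
print(" -> ".join(reason_path))
```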

Knowledge Graphs with Large Language Models

Since KGs have proved useful for explaining model behaviour, there have been efforts to integrate them with more recent LLMs.

Using Attention Weights for Interpretability

In 2021, a new model called QA-GNN (Question-answering Graph Neural Network) was proposed to combine knowledge from LLMs and KGs for question answering (QA). The process is depicted in the image below: the QA context, comprising the question and its answer choices, was encoded by an LLM and added to the graph as a node. Relevance scoring was then conducted to identify the nodes most relevant to the QA context, and a graph neural network (GNN) performed the reasoning by computing node-to-node attention weights before outputting probability scores. The most probable answer was then chosen.

Pipeline for QA-GNN. Source: Yasunaga et al., 2021 https://doi.org/10.48550/ARXIV.2104.06378
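As a rough sketch of the relevance-scoring step, the code below embeds the QA context and each candidate KG node, then ranks the nodes by cosine similarity. The `embed` function is a dummy stand-in for a real language-model encoder, and this is not QA-GNN’s exact scoring formulation; the question and node names are illustrative.

```python
# Rank candidate KG nodes by how relevant they are to the QA context.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Dummy text encoder; a real system would use a language-model encoder."""
    rng = np.random.default_rng(sum(map(ord, text)))  # deterministic dummy seed
    return rng.standard_normal(64)

def relevance(qa_context: str, node: str) -> float:
    """Cosine similarity between the QA context and a candidate KG node."""
    a, b = embed(qa_context), embed(node)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

qa_context = "Where would you find a fridge? (A) kitchen (B) garage (C) office"
nodes = ["kitchen", "refrigerator", "car engine", "food storage"]
ranked = sorted(nodes, key=lambda n: relevance(qa_context, n), reverse=True)
print(ranked)  # nodes ordered by (dummy) relevance to the question
```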

The part we focus on here is the attention weights from the GNN. As in the earlier recommender-system example, these attention weights are key to understanding the model’s reasoning process. They were analysed in two ways, shown in panels (a) and (b) of the image below, where best-first search was employed to trace the attention weights. Nodes more relevant to the question received greater weights, indicated by thicker connecting lines. By following these lines, we can see how the model reasoned its way to an answer and judge whether that reasoning was sound.

Attention weights leading to the answer in QA-GNN. Source: Yasunaga et al., 2021 https://doi.org/10.48550/ARXIV.2104.06378
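One way to picture this tracing step is a best-first search over attention-weighted edges, always expanding the strongest edge first until the answer node is reached. The graph, weights, and scoring rule below are illustrative rather than the paper’s exact procedure.

```python
# Trace a high-attention path from the QA-context node to the answer node.
import heapq

# attention[(u, v)] = attention weight from node u to node v (toy values)
attention = {
    ("context", "fridge"): 0.9,
    ("context", "office"): 0.2,
    ("fridge", "kitchen"): 0.8,
    ("fridge", "appliance"): 0.3,
    ("office", "kitchen"): 0.1,
}

def best_first_trace(start: str, goal: str):
    """Return the path whose weakest edge attention is as large as possible."""
    heap = [(-1.0, start, [start])]  # negated score -> max-heap behaviour
    visited = set()
    while heap:
        neg_score, node, path = heapq.heappop(heap)
        if node == goal:
            return path, -neg_score
        if node in visited:
            continue
        visited.add(node)
        for (u, v), w in attention.items():
            if u == node and v not in visited:
                heapq.heappush(heap, (max(neg_score, -w), v, path + [v]))
    return None, 0.0

path, score = best_first_trace("context", "kitchen")
print(" -> ".join(path), f"(weakest-link attention = {score:.2f})")
```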

Making Attention-Based Explanations More Understandable

In a 2023 work proposing LMExplainer, the authors argued that previous attention-based explanations in graphical form could be inconsistent, could misrepresent the reasoning process, and were difficult for humans to understand. The overall structure of LMExplainer resembles that of QA-GNN: LLM encodings are combined with a KG and node weights are updated by a GNN. The distinction lies in an explanation generator added after the GNN to produce more accurate explanations.

The explanation generator starts by sorting nodes by attention weight to retrieve the nodes most relevant to the decision, termed reason-elements. GPT-3.5-turbo (ChatGPT) is then employed to generate two explanations (a sketch of this step follows the list below):

  • Why-choose: Explaining the model’s reasoning process to get to the answer.
  • Why-not-choose: Explaining why the model did not select other options, enhancing the transparency and interpretability of the model.
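A minimal sketch of this step: nodes are ranked by attention weight to obtain reason-elements, which are then slotted into why-choose and why-not-choose prompts. The node names, weights, and prompt wording are invented for illustration, and `call_llm` stands in for an API client; this is not LMExplainer’s exact implementation.

```python
# Turn the most attended nodes into reason-elements and build explanation prompts.
node_attention = {"fridge": 0.31, "kitchen": 0.27, "appliance": 0.12, "office": 0.04}
top_k = 3

# Reason-elements: the most attended nodes, i.e. the evidence the model leaned on.
reason_elements = sorted(node_attention, key=node_attention.get, reverse=True)[:top_k]

question = "Where would you find a fridge?"
chosen, rejected = "kitchen", ["garage", "office"]

why_choose = (
    f"Question: {question}\nChosen answer: {chosen}\n"
    f"Key evidence: {', '.join(reason_elements)}\n"
    "Explain step by step why this evidence supports the chosen answer."
)
why_not_choose = (
    f"Question: {question}\nRejected answers: {', '.join(rejected)}\n"
    f"Key evidence: {', '.join(reason_elements)}\n"
    "Explain why the evidence does not support these alternatives."
)

# explanation = call_llm(why_choose)  # hypothetical LLM client, e.g. GPT-3.5-turbo
```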

The table below shows the explanations generated by LMExplainer. Compared to other models such as PathReasoner and ECQA, the explanations here are more readable and understandable, thus instilling more confidence in the user that the model did not arrive at the answer by chance or through incorrect reasoning.

The explanations for choices generated by LMExplainer. Green and blue are used to highlight logical connectives and the reasoning framework respectively. Source: Chen et al., 2023 https://doi.org/10.48550/ARXIV.2303.16537

Conclusion

We have discussed the problems of hallucination, size, opacity, and cost, which contribute to distrust in language models and their reasoning abilities. Because these models are typically treated as black boxes, we cannot inspect why they produce right or wrong answers. We reviewed several examples of knowledge graphs increasing model interpretability in conversation and recommender systems, which motivated the exploration of knowledge graphs in LLM-based models. QA-GNN was introduced, utilising attention weights to uncover the model’s reasoning process, followed by LMExplainer, which employed an LLM to further improve explanations by providing them in natural language and covering both why-choose and why-not-choose cases. Moving forward, we can anticipate further research into the inner workings of language models, which will deepen our understanding of them and propel their use in real-world applications.

References

  • Singh, C., Inala, J. P., Galley, M., Caruana, R., & Gao, J. (2024). Rethinking Interpretability in the Era of Large Language Models (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2402.01761
  • Liu, Z., Niu, Z.-Y., Wu, H., & Wang, H. (2019). Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs (Version 4). arXiv. https://doi.org/10.48550/ARXIV.1903.10245
  • Huang, X., Fang, Q., Qian, S., Sang, J., Li, Y., & Xu, C. (2019). Explainable Interaction-driven User Modeling over Knowledge Graph for Sequential Recommendation. In Proceedings of the 27th ACM International Conference on Multimedia. MM ’19: The 27th ACM International Conference on Multimedia. ACM. https://doi.org/10.1145/3343031.3350893
  • Yasunaga, M., Ren, H., Bosselut, A., Liang, P., & Leskovec, J. (2021). QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering (Version 5). arXiv. https://doi.org/10.48550/ARXIV.2104.06378
  • Chen, Z., Singh, A. K., & Sra, M. (2023). LMExplainer: a Knowledge-Enhanced Explainer for Language Models (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2303.16537
