Harnessing The Power of Knowledge Graphs in Question Answering
Unlocking the Power of Questions — A deep dive into Question Answering Systems
Introduction
Virtual assistants have popped up on numerous websites over the years. These assistants are built on question answering (QA) systems, which are gaining traction as response generation becomes increasingly automated, improving applications like search engines and chatbots. As the name implies, QA involves users posing queries to machines, which in turn provide relevant, well-crafted answers. Active research aims to further enhance these systems, with one notable avenue being the integration of knowledge graphs, which allow models to understand associations between entities for added context.
This article will start off with a brief introduction to QA systems. It will then discuss various subfields of QA, including conversational QA, open-domain QA, multi-document QA and visual QA. Along the way, solutions that have been proposed to incorporate knowledge graphs (KGs) into QA systems will be reviewed.
Question Answering Models
There exist many variants of QA models, but they generally fall into two broad classifications:
- Extractive QA models: These simply extract the relevant answer from the given context, which could comprise text, web pages, or other forms of data (a minimal example follows this list).
- Generative QA models: These craft their own textual responses as outputs.
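To make the extractive flavour concrete, the sketch below uses the Hugging Face transformers library to pull an answer span out of a supplied context. The model checkpoint and the example context are illustrative choices, not requirements.

```python
from transformers import pipeline

# Extractive QA: the model selects a span from the given context as the answer.
extractive_qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",  # any extractive QA checkpoint works
)

context = "The Mona Lisa was painted by Leonardo da Vinci in the early 16th century."
result = extractive_qa(question="Who painted the Mona Lisa?", context=context)
print(result["answer"])  # a span copied verbatim from the context
```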
Generative QA models can be further categorised into closed and open types, which draw upon implicit and explicit knowledge respectively.
In closed generative QA, the model generates answers based on the knowledge it already possesses. For instance, if the generator is a large language model (LLM), the system relies on information that the LLM has already been trained on and has internalised into its parameters to generate an output. This approach has limitations as the LLM can only memorise a finite amount of data, so it might not be able to recall supporting evidence for all potential questions. On top of that, the facts stored within the LLM are frozen in time after training, rendering the LLM incapable of utilising up-to-date information.
Meanwhile, in open generative QA, the model accesses supporting information from external knowledge bases to aid in response generation. This resembles extractive QA, but with an additional generative component that allows the model to formulate its own responses using facts from documents. External knowledge could take various forms such as text documents, Wikipedia articles, or even knowledge graphs.
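To illustrate the open generative setup, here is a minimal retrieve-then-generate sketch. The toy word-overlap retriever and the Flan-T5 checkpoint are stand-ins chosen for brevity; a real system would use a proper retriever and a stronger generator.

```python
from transformers import pipeline

# A toy external knowledge base of text snippets (illustrative only).
documents = [
    "The Mona Lisa was painted by Leonardo da Vinci.",
    "The Louvre is located in Paris, France.",
]

def retrieve(question: str, docs: list[str]) -> str:
    # Crude lexical retriever: pick the document with the most word overlap.
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

generator = pipeline("text2text-generation", model="google/flan-t5-base")

question = "Who painted the Mona Lisa?"
evidence = retrieve(question, documents)
prompt = f"Answer using the context.\ncontext: {evidence}\nquestion: {question}"
print(generator(prompt, max_new_tokens=32)[0]["generated_text"])
```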
A Brief Introduction to Knowledge Graphs
Knowledge graphs are data structures that represent relationships between entities. Consider the classic example "Bob is interested in The Mona Lisa": nodes (typically drawn as circles) represent entities like people and objects, while edges (drawn as arrows) represent the relationships between pairs of nodes. KGs make relationships easier to visualise, analyse and compare. For example, a graph of facts about Bob is far easier to take in than a lengthy description such as "Bob is a Person. Bob is born on 14 July 1990. Bob is a friend of Alice…"
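As a quick illustration of the structure, the snippet below builds that toy graph with networkx, storing each fact as a directed edge labelled with the relationship. The triples mirror the Bob example above; nothing about KGs requires this particular library.

```python
import networkx as nx

# Each fact is a (subject, object) edge labelled with the relationship.
kg = nx.MultiDiGraph()
kg.add_edge("Bob", "Person", relation="is a")
kg.add_edge("Bob", "14 July 1990", relation="is born on")
kg.add_edge("Bob", "Alice", relation="is a friend of")
kg.add_edge("Bob", "The Mona Lisa", relation="is interested in")

# Reading the edges back recovers the natural-language facts.
for subject, obj, data in kg.edges(data=True):
    print(f"{subject} {data['relation']} {obj}")
```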
Leveraging Knowledge Graphs for Question Answering
Conversational QA
Conversational QA involves answering questions whilst taking into account the ongoing context of the conversation. Efforts in this field have been driven by powerful language models like BERT and the GPT series, which are trained on massive amounts of data. The challenge, however, lies in building QA systems that generalise to any domain and can switch between conversational topics without lengthy retraining on extensive datasets for every new topic.
One promising alternative to machine learning-based language models is Converse, a KG-driven system proposed by Oduro-Afriyie and Jamil (2023). Converse follows a three-stage process:
1. The system initiates with a text-to-KG module, where input text is parsed, broken down and subjected to rule-based analysis. Breaking text down in this way allows for generalisation to any domain, thus circumventing the retraining issue associated with machine learning-based models.
2. Next, the input query is processed in the second module to convert it into a graph, using slightly different techniques from the first module due to the structural differences between questions and sentences. The graph is subsequently converted into a SPARQL query to extract information from the KG constructed in the first stage (a small SPARQL sketch follows this list).
3. Finally, the retrieved results from the SPARQL query are transformed into a natural language response by arranging the entities (nodes) and relationships (edges) into coherent sentences.
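To ground the second stage, here is a minimal sketch of querying a KG with SPARQL via rdflib. The namespace and predicate names are invented for illustration, and Converse derives its queries automatically rather than hand-writing them like this.

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # hypothetical namespace for the toy KG

# Stage 1 output: a small KG built from parsed input text.
g = Graph()
g.add((EX.Bob, EX.isInterestedIn, EX.TheMonaLisa))
g.add((EX.Bob, EX.isAFriendOf, EX.Alice))

# Stage 2 output: a SPARQL query derived from "What is Bob interested in?"
query = """
PREFIX ex: <http://example.org/>
SELECT ?thing WHERE { ex:Bob ex:isInterestedIn ?thing . }
"""

# Stage 3 would verbalise these results into a natural-language answer.
for row in g.query(query):
    print(row.thing)  # http://example.org/TheMonaLisa
```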
While this system is capable of answering simple questions, further developments are required to enhance its performance. Nevertheless, it provides some insight into the potential of knowledge graphs to execute QA tasks without the unsustainable requirements for training data and time.
Open-Domain QA
Open-domain question answering (ODQA) presents a slightly different challenge: the model is given a query in any domain, with no other information. Despite the vast array of potential questions, the model is expected to return informative answers backed by facts. Hence, open generative QA models such as the Fusion-in-Decoder (FiD) model are commonly employed for these tasks. FiD has demonstrated strong performance, yet two limitations were noted by Yu et al. (2021) and addressed in their extended model, KG-FiD.
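Before turning to those limitations, here is the core FiD trick: encode each (question, passage) pair independently, then let a single decoder attend over the concatenated encodings. The sketch below wires this up with a vanilla t5-small checkpoint; it shows the mechanics only, since an untuned model will not produce useful answers.

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tok = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = "Who painted the Mona Lisa?"
passages = [
    "The Mona Lisa was painted by Leonardo da Vinci.",
    "The Louvre is located in Paris, France.",
]

# Encode each (question, passage) pair independently with the shared encoder.
inputs = tok(
    [f"question: {question} context: {p}" for p in passages],
    padding="max_length", truncation=True, max_length=64, return_tensors="pt",
)
with torch.no_grad():
    encoded = model.encoder(
        input_ids=inputs.input_ids, attention_mask=inputs.attention_mask
    )

# Fusion: concatenate all passage encodings into one long sequence so the
# decoder's cross-attention can reason over every passage at once.
fused = encoded.last_hidden_state.reshape(1, -1, model.config.d_model)
mask = inputs.attention_mask.reshape(1, -1)
out = model.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=fused),
    attention_mask=mask,
    max_new_tokens=16,
)
print(tok.decode(out[0], skip_special_tokens=True))
```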
Firstly, FiD failed to capture semantic relationships between text passages, which are key to producing better answers. Yu and colleagues therefore organised knowledge from passages into a KG, the data structure suited to encapsulating relationships as edges.
Secondly, FiD had to comb through around 100 passages for each question, resulting in high computational costs. Simply reducing the number of passages was impractical as it compromised response quality. In KG-FiD, a Graph Neural Network (GNN) was instead deployed to rank and filter passages, improving response accuracy while avoiding the high computational cost.
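A rough sketch of the reranking idea, not the actual KG-FiD architecture: given passage embeddings and a passage-graph adjacency matrix (both assumed precomputed here, and randomly generated below), one GNN layer mixes each passage's features with its neighbours' before scoring, and only the top-scoring passages are passed on to the reader.

```python
import torch

# Toy inputs (assumed): passage embeddings X and an adjacency matrix A whose
# edges link passages that are connected in the underlying KG.
num_passages, dim = 5, 8
X = torch.randn(num_passages, dim)
A = torch.eye(num_passages) + torch.rand(num_passages, num_passages).round()

class GCNReranker(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.w = torch.nn.Linear(dim, dim)
        self.score = torch.nn.Linear(dim, 1)

    def forward(self, X, A):
        # One round of message passing: average neighbour features, then project.
        deg = A.sum(dim=1, keepdim=True)
        h = torch.relu(self.w((A @ X) / deg))
        return self.score(h).squeeze(-1)  # one relevance score per passage

reranker = GCNReranker(dim)
scores = reranker(X, A)
top_k = scores.topk(k=3).indices  # keep only the best passages for the reader
print(top_k)
```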
Multi-Document QA
Many works have been dedicated to ODQA, but multi-document question answering (MD-QA) has garnered significantly less attention despite its numerous applications in academic research, financial and legal inquiries, and more. MD-QA involves finding relevant information from multiple data sources while understanding the logical associations amongst them. Adding to the complexity, documents can contain different forms of data, including text, figures and tables, all of which the model must be able to parse and understand.
Common MD-QA question types can be classified into bridging questions, comparing questions and structural questions. Bridging questions require models to reason sequentially, comparing questions require parallel reasoning across multiple passages, and structural questions necessitate retrieving contents from various document structures like specific pages and tables.
Once more, a KG was employed in the proposed Knowledge Graph Prompting (KGP) method (Wang et al., 2023), as it is aptly suited to representing document structures and the lexical or semantic relationships between them. A graph traversal agent based on LLMs was trained to select the document structures relevant to answering the question.
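A heavily simplified sketch of the traversal idea: starting from a seed node in the document graph, repeatedly ask a scoring function which neighbour looks most relevant and hop there. The graph, node names and llm_score stub below are all invented stand-ins; in KGP the relevance judgment comes from an LLM, not word overlap.

```python
import networkx as nx

# Toy document graph (assumed): nodes are passages, pages and tables; edges
# denote lexical or structural links between them.
G = nx.Graph()
G.add_edge("page1_intro", "page2_results")
G.add_edge("page2_results", "table_revenue")
G.add_edge("page1_intro", "page3_methods")

def llm_score(question: str, node: str) -> int:
    # Placeholder for the LLM-based relevance judgment; here, crude word overlap.
    q_words = set(question.lower().replace("?", "").split())
    return len(q_words & set(node.replace("_", " ").split()))

def traverse(question: str, start: str, steps: int = 2) -> list[str]:
    visited, current = [start], start
    for _ in range(steps):
        neighbours = [n for n in G.neighbors(current) if n not in visited]
        if not neighbours:
            break
        current = max(neighbours, key=lambda n: llm_score(question, n))
        visited.append(current)
    return visited  # the evidence chain handed to the answering LLM

print(traverse("What was the revenue in the results?", "page1_intro"))
```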
Visual QA
The final field of QA discussed in this article is open-domain knowledge-based visual question answering (VQA), whereby the system is tasked with answering questions about an image using information that is not present within the image itself. For example, the question "Which liquid here comes from a citrus fruit?" is presented to the model together with an image, without specifying which object is being referred to.
One approach known as KRISP (Knowledge Reasoning with Implicit and Symbolic rePresentations) was proposed by Marino et al. (2020) to achieve this task using a combination of implicit and explicit (or symbolic) knowledge, as it was observed that neither type of knowledge alone was sufficient. The implicit knowledge stemmed from information internalised within a multimodal transformer model pre-trained with BERT. Concurrently, explicit knowledge was provided by a KG containing the entities recognised in the image, including places, objects, attributes and more. Four types of knowledge (trivia, commonsense, scientific and situational) from various external sources were also incorporated into the KG. Implicit and symbolic answer predictions were made, and the higher-scoring answer was selected as the final output. This approach outperformed other VQA methods and was able to provide rare responses, exhibiting KRISP's generalisability.
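That final selection step reduces to a simple comparison, sketched below with made-up scores: each head nominates its best answer, and the overall higher-scoring nomination wins. The answer strings and numbers are purely illustrative.

```python
# Assumed head outputs for "Which liquid here comes from a citrus fruit?"
implicit_scores = {"lemonade": 0.62, "water": 0.21}         # transformer head
symbolic_scores = {"lemonade": 0.80, "orange juice": 0.55}  # KG head

def fuse(implicit: dict, symbolic: dict) -> str:
    # Each head nominates its best answer; the higher-scoring nomination wins.
    best_implicit = max(implicit.items(), key=lambda kv: kv[1])
    best_symbolic = max(symbolic.items(), key=lambda kv: kv[1])
    return max(best_implicit, best_symbolic, key=lambda kv: kv[1])[0]

print(fuse(implicit_scores, symbolic_scores))  # "lemonade"
```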
Conclusion
This article has explored the use of knowledge graphs across several areas of QA: conversational QA, open-domain QA, multi-document QA and visual QA. It is evident that knowledge graphs are a strong option for representing and retrieving data in QA, especially given their flexibility in representing any type of object or concept as a node and their ability to capture relationships between nodes as edges. Most importantly, these relationships are what enhance retrieval techniques and enable models to generate more accurate and relevant responses.
References
- Oduro-Afriyie, J., & Jamil, H. (2023). Knowledge Graph Enabled Open-Domain Conversational Question Answering. In Flexible Query Answering Systems (pp. 63–76). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-42935-4_6
- Yu, D., Zhu, C., Fang, Y., Yu, W., Wang, S., Xu, Y., Ren, X., Yang, Y., & Zeng, M. (2021). KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2110.04330
- Wang, Y., Lipka, N., Rossi, R. A., Siu, A., Zhang, R., & Derr, T. (2023). Knowledge Graph Prompting for Multi-Document Question Answering (Version 3). arXiv. https://doi.org/10.48550/ARXIV.2308.11730
- Marino, K., Chen, X., Parikh, D., Gupta, A., & Rohrbach, M. (2020). KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2012.11014