Introduction

Question Answering (QA), the ability to interact with data using natural language questions and obtain accurate results, has been a long-standing challenge in computer science dating back to the 1960s. Typically, QA tasks aim to answer natural language questions based on large-scale unstructured text such as Wikipedia passages. With advances in deep learning and transformer architectures, QA models now achieve remarkable performance, enabling them to tackle a wide range of questions across diverse domains.

Overview of how a simple QA model works. The input question (light yellow), together with a number of retrieved relevant documents such as Wikipedia passages (light blue), passes through the QA model (here, a generative seq2seq model) to produce an output response. (Andriopoulos et al., 2023) Article Link: https://arxiv.org/abs/2309.16459

A knowledge graph, such as Wikidata, contains rich relational information between entities, many of which can be mapped to corresponding mentions in questions and retrieved passages. Research on combining QA with Knowledge Graphs has advanced in multiple directions over the past few years, spanning both general QA methods and approaches to benchmarking them.

In this post, we introduce some of the latest findings on the use of Knowledge Graphs in QA, as well as in the related task of Information Seeking Question (ISQ) generation.

GRAPE

A common thread among modern open-domain QA models is the retriever-reader pipeline, in which a retriever fetches a handful of passages relevant to a given question (e.g. from Wikipedia), and a reader infers a final answer from the retrieved passages. Although these methods have achieved remarkable advances on various open-domain QA benchmarks, state-of-the-art readers, such as Fusion-in-Decoder, still often produce answers that contradict the facts.
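
As a rough illustration of the retriever-reader idea (a minimal sketch, not Fusion-in-Decoder itself), the snippet below mocks the retriever with hard-coded passages and uses an off-the-shelf seq2seq model from the transformers library as the reader; the model name is only an example choice.

```python
# Minimal retriever-reader sketch (illustrative only).
# Assumes the `transformers` library; a real system would use a dense retriever
# (e.g. DPR) over Wikipedia instead of the hard-coded passages below.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-small"  # assumption: any seq2seq reader would do for this sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
reader = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Stand-in retriever: returns fixed passages instead of searching Wikipedia."""
    corpus = [
        "Paris is the capital and most populous city of France.",
        "France is a country in Western Europe.",
    ]
    return corpus[:k]

def answer(question: str) -> str:
    # Naive reader: concatenate question and passages into one prompt.
    # (Fusion-in-Decoder instead encodes each passage separately and fuses them in the decoder.)
    passages = retrieve(question)
    prompt = f"question: {question} context: " + " ".join(passages)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = reader.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(answer("What is the capital of France?"))
```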

To address this, rather than further improving the retriever, a novel knowledge GRAph enhanced PassagE reader (GRAPE) was recently proposed to improve reader performance for open-domain QA.

GRAPE adopts a retriever-reader pipeline as well. Specifically, when given a question:

  1. It first uses DPR (Dense Passage Retrieval) to retrieve the top-k relevant passages from Wikipedia.
  2. Then, to peruse the retrieved passages, it constructs a localised bipartite graph (a graph whose vertices can be partitioned into two sets such that no two vertices within the same set are connected by an edge) for each pair of question and passage. The constructed graphs possess tractable yet rich knowledge about the facts among connected entities.
  3. Finally, with the curated graphs, a relation-aware graph neural network (GNN, a model class commonly used for classification or prediction on graph-structured data) learns the structured facts and generates the answers (a minimal sketch of such a layer appears at the end of this section).
Given a question-passage pair, GRAPE constructs a localised bipartite graph. Entities are extracted from the question and the passage separately and marked with two different special tokens to build a bipartite graph for later use. (Ju et al., 2022) Article Link: https://arxiv.org/abs/2210.02933

Elaborating on the figure above, GRAPE analyses a question together with a retrieved passage. It identifies important entities (names, places, etc.) in both the question and the passage, and marks them with special tokens in the text. Finally, a bipartite graph is constructed with two sets of nodes, one for question entities and one for passage entities, and edges are drawn between a question entity and a passage entity when a relation between them is found in the knowledge graph.
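
A simplified sketch of this graph-construction step is shown below, assuming spaCy's named-entity recognizer as a stand-in for GRAPE's entity linking and naively connecting every question entity to every passage entity; GRAPE itself keeps only edges backed by relations in the knowledge graph.

```python
# Sketch of building a localised question-passage bipartite graph (simplified).
# Assumption: spaCy NER stands in for GRAPE's entity linking against a knowledge graph.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

Q_TOKEN, P_TOKEN = "[QENT]", "[PENT]"  # special tokens marking question/passage entities

def extract_entities(text: str) -> list[str]:
    return [ent.text for ent in nlp(text).ents]

def mark_entities(text: str, entities: list[str], token: str) -> str:
    # Insert a special token before each entity mention so the reader can locate it.
    for ent in entities:
        text = text.replace(ent, f"{token} {ent}")
    return text

def build_bipartite_graph(question: str, passage: str):
    q_ents = extract_entities(question)
    p_ents = extract_entities(passage)
    # In GRAPE, an edge is kept only if the knowledge graph contains a relation
    # between the two entities; here we naively connect every pair.
    edges = [(q, p) for q in q_ents for p in p_ents]
    return mark_entities(question, q_ents, Q_TOKEN), mark_entities(passage, p_ents, P_TOKEN), edges

q = "Who founded Microsoft?"
p = "Microsoft was founded by Bill Gates and Paul Allen in Albuquerque."
print(build_bipartite_graph(q, p))
```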

Experiments show that GRAPE achieves superior performance across multiple benchmarks compared with other baselines, given the same retriever and the same set of retrieved passages, which points to exciting potential for future applications.
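
To make the relation-aware GNN in step 3 a bit more concrete, here is a generic, R-GCN-style message-passing layer in PyTorch, where each relation type gets its own weight matrix; this is an illustrative sketch, not GRAPE's actual reader architecture.

```python
# Minimal relation-aware message-passing layer (R-GCN style), illustrative only.
import torch
import torch.nn as nn

class RelationAwareLayer(nn.Module):
    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        # One transformation per relation type, plus a self-loop transform.
        self.rel_weights = nn.Parameter(torch.randn(num_relations, dim, dim) * 0.01)
        self.self_loop = nn.Linear(dim, dim)

    def forward(self, node_feats, edges, edge_types):
        # node_feats: (num_nodes, dim); edges: list of (src, dst); edge_types: one relation id per edge.
        out = self.self_loop(node_feats)
        for (src, dst), rel in zip(edges, edge_types):
            message = node_feats[src] @ self.rel_weights[rel]  # relation-specific message
            out = out.index_add(0, torch.tensor([dst]), message.unsqueeze(0))
        return torch.relu(out)

# Toy usage: three entity nodes and two relation types.
layer = RelationAwareLayer(dim=8, num_relations=2)
feats = torch.randn(3, 8)
print(layer(feats, edges=[(0, 1), (2, 1)], edge_types=[0, 1]).shape)  # torch.Size([3, 8])
```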

Benchmarking

Benchmarking is an important aspect of evaluating a QA model, since it provides a standardised framework for comparison against different baselines. However, according to recent research, existing Question Answering and Text-to-SQL (automatically generating SQL queries from natural language) benchmarks, although valuable, are often misaligned with real-world enterprise settings. These benchmarks frequently disregard questions that are crucial for operational and strategic planning in an enterprise, such as questions related to business reporting, metrics, and key performance indicators (KPIs). In a company like Accenture, for instance, a QA model might handle various simple questions about the enterprise, yet fail when asked to support a business report for a particular project.

To address this, the article A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model’s Accuracy for Question Answering on Enterprise SQL Databases (Sequeda et al., 2023) recently proposed a benchmark for comparing how well Large Language Models (LLMs) answer enterprise natural language questions over enterprise SQL databases. The benchmark consists of an enterprise SQL schema, 43 natural language questions, and a context layer that can be used to create a knowledge graph representation of the SQL database. Its focus is on investigating to what extent LLMs can accurately answer enterprise natural language questions over enterprise SQL databases, and to what extent a Knowledge Graph can improve that accuracy compared with the traditional direct-to-SQL approach.
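
To illustrate the two setups being compared, the hypothetical sketch below contrasts prompting an LLM to write SQL directly over the relational schema with prompting it to write SPARQL over a knowledge graph representation built from the context layer; the schema and ontology snippets are invented for the example, not the benchmark's actual data model.

```python
# Hypothetical sketch of the two setups compared by the benchmark.
# The schema and ontology snippets below are invented examples.

question = "What is the total claim amount per policy for 2023?"

# Setup 1: the LLM is asked to generate SQL directly over the enterprise relational schema.
sql_prompt = (
    "Given the SQL schema:\n"
    "  CREATE TABLE Claim (claim_id INT, policy_id INT, amount DECIMAL, claim_date DATE);\n"
    "  CREATE TABLE Policy (policy_id INT, holder_name VARCHAR(100));\n"
    f"Write a SQL query that answers: {question}"
)

# Setup 2: the LLM is asked to generate SPARQL over a knowledge graph representation of the
# same database, where the context layer (ontology + mappings) exposes business terms explicitly.
sparql_prompt = (
    "Given an ontology with classes :Claim and :Policy and properties\n"
    "  :claimAmount, :claimDate and :onPolicy,\n"
    f"Write a SPARQL query that answers: {question}"
)

# Either prompt would be sent to the LLM under test; the generated query is then executed
# (directly, or via the knowledge graph) and the returned answer is checked for accuracy.
for prompt in (sql_prompt, sparql_prompt):
    print(prompt, end="\n\n")
```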

Experimental results on this newly developed benchmark show that answering over a knowledge graph representation of the SQL database generally yields higher accuracy than answering directly over the SQL database, which suggests promising avenues for future research.

ISEEQ

Beyond answering questions, Knowledge Graphs can also be used to enhance question generation. Information Seeking (IS) is a complex, structured process in human learning that demands lengthy discourse between seekers and providers to meet the seeker's information needs; the provider asks the seeker information-seeking questions (ISQs) to better understand those needs and respond appropriately. For instance, clinicians use their experience and medical knowledge to ask patients ISQs about their health conditions (e.g. if a patient is experiencing hair loss, it could indicate an iron deficiency, so the clinician asks probing questions related to that).

To address some current challenges and the open problem of ISQ generation, an Information SEEking Question generator (ISEEQ) enhanced by Knowledge Graphs was proposed recently. Its main components are a semantic query expander (SQE), a knowledge-aware passage retriever (KPR), and a generative-adversarial reinforcement-learning-based question generator (ISEEQ-RL), together with a variant trained under entailment constraints (ISEEQ-ERL).

Specifically:

  • SQE: Expands possibly short user queries by iteratively extracting related information from the ConceptNet commonsense knowledge graph (CNetKG); a rough sketch of this idea appears after the figure below.
  • KPR: Retrieves candidate passages from a collection and ranks them to obtain the top-K passages.
  • ISEEQ-RL and ISEEQ-ERL: Question Generation (QG) models that learn QG in a generative-adversarial setting (where two parts of the model, a generator and a discriminator, compete with each other), guided by a reward function. The difference between ISEEQ-RL and ISEEQ-ERL lies mainly in their loss functions (a toy illustration of this adversarial setup follows the experimental results below).
Overview of ISEEQ. ISEEQ combines a BERT-based constituency parser, Semantic Query Expander (SQE), and Knowledge-aware Passage Retriever (KPR) to provide relevant context to a Question Generation (QG) model for ISQ generation. (Gaur et al., 2021) Article Link: https://arxiv.org/abs/2112.07622
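
As a rough illustration of the SQE idea (not ISEEQ's actual implementation), the snippet below queries ConceptNet's public API for concepts related to key terms in a short user query and appends them as expansion terms for the passage retriever.

```python
# Rough sketch of semantic query expansion with ConceptNet (not ISEEQ's actual SQE).
# Uses ConceptNet's public REST API; network access is required.
import requests

def conceptnet_neighbors(term: str, limit: int = 5) -> list[str]:
    """Return surface labels of concepts connected to `term` in ConceptNet."""
    url = f"https://api.conceptnet.io/c/en/{term.lower().replace(' ', '_')}"
    edges = requests.get(url, params={"limit": limit}, timeout=10).json().get("edges", [])
    neighbors = []
    for edge in edges:
        for side in ("start", "end"):
            label = edge[side].get("label", "")
            if label and label.lower() != term.lower():
                neighbors.append(label)
    return neighbors

def expand_query(query: str, key_terms: list[str]) -> str:
    # Append related concepts so the passage retriever (KPR) has more to match on.
    expansions = []
    for term in key_terms:
        expansions.extend(conceptnet_neighbors(term))
    return query + " " + " ".join(dict.fromkeys(expansions))  # de-duplicate, keep order

print(expand_query("I am losing a lot of hair lately", key_terms=["hair loss"]))
```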

Experimental results show that ISEEQ, using commonsense knowledge, entailment constraints, and self-guidance through reinforcement learning within a supervised generative-adversarial training setup, outperformed competitive baselines, opening exciting paths for future research.
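
To give a feel for the generative-adversarial reinforcement-learning setup described above, here is a toy sketch in which a generator samples candidate question tokens, a fixed (untrained) discriminator scores them, and that score is used as a REINFORCE reward; it is purely illustrative and far simpler than ISEEQ-RL.

```python
# Toy sketch of generative-adversarial question generation with a REINFORCE-style reward.
# This is a minimal illustration, not ISEEQ's actual training code.
import torch
import torch.nn as nn

vocab = ["how", "long", "have", "you", "had", "hair", "loss", "?"]
V, MAX_LEN = len(vocab), 6

generator_logits = nn.Parameter(torch.zeros(MAX_LEN, V))      # toy generator: per-step token distribution
discriminator = nn.Sequential(nn.Linear(V, 1), nn.Sigmoid())   # toy discriminator: scores a bag-of-words
gen_opt = torch.optim.Adam([generator_logits], lr=0.1)

for _ in range(100):
    # Sample a candidate question and keep its log-probability for the policy gradient.
    dist = torch.distributions.Categorical(logits=generator_logits)
    tokens = dist.sample()                     # (MAX_LEN,) sampled token ids
    log_prob = dist.log_prob(tokens).sum()
    # Reward: discriminator's belief that the sample looks like a real information-seeking question.
    bow = torch.zeros(V).scatter_add_(0, tokens, torch.ones(MAX_LEN))
    reward = discriminator(bow).squeeze()
    # REINFORCE update: increase the probability of samples the discriminator rewards.
    gen_loss = -reward.detach() * log_prob
    gen_opt.zero_grad()
    gen_loss.backward()
    gen_opt.step()
    # (In a full adversarial setup the discriminator would also be trained on real vs. generated questions.)

print(" ".join(vocab[int(i)] for i in torch.argmax(generator_logits, dim=-1)))
```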

Conclusion

The field of Question Answering has witnessed significant advancements, particularly through the use of Knowledge Graphs, opening exciting possibilities across several research directions, including modelling techniques and benchmarking methodologies. In this post, we discussed three recent findings on the use of Knowledge Graphs in QA and ISQ generation, and highlighted some promising directions for future exploration of these approaches. Through continued investigation and refinement, Knowledge Graphs can open up exciting opportunities in the field of QA.

References

  • Andriopoulos, K., & Pouwelse, J. (2023). Augmenting LLMs with Knowledge: A survey on hallucination prevention (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2309.16459
  • Ju, M., Yu, W., Zhao, T., Zhang, C., & Ye, Y. (2022). Grape: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2210.02933
  • Sequeda, J., Allemang, D., & Jacob, B. (2023). A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model’s Accuracy for Question Answering on Enterprise SQL Databases (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2311.07509
  • Gaur, M., Gunaratna, K., Srinivasan, V., & Jin, H. (2021). ISEEQ: Information Seeking Question Generation using Dynamic Meta-Information Retrieval and Knowledge Graphs (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2112.07622
