Source: Knowledge Graph Technology
This article is about 2,500 words and takes roughly 5 minutes to read.
In this part of our AI series for knowledge management, you will learn how knowledge graphs enhance Retrieval-Augmented Generation (RAG) for information retrieval within companies.
Advanced RAG Process
Introduction
In AI systems used for corporate knowledge management, Retrieval-Augmented Generation (RAG) is a popular architecture that overcomes some limitations of large language models (LLMs).
However, RAG has limitations, including difficulty handling a mix of structured and unstructured corporate data. One way to address these limitations is to combine RAG with knowledge graphs (KG).
In this article, we will explain how Graph RAG (GRAG) enhances traditional RAG methods by providing more accurate and context-rich answers through the use of knowledge graphs.
This should not be confused with other (complementary) methods, where LLMs are used to extract structured information to build knowledge graphs (also known as “Graph RAG”), as seen in Microsoft’s recent library.
The post covers six topics:
1. Overview: Limitations of LLMs and Introduction to RAG
2. Motivation: Limitations of Traditional RAG
3. Popular Science: What is a Knowledge Graph?
4. Solution: Introduction to GRAG
5. In-depth: Understanding the GRAG Process
6. Impact: Performance Implications of GRAG
1. Overview: Limitations of LLMs and Introduction to RAG
Large Language Models (LLMs), such as Llama or Gemini, generate text based on extensive training data. Despite the impressive capabilities of LLMs, they have some limitations in corporate knowledge retrieval:
1. Inaccessibility of Private Information: LLMs are trained on publicly available data, thus lacking company-specific private knowledge.
2. Hallucination: It is well known that LLMs often produce plausible but incorrect responses, referred to as “hallucinations.”
3. Static Knowledge: LLMs’ knowledge is static and limited to the data included in their most recent training.
This means that while LLMs excel at generating text, they perform poorly in knowledge management. Enter Retrieval-Augmented Generation (RAG).
Simple RAG
Retrieval-Augmented Generation (RAG) is an AI architecture that combines external data sources with LLMs (Large Language Models). Its operation is divided into two steps:
1. Retrieval: Retrieve relevant information (“context”) for the user’s query from a database (e.g., a corporate knowledge base).
2. Generation: Instruct the LLM to answer the user’s query based on the retrieved context.
By providing context as a reference for the LLM, RAG addresses the limitations mentioned earlier. For more background on how basic RAG works and how it integrates with LLMs, refer to our previous introductory article or this detailed summary from AppliedAI.
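To make the two steps concrete, here is a minimal sketch of a RAG loop in Python. The `vector_store` object and `call_llm` function are hypothetical placeholders standing in for your retrieval backend and LLM client, not a specific library’s API:

```python
def answer_query(query: str, vector_store, call_llm, top_k: int = 5) -> str:
    """Minimal two-step RAG: retrieve context, then generate an answer."""
    # Step 1 (Retrieval): fetch the most relevant chunks for the query.
    chunks = vector_store.search(query, k=top_k)
    context = "\n\n".join(chunk.text for chunk in chunks)

    # Step 2 (Generation): instruct the LLM to answer using only that context.
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)
```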
2. Motivation: Limitations of RAG
Despite its advantages and popularity, RAG still has limitations when applied to knowledge management. These limitations relate to contextual retrieval of specific company data:
Poor Retrieval with General Models: Retrieval models (embedding encoders) are typically trained on public internet data, so they struggle to find the correct context in domain-specific company knowledge.
Misspelling Tolerance Cuts Both Ways: Embedding encoders can usually tolerate misspellings. This is helpful for general queries (e.g., searching for “curiosty” vs. “curiosity”), but problematic for context-specific queries where near-identical strings denote different entities, such as “Airbus A320” vs. “Airbus A330” (illustrated in the sketch below).
Providing incorrect context to the LLM can contaminate answers with wrong or fabricated facts. Seemingly reasonable answers built on incorrect information undermine user confidence in the system and, worse, can lead to real-world errors.
Some of these issues can be addressed with prompt templates that instruct the LLM to ignore irrelevant information, but this can only improve results to a certain extent.
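To make the second limitation concrete, the following sketch compares embedding similarities with the open-source sentence-transformers library. The model name is just a common example, not the encoder discussed above, and the exact similarity values will vary by model:

```python
# Illustrates why embedding similarity can blur domain-specific identifiers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any general encoder
emb = model.encode(["Airbus A320", "Airbus A330", "curiosity", "curiosty"])

# Misspelling pair: high similarity is desirable here.
print(util.cos_sim(emb[2], emb[3]))
# Distinct aircraft types: similarity is typically also high, so pure
# vector search may retrieve A330 documents for an A320 query.
print(util.cos_sim(emb[0], emb[1]))
```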
Knowledge Graph RAG (GRAG) is an exciting approach to addressing these limitations of company data.
3. Popular Science: What is a Knowledge Graph?
A knowledge graph is a representation of information as entities and their relationships. It is typically stored in a graph database such as Neo4j or Curiosity and has two main components:
Nodes: These represent entities such as objects, documents, or people. For example, a node could be a company (Curiosity) or a location (Munich).
Edges: These define the relationships between nodes. For example, an edge might indicate that Curiosity is located in Munich.
Depending on the type of graph database, nodes and edges can also have attributes.
Knowledge graphs represent entities (documents, people, etc.) as nodes and relationships as edges
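As a toy illustration, the snippet below encodes the example above with the open-source networkx library (graph databases like Neo4j or Curiosity would store the same structure natively):

```python
import networkx as nx

# Build a tiny knowledge graph: nodes are entities, edges are relationships.
kg = nx.MultiDiGraph()
kg.add_node("Curiosity", type="Company")   # node with an attribute
kg.add_node("Munich", type="Location")
kg.add_edge("Curiosity", "Munich", relation="LOCATED_IN")  # edge with attribute

# Traverse the graph: what is Curiosity connected to, and how?
for _, target, data in kg.edges("Curiosity", data=True):
    print(f"Curiosity -[{data['relation']}]-> {target}")
```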
4. Solution: Introduction to Graph RAG
Graph RAG (GRAG) uses knowledge graphs to enhance the performance of RAG in knowledge retrieval. It leverages the structured, linked information stored in the graph to improve retrieval and is becoming increasingly popular.
GRAG works by adding steps to the standard RAG process. It uses information from the graph to enhance the retrieval and filtering of results, which are then sent as context to the LLM to generate answers.
Advanced RAG (Without Knowledge Graph)
GRAG enhances standard RAG in three main ways:
In addition to document embeddings, it allows you to calculate graph embeddings using entities captured in the documents (e.g., part numbers, process references, etc.). These represent meaningful connections in the data and help filter out noise in long documents.
It allows you to filter results based on user context (e.g., department) and entities captured in the query.
It allows you to boost/demote results based on context and captured entities.
The combination of these techniques helps retrieve the best context for the LLM to accurately answer questions while taking structured information into account; the sketch below shows how the three pieces can fit together.
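A minimal sketch of how the three enhancements might combine, assuming candidates are dicts carrying precomputed similarities and graph links (the field names and weights are illustrative assumptions, not Curiosity’s implementation):

```python
def rank_candidates(candidates, user_dept, query_entities,
                    w_doc=0.6, w_graph=0.4):
    """Blend document/graph similarity, then filter and boost via the graph."""
    results = []
    for c in candidates:
        # (1) Combine document and graph embedding similarities.
        score = w_doc * c["doc_sim"] + w_graph * c["graph_sim"]
        # (2) Filter by user context, e.g., the user's department.
        if user_dept not in c["departments"]:
            continue
        # (3) Boost results linked to entities captured in the query.
        overlap = len(query_entities & c["linked_entities"])
        score *= 1.0 + 0.1 * overlap
        results.append((score, c))
    return [c for _, c in sorted(results, key=lambda t: t[0], reverse=True)]
```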
5. In-depth: Understanding the GRAG Process
Diving deeper, the following sections provide a quick overview of how to build a GRAG flow.
Step 1: Preprocess Data and Build Knowledge Graph (Index Time)
Preprocess unstructured text data (including chunking) and add it to the graph, using Named Entity Recognition (NER) to extract entities, references, and relationships.
Add structured data to the knowledge graph (e.g., machine references).
Connect structured data, text, and captured entities to the knowledge graph, creating representations of connections between any given document and other data in the graph.
Train document embedding models on unstructured text.
Train graph embedding models on the object structures connected to text in the graph, for example, using the open-source Catalyst library.
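A hedged sketch of this indexing flow, reusing the `kg` graph from the earlier example and using spaCy as a stand-in for the NER step (the article’s own stack uses Catalyst; chunk size and labels are illustrative):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: any NER model works here

def index_document(kg, doc_id: str, text: str, chunk_size: int = 500):
    """Chunk a document, extract entities, and link everything in the graph."""
    kg.add_node(doc_id, type="Document")
    for i in range(0, len(text), chunk_size):
        chunk = text[i:i + chunk_size]
        chunk_id = f"{doc_id}#chunk{i // chunk_size}"
        kg.add_node(chunk_id, type="Chunk", text=chunk)
        kg.add_edge(doc_id, chunk_id, relation="HAS_CHUNK")
        # Link each recognized entity (e.g., a machine reference) to the chunk.
        for ent in nlp(chunk).ents:
            kg.add_node(ent.text, type=ent.label_)
            kg.add_edge(chunk_id, ent.text, relation="MENTIONS")
```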
Step 2: Process User Queries (Query Time)
When a user submits a query, analyze it to identify key entities and references (e.g., machine references and time periods) to filter candidate documents.
Encode the query text using document embedding models.
Encode the connections of objects linked to the documents using graph embedding models.
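Continuing the sketch, query-time analysis can reuse the same NER model and document encoder from index time (`nlp` and `model` above); a query-side graph embedding would be computed analogously with the trained graph model, omitted here for brevity:

```python
def analyze_query(query: str):
    """Extract entities for graph filtering and encode the query text."""
    query_entities = {ent.text for ent in nlp(query).ents}
    query_vector = model.encode(query)  # document embedding of the query
    return query_entities, query_vector
```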
Step 3: Retrieve and Filter Context (Query Time)
Combine graph and document embeddings to find the best candidate results (also known as “vector search”), for example, using an open-source HNSW library.
Filter candidate results using graph structures, for example, using user context and entities captured from the query.
Further optimize candidate results by boosting or demoting them based on connections in the graph.
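A sketch of this retrieval step with the open-source hnswlib library (one common HNSW implementation). `chunk_vectors`, `chunk_ids`, `chunk_lookup` (int id to graph node), and `kg` are assumed to have been built at index time; all parameters are illustrative:

```python
import hnswlib

dim = 384  # must match the embedding model's output size
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=100_000, ef_construction=200, M=16)
index.add_items(chunk_vectors, chunk_ids)

def retrieve(query_vector, query_entities, k=50, top_n=10):
    """Vector search, then graph-based filtering and boosting."""
    labels, distances = index.knn_query(query_vector, k=k)
    scored = []
    for int_id, dist in zip(labels[0], distances[0]):
        node = chunk_lookup[int_id]
        linked = set(kg.neighbors(node))  # entities connected to this chunk
        if query_entities and not (query_entities & linked):
            continue  # filter: drop chunks unrelated to the query's entities
        boost = 1.0 + 0.1 * len(query_entities & linked)  # reward overlaps
        scored.append(((1.0 - dist) * boost, node))
    return [n for _, n in sorted(scored, reverse=True)[:top_n]]
```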
Step 4: Generate Answers (Query Time)
Provide the user query to the LLM and instruct it to respond based on the context provided by the top-ranked results (chunks).
Return the response to the user, including references to the results (“sources”) to enhance transparency and credibility.
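The final step in the sketch, again with the hypothetical `call_llm` client and the `kg` graph from earlier:

```python
def generate_answer(query: str, top_chunks, call_llm):
    """Prompt the LLM with numbered context passages and return sources."""
    context = "\n\n".join(
        f"[{i + 1}] {kg.nodes[c]['text']}" for i, c in enumerate(top_chunks)
    )
    prompt = (
        "Answer the question using only the numbered context passages below, "
        f"and cite them like [1].\n\nContext:\n{context}\n\nQuestion: {query}"
    )
    answer = call_llm(prompt)
    sources = top_chunks  # return chunk ids as traceable sources
    return answer, sources
```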
6. Impact: Performance Implications of GRAG
Setting up knowledge graphs, extracting entities, training additional models, and adding boosting logic sounds like a daunting task. So, is it really worth it?
From our perspective, the answer is a resounding “Yes!” First, our tightly integrated solution Curiosity enables us to transition from documents to a complete GRAG solution in just a few days.
However, more importantly, the quality of the context provided to the LLM has significantly improved due to preprocessing and filtering steps, which in turn enhances the quality of the responses.
In a project in 2023, we measured the impact of GRAG compared to naive RAG on a large number of proprietary technical documents.
Graph RAG Performance Improvement
We measured performance with NDCG (Normalized Discounted Cumulative Gain), a metric that evaluates ranking retrieval systems by examining the relevance of the ordered lists they return.
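For readers unfamiliar with the metric, here is how NDCG@k is typically computed; the relevance grades in the example are made up for illustration, not the study’s data:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2 of rank."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances, k=10):
    """DCG of the returned ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

# Graded relevance of 10 returned documents (3 = highly relevant).
print(ndcg([3, 2, 0, 1, 0, 0, 2, 0, 0, 0], k=10))
```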
The impact of using graphs for retrieval is profound: for the top 10 documents, the naive RAG system achieved an NDCG score of 29. We estimated that LLM updates alone could raise it to 38. With Graph RAG, however, the NDCG score soared to 67.
The improvements from using the graph were confirmed in tests with expert users, indicating that GRAG is the right choice—at least for this application.
Conclusion
In conclusion, GRAG enhances traditional RAG models by integrating preprocessing and structure into knowledge graphs. This helps account for the complex relationships in corporate data, improving document retrieval and context quality. The improved context aids the LLM in generating more accurate and contextually rich answers.
At Curiosity, we plan to continue using and improving GRAG, and we are excited about the possibilities it brings for corporate knowledge management. We are also interested in how to use LLMs to build knowledge graphs that complement GRAG, thereby completing the interaction loop between LLMs and graph databases.
Editor: Yu Tengkai | Proofreader: Qiu Tingting
About Us
Data Tribe THU, a data science public account backed by the Tsinghua University Big Data Research Center, shares cutting-edge research in data science and big data technology, continuously disseminates data science knowledge, and strives to build a platform for gathering data talent and creating the strongest big data community in China.