In this article, we will explore how Agentic RAG helps to address the limitations of traditional RAG.
RAG Framework
The RAG (Retrieval-Augmented Generation) framework operates in a specific sequence:
Document -> Document Fragments -> Vector Database -> Fragment Retrieval (Top K) -> Large Language Model (LLM)
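The sequence above can be sketched in plain Python. This is a toy illustration, not a production pipeline: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the final LLM call is left out, so the sketch stops at the prompt that would be sent. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_fragments(document: str, size: int = 12) -> list[str]:
    """Split a document into fragments of at most `size` words."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(fragments: list[str]) -> list[tuple[str, Counter]]:
    """Stand-in for a vector database: each fragment paired with its vector."""
    return [(fragment, embed(fragment)) for fragment in fragments]


def retrieve_top_k(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Return the k fragments most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [fragment for fragment, _ in ranked[:k]]


def build_prompt(query: str, fragments: list[str]) -> str:
    """Assemble the text that would be passed to the LLM."""
    context = "\n".join(f"- {fragment}" for fragment in fragments)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Note that only the K retrieved fragments ever reach the prompt. Everything else in the document is invisible to the LLM, which is the root of the problems described next.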
However, this fixed order runs into trouble with certain types of queries.
Problem 1: Summarization
Consider a query like “summarize this document.”
The traditional RAG method retrieves the top K fragments and summarizes only those. A comprehensive summary, however, should cover all fragments of the document, not just the K that happen to match the query best.
Problem 2: Document Comparison
When asked to compare Document A and Document B, basic RAG retrieves the top K fragments most similar to the query and attempts the comparison using only those. This does not accurately reflect the documents as a whole, because the retrieved fragments do not cover the full content of either document.
Problem 3: Structured Data Analysis
Consider a question like “When is the next vacation?”
Answering it takes two steps. First, retrieve the employee’s region from a structured table. Then, based on that region, extract the next vacation from the vacation policy document. The traditional RAG framework cannot accomplish this directly, because its fixed sequence has no step for querying structured data.
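The two-step lookup can be sketched as follows. This is a minimal illustration under stated assumptions: the employee table, the per-region policy data, and every function name are hypothetical sample values, and the policy document is represented as already-parsed dates rather than retrieved text.

```python
from datetime import date
from typing import Optional

# Hypothetical structured table: employee -> region.
EMPLOYEE_REGION = {"alice": "US", "bob": "IN"}

# Hypothetical vacation policy, reduced to parsed dates per region (sample data).
VACATION_POLICY = {
    "US": [date(2025, 7, 4), date(2025, 11, 27)],
    "IN": [date(2025, 8, 15), date(2025, 10, 2)],
}


def lookup_region(employee: str) -> str:
    """Step 1: query the structured table for the employee's region."""
    return EMPLOYEE_REGION[employee]


def next_vacation(employee: str, today: date) -> Optional[date]:
    """Step 2: use the region to pick the next date from the policy data."""
    region = lookup_region(employee)
    upcoming = [day for day in VACATION_POLICY[region] if day >= today]
    return min(upcoming) if upcoming else None
```

The point of the sketch is the dependency between the steps: the second lookup cannot even be phrased until the first one has returned a region, which a single retrieve-then-generate pass does not allow for.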
Problem 4: Multi-Part Questions
Consider a question like “Identify common holidays across all regions.”
Suppose you have a company holiday policy document that covers 120 countries. Because only the top K contexts are passed to the LLM, at most K regions can be compared, where K is the number of chunks passed to the LLM.
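As a rough illustration of that ceiling, assume the policy document yields exactly one chunk per country and that the chunks are already ranked by similarity to the query. Both assumptions, and the names below, are hypothetical.

```python
def regions_seen_by_llm(ranked_chunks: list[dict], k: int) -> set[str]:
    """Return the distinct regions present in the k chunks passed to the LLM."""
    return {chunk["region"] for chunk in ranked_chunks[:k]}


# Assumption: 120 chunks, one per country, already ranked for the query.
ranked_chunks = [{"region": f"country_{i:03d}"} for i in range(120)]

# Fraction of regions the LLM can compare when K = 5.
coverage = len(regions_seen_by_llm(ranked_chunks, k=5)) / len(ranked_chunks)
```

With K = 5 the LLM sees 5 of the 120 regions, so any "common holidays" answer it gives is based on about 4% of the document.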
Agentic RAG can solve these four problems by using custom agents in place of the fixed traditional RAG pipeline.
Agents interact with multiple systems. RAG becomes one of those systems, and agents can call on it when needed.
Agents use large language models (LLMs) to automate reasoning and tool selection. RAG is just one of the tools that agents may decide to use.
Routing Agents
Routing agents are simple agents that route queries to one or more tools. Recall the query “summarize the document,” or a query that combines summary and semantic search: a routing agent can send each one to the appropriate tool or tools.
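A minimal sketch of the routing idea, under stated assumptions: in a real agent an LLM would make the routing decision, whereas here `choose_tools` uses simple keyword rules as a stand-in. The two tools are also stand-ins that return placeholder results, and all names are hypothetical.

```python
from typing import Callable

FRAGMENTS = [
    "Agents route queries to tools.",
    "RAG retrieves the top K fragments.",
    "Summaries should cover every fragment.",
]


def summary_tool(query: str) -> str:
    """Stand-in summarizer: works over ALL fragments, not just the top K."""
    return f"summary over {len(FRAGMENTS)} fragments"


def semantic_search_tool(query: str) -> str:
    """Stand-in retriever: the fragment sharing the most words with the query."""
    words = set(query.lower().split())
    return max(FRAGMENTS, key=lambda f: len(words & set(f.lower().rstrip(".").split())))


TOOLS: dict[str, Callable[[str], str]] = {
    "summary": summary_tool,
    "semantic_search": semantic_search_tool,
}


def choose_tools(query: str) -> list[str]:
    """Stand-in for the LLM's routing decision (keyword rules, not a model)."""
    q = query.lower()
    chosen = []
    if "summar" in q:
        chosen.append("summary")
    if any(word in q for word in ("find", "search", "what")):
        chosen.append("semantic_search")
    # Fall back to semantic search when no rule matches.
    return chosen or ["semantic_search"]


def route(query: str) -> dict[str, str]:
    """Send the query to every chosen tool and collect the results by tool name."""
    return {name: TOOLS[name](query) for name in choose_tools(query)}
```

For example, `route("Summarize the document")` calls only the summary tool, while a query that asks for both a summary and a lookup is sent to both tools.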
Query Planning Agents
Query planning agents break down queries into sub-queries. Each sub-query can be executed in the RAG pipeline.
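The decomposition step can be sketched as follows, using the document comparison query from earlier. This is an illustration only: a real planner would ask an LLM to produce the sub-queries, while `plan` here uses a regular expression as a stand-in, and `run_rag` is a placeholder for a call into a RAG pipeline. All names are hypothetical.

```python
import re


def plan(query: str) -> list[str]:
    """Stand-in planner: one sub-query per document named in the query."""
    documents = re.findall(r"Document [A-Z]", query)
    if len(documents) < 2:
        # Nothing to decompose, so the query is its own plan.
        return [query]
    return [f"Summarize {document}" for document in documents]


def run_rag(sub_query: str) -> str:
    """Placeholder for executing one sub-query in a RAG pipeline."""
    return f"<answer to: {sub_query}>"


def answer(query: str) -> dict[str, str]:
    """Execute every sub-query and collect the results by sub-query."""
    return {sub_query: run_rag(sub_query) for sub_query in plan(query)}
```

Here `plan("Compare Document A and Document B")` yields one sub-query per document, so each document gets its own retrieval pass instead of competing for the same top K slots.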
Tools for Agents
LLMs can be given various tools, such as calling APIs and inferring the parameters those APIs require. RAG is now one of the tools that LLMs may use.
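One way to picture tools with inferred parameters is a small registry plus a dispatcher. This sketch makes several assumptions: the `{"tool": ..., "arguments": ...}` call shape is an illustrative format rather than any specific library's, `get_holidays` is a hypothetical API wrapper that returns canned data, and the LLM that would produce the tool call is not shown.

```python
import inspect


def get_holidays(region: str, year: int) -> list[str]:
    """Hypothetical API wrapper; returns canned data instead of calling a service."""
    return [f"{region}-{year}-holiday-1", f"{region}-{year}-holiday-2"]


def rag_search(query: str) -> str:
    """RAG exposed as just another tool (placeholder result)."""
    return f"<top fragments for: {query}>"


TOOLS = {fn.__name__: fn for fn in (get_holidays, rag_search)}


def describe_tools() -> dict[str, list[str]]:
    """Parameter names per tool: what an LLM would be shown to infer arguments."""
    return {name: list(inspect.signature(fn).parameters) for name, fn in TOOLS.items()}


def execute(tool_call: dict):
    """Run a tool call of the assumed shape {"tool": name, "arguments": {...}}."""
    fn = TOOLS[tool_call["tool"]]
    # bind() raises TypeError if the arguments do not match the signature.
    inspect.signature(fn).bind(**tool_call["arguments"])
    return fn(**tool_call["arguments"])
```

Checking the arguments against the function signature before the call gives a clear error when the inferred parameters do not match what the tool expects.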
Summary
When presented with complex problems, RAG has limitations. Some use cases, such as summarization and comparison, cannot be solved by RAG alone. Agentic RAG can help overcome these limitations. It treats RAG as a tool that can be used for semantic search. Equipped with routing, query planning, and tools, agents can go beyond traditional RAG applications.