In the rapidly evolving field of AI, the ability to provide accurate, context-aware responses to user queries is a game changer. Retrieval-Augmented Generation (RAG) is a powerful paradigm that combines the retrieval of relevant information from external sources with the generative capabilities of large language models (LLMs). However, as queries become increasingly complex and diverse, static RAG setups may not always be sufficient. This is where Agentic RAG comes into play.
Agentic RAG introduces an intelligent modular framework where specialized agents work together to dynamically analyze, route, and respond to user queries. Each agent has a specific role—whether routing questions, retrieving information from vector stores, conducting web searches, or generating responses using LLMs. This agent-based design not only enhances flexibility but also improves the efficiency and accuracy of the RAG process.
Required Installations
To set up and run the Agentic RAG framework, you need to install several Python libraries that form the foundation of this implementation. Here are the required installations and their purposes:
1. Install CrewAI
CrewAI provides the infrastructure to create agents, tasks, and workflows, enabling the seamless construction of modular and intelligent agent-based systems.
pip install crewai
2. Install LangChain OpenAI
LangChain provides tools for working with LLMs, allowing tasks and queries to be chained efficiently. The `langchain_openai` package is needed for ChatGPT integration, and `langchain_community` provides the `GoogleSerperAPIWrapper` used for web search.
pip install langchain_openai
pip install langchain_community
Verify API Keys
Ensure you have configured the necessary API keys:
- OpenAI API Key for the LLM.
- Serper API Key[1] for Google search-based queries.
OPENAI_API_KEY=<your_openai_api_key>
SERPER_API_KEY=<your_serper_api_key>
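One common way to wire these keys in (shown as a sketch; loading them from a `.env` file or a secrets manager works just as well) is to set them as environment variables before constructing the tools. The placeholder values below are for illustration only:

```python
import os

# Placeholders for illustration; substitute your real keys.
os.environ["OPENAI_API_KEY"] = "<your_openai_api_key>"
os.environ["SERPER_API_KEY"] = "<your_serper_api_key>"

# Downstream libraries (langchain_openai, GoogleSerperAPIWrapper)
# read these variables from the environment automatically.
```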
Import Necessary Libraries:
import os

from crewai import Agent, Crew, Task
from crewai.tools import BaseTool
from crewai_tools import PDFSearchTool
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_openai import ChatOpenAI
from pydantic import Field
Define LLM:
llm = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)
Define Agents:
Router_Agent = Agent(
    role='Router',
    goal='Route user questions to vector store or web search',
    backstory=(
        "You are an expert at routing user questions to vector stores or web searches. "
        "Use the vector store for questions about transformers or differential transformers. "
        "For recent news or trending topics, use web search. "
        "For general questions, use generation."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
)
Retriever_Agent = Agent(
    role="Retriever",
    goal="Answer questions using information retrieved from the vector store",
    backstory=(
        "You are an assistant for Q&A tasks. "
        "Use information retrieved from the context to answer questions. "
        "You must provide clear and concise answers."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
)
`Router_Agent`:
- Role: Determine the best tool for handling user queries.
- Logic: For domain-specific queries (e.g., "transformer" or "differential transformer"), route to the vector store. For recent topics or news, route to web search. For general questions, use generation.
- Details: Does not delegate tasks and provides clear routing based on queries.

`Retriever_Agent`:
- Role: Retrieve and provide answers based on routing decisions.
- Logic: Use vector store, web search, or generation tools based on query type.
- Details: Focus on providing clear, concise answers without additional delegation.
These two agents work together to simplify the RAG process by efficiently analyzing and handling queries.
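The routing behavior described in the backstory can be sketched as plain Python. This is a hypothetical, keyword-based stand-in for the decision the LLM router actually makes; the keyword lists are assumptions for illustration:

```python
def route(question: str) -> str:
    """Keyword-based sketch of the Router_Agent's decision logic."""
    q = question.lower()
    # Domain-specific terms covered by the PDF/vector store
    if "transformer" in q:
        return "vectorstore"
    # Time-sensitive phrasing suggests a live web search
    if any(word in q for word in ("latest", "news", "today", "weather")):
        return "websearch"
    # Everything else falls back to plain LLM generation
    return "generate"
```

For example, `route("What is differential transformer?")` routes to the vector store, while `route("What is AI?")` falls through to generation.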
Define Tools:
class SearchTool(BaseTool):
    name: str = "Search"
    description: str = (
        "For search-based queries. Use this tool to find the latest "
        "information about markets, companies, and trends."
    )
    search: GoogleSerperAPIWrapper = Field(default_factory=GoogleSerperAPIWrapper)

    def _run(self, query: str) -> str:
        """Execute the search query and return the results."""
        try:
            return self.search.run(query)
        except Exception as e:
            return f"Error executing search: {str(e)}"

class GenerationTool(BaseTool):
    name: str = "Generation_tool"
    description: str = (
        "For general knowledge-based queries. Use this tool to answer "
        "questions from the model's own knowledge."
    )

    def _run(self, query: str) -> str:
        """Answer the query directly with the LLM."""
        llm = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)
        return llm.invoke(query).content

generation_tool = GenerationTool()
web_search_tool = SearchTool()
1. SearchTool
Purpose: Handle search-based queries to retrieve current information (e.g., market trends, company details, or general online information).
Key Components:
- Name: `"Search"`
- Description: Highlights its use for search-related queries.
- Core Mechanism: Executes queries using `GoogleSerperAPIWrapper`.
- Error Handling: Captures and returns error messages when queries fail.
Usage: Suitable for real-time dynamic searches that require the latest information.
2. GenerationTool
Purpose: Handle general knowledge-based queries using LLM.
Key Components:
- Name: `"Generation_tool"`
- Description: Used to generate responses based on pre-trained knowledge.
- Core Mechanism: Instantiates a `ChatOpenAI` object (configured with `gpt-4o-mini` and `temperature=0` for deterministic output) and executes queries through `llm.invoke(query)`.
Usage: Best suited for queries that do not rely on external data but rather on reasoning or static knowledge.
- `web_search_tool = SearchTool()`: creates an instance of `SearchTool` for real-time queries.
- `generation_tool = GenerationTool()`: creates an instance of `GenerationTool` for generation tasks.
These tools seamlessly integrate into the RAG framework, allowing agents to dynamically route and respond to queries based on the nature of the information required.
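The name/description/`_run` pattern these tools follow can be illustrated without CrewAI at all. The `Tool` base class and `EchoTool` below are hypothetical stand-ins, for illustration only:

```python
class Tool:
    """Minimal stand-in for a tool base class: a name, a description,
    and a _run method that does the actual work."""
    name: str = "tool"
    description: str = "Base tool."

    def run(self, query: str) -> str:
        # Public entry point; wraps _run with error handling,
        # mirroring SearchTool's try/except above.
        try:
            return self._run(query)
        except Exception as e:
            return f"Error executing {self.name}: {e}"

    def _run(self, query: str) -> str:
        raise NotImplementedError

class EchoTool(Tool):
    name = "Echo"
    description = "Returns the query unchanged (placeholder for real work)."

    def _run(self, query: str) -> str:
        return query
```

Subclasses only override `_run`; error handling stays in one place in the base class, which is the same division of labor CrewAI's `BaseTool` enforces.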
3. PDF Search Tool:
PDF file link: https://arxiv.org/pdf/2410.05258 [2]; you can use any PDF file as needed.
pdf_search_tool = PDFSearchTool(
    pdf="differential transformer.pdf",
)
This code snippet initializes a `PDFSearchTool` to search within a specific PDF file. Here's a brief overview:

`PDFSearchTool` Overview
- Purpose: Allows querying and retrieving information from the provided PDF file.

Initialization:
- The tool is instantiated with the path of the PDF file, in this case `"differential transformer.pdf"`.
- This means queries related to the content of that PDF will be routed here.

How It Works
- When integrated into the framework (e.g., within `retriever_task`):
  - If a query is determined to require a vector store search (based on keywords like "transformer" or "differential transformer"), the `PDFSearchTool` will be used.
  - The tool parses and searches the content of the specified PDF to provide relevant information.
Define Agent Tasks:
router_task = Task(
    description=(
        "Analyze the keywords in the question {question}. "
        "Decide based on the keywords whether it is suitable for vector store search, web search, or generation. "
        "If suitable for vector store search, return the word 'vectorstore'. "
        "If suitable for web search, return the word 'websearch'. "
        "If suitable for generation, return the word 'generate'. "
        "Do not provide any other preamble or explanation."
    ),
    expected_output=(
        "Give the choice 'websearch', 'vectorstore', or 'generate' based on the question. "
        "Do not provide any other preamble or explanation."
    ),
    agent=Router_Agent,
)

retriever_task = Task(
    description=(
        "Use the appropriate tool to extract information for the question {question} based on the response from the routing task. "
        "If the output of the routing task is 'websearch', use web_search_tool to retrieve information from the web. "
        "If the output of the routing task is 'vectorstore', use pdf_search_tool to retrieve information from the vector store. "
        "Otherwise, if the output of the routing task is 'generate', generate output based on your own knowledge."
    ),
    expected_output=(
        "You should analyze the output of 'router_task'. "
        "If the response is 'websearch', use web_search_tool to retrieve information from the web. "
        "If the response is 'vectorstore', use pdf_search_tool to retrieve information from the vector store. "
        "If the response is 'generate', use generation_tool. "
        "Otherwise, if you don't know the answer, say 'I don't know'. "
        "Return clear and concise text as a response."
    ),
    agent=Retriever_Agent,
    context=[router_task],
    tools=[pdf_search_tool, web_search_tool, generation_tool],
)
1. router_task
Purpose: Determine how to route based on the content of user queries.
Description Logic:
- Analyze the keywords in the query (`{question}`).
- Determine whether the query should:
  - Use the vector store (if related to transformers or technical terms like "differential transformer").
  - Perform a web search (if the question involves recent topics, news, or dynamic content).
  - Use generation (for general, knowledge-based queries).
- Return a single word (`'vectorstore'`, `'websearch'`, or `'generate'`) as the routing decision.
- Agent: `Router_Agent`.
2. retriever_task
Purpose: Execute the appropriate tools or actions based on the output of `router_task`.
Description Logic:
- Read the routing decision from `router_task`:
  - `'websearch'`: Use `web_search_tool` to retrieve information from the web.
  - `'vectorstore'`: Use `pdf_search_tool` (PDF search or other vector-based retrieval) for domain-specific queries.
  - `'generate'`: Use `generation_tool` to generate responses leveraging LLM capabilities.
- If none of the above apply, output "I don't know".
- Ensure responses are concise and context-relevant.
- Agent: `Retriever_Agent`.
- Tools: Combines `pdf_search_tool`, `web_search_tool`, and `generation_tool`.
These tasks work in conjunction with `Router_Agent` and `Retriever_Agent` to efficiently and intelligently handle a wide range of queries.
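The dispatch that `retriever_task` describes amounts to a simple mapping from routing decision to tool. A sketch, with stand-in functions in place of the real tools (the stand-ins are assumptions for illustration):

```python
# Stand-in callables for the real tools, for illustration only.
def pdf_search(q):  return f"[vectorstore] {q}"
def web_search(q):  return f"[websearch] {q}"
def generate(q):    return f"[generate] {q}"

DISPATCH = {
    "vectorstore": pdf_search,
    "websearch": web_search,
    "generate": generate,
}

def retrieve(decision: str, question: str) -> str:
    """Run the tool selected by the router; fall back to 'I don't know'."""
    tool = DISPATCH.get(decision)
    return tool(question) if tool else "I don't know"
```

In the actual framework this dispatch is performed by `Retriever_Agent` reading the output of `router_task`, not by a hard-coded dictionary, but the control flow is the same.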
Define Crew:
rag_crew = Crew(
    agents=[Router_Agent, Retriever_Agent],
    tasks=[router_task, retriever_task],
    verbose=True,
)
`rag_crew` defines a Crew instance that coordinates interactions between agents and tasks in the Agentic RAG framework.
Coordination: `rag_crew` ensures seamless collaboration between agents and tasks.
Workflow:
- Queries are first processed by `Router_Agent` via `router_task`.
- The decision from `router_task` is then executed by `Retriever_Agent` via `retriever_task`.
The Crew serves as the central hub managing the entire RAG process, making it modular, efficient, and easy to scale to meet future capability demands.

Using the Crew:
result = rag_crew.kickoff(inputs={"question":"What is differential transformer?"})
Output:

result = rag_crew.kickoff(inputs={"question":"What is AI?"})

result = rag_crew.kickoff(inputs={"question":"What is weather in Bengaluru?"})

Citation Links
[1] Serper API Key: https://serper.dev/
[2] https://arxiv.org/pdf/2410.05258
