An introduction to using LangGraph to improve RAG.
1. Introduction
LangGraph is the newest member of the LangChain family (alongside LangServe and LangSmith), aimed at building generative AI applications with LLMs. Remember, these are all independent packages and must be installed separately via pip.
Before diving into LangGraph, it is important to understand two main concepts of LangChain.
1. Chain: A program built around an LLM to perform a specific task, such as an automatic SQL-writing chain or an NER-extraction chain. A chain is hardcoded for its task and cannot be reused for other tasks (not even general use cases); its steps are predefined and cannot be flexibly adjusted.
2. Agent: A more flexible version of a chain. An agent is typically an LLM with access to third-party tools (like Google Search or YouTube) that decides for itself how to resolve a given query.
When tackling real-world problems, we often want a solution that lies between chains and agents: not hardcoded like a chain, but not entirely LLM-driven like an agent either.
2. LangGraph
LangGraph is a tool built around LangChain for creating workflows as cyclic graphs. Consider the following example:
You want to build a RAG-based retrieval system over a knowledge base, with one twist: if the RAG output does not meet a specific quality requirement, the agent/chain should retrieve the data again, this time changing the prompt by itself, and repeat until the quality threshold is reached.
Using LangGraph can achieve this cyclic logic. This is just one example; using LangGraph can accomplish much more.
Note: You can think of it as introducing cyclic logic into the chain, making it a cyclic chain.
LangGraph is crucial for building multi-agent applications like Autogen or MetaGPT.
As the name suggests, LangGraph has all the components of a general graph, such as nodes, edges, etc., which we will understand through an example.
3. Improving RAG with LangGraph
In this example, we want the final output of the RAG system over our knowledge base to be no more than 30 characters. If the output length exceeds 30 characters, we introduce a loop that retries with a different prompt until the length is below 30 characters. This is deliberately basic logic for demonstration purposes; you could implement far more complex logic to improve RAG results.
The graph we will create is shown below.
The versions used here are langchain==0.0.349, openai==1.3.8, and langgraph==0.0.26.
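They can be installed with pip, for example:
pip install langchain==0.0.349 openai==1.3.8 langgraph==0.0.26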
3.1 First, let's import the required components and initialize the LLM. Here we use the OpenAI API, but you can use other LLMs as well.
from typing import Dict, TypedDict, Optional
from langgraph.graph import StateGraph, END
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings

# Initialize the LLM (replace 'your API' with your OpenAI API key)
llm = OpenAI(openai_api_key='your API')
A StateGraph is the core of every LangGraph workflow; it stores the state variables that are updated as the workflow executes. In this example, we have 5 state variables (question, classification, response, length, and greeting) whose values are updated during graph execution and shared across all nodes and edges.
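The state definition itself isn't shown in the excerpt; a minimal sketch consistent with the five variables used later (the exact field types are assumptions):

class GraphState(TypedDict):
    question: Optional[str]
    classification: Optional[str]
    response: Optional[str]
    length: Optional[int]
    greeting: Optional[str]

# The workflow object that the add_node calls below attach to
workflow = StateGraph(GraphState)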
3.2 Next, let's initialize a RAG retrieval chain from an existing vector database.
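The chain-construction code isn't reproduced here; below is a minimal sketch, assuming an existing Chroma store persisted in a local 'db' directory (the directory name and retriever settings are assumptions):

# Load the existing vector store and wrap it in a RetrievalQA chain
embeddings = OpenAIEmbeddings(openai_api_key='your API')
vectordb = Chroma(persist_directory='db', embedding_function=embeddings)
rag_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectordb.as_retriever())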
3.3 Now, let's define the node functions of the graph and add them to the workflow.

def classify(question):
    # Ask the LLM to label the input as greeting or not_greeting
    return llm("classify intent of given input as greeting or not_greeting. Output just the class. Input:{}".format(question)).strip()

def classify_input_node(state):
    question = state.get('question', '').strip()
    classification = classify(question)
    return {"classification": classification}

def handle_greeting_node(state):
    return {"greeting": "Hello! How can I help you today?"}

def handle_RAG(state):
    question = state.get('question', '').strip()
    prompt = question
    # On the first pass length is 0, so the plain question is used;
    # once a long answer comes back, a stricter count-only prompt is used
    if state.get("length") < 30:
        search_result = rag_chain.run(prompt)
    else:
        search_result = rag_chain.run(prompt + '. Return total count only.')
    return {"response": search_result, "length": len(search_result)}

def bye(state):
    return {"greeting": "The graph has finished"}
workflow.add_node("classify_input", classify_input_node)
workflow.add_node("handle_greeting", handle_greeting_node)
workflow.add_node("handle_RAG", handle_RAG)
workflow.add_node("bye", bye)
This requires some explanation.
Each node is a Python function that can:
① Read any state variable.
② Update any state variable. Here, the return value of each node updates one or more state variables.
Use state.get() to read any state variable.
The handle_RAG node implements the custom looping logic we want: if the output length is <30, use prompt A (the plain question); otherwise, use prompt B (the question plus an instruction to return only a count). On the first pass (before the RAG node has executed), we pass length=0 so that prompt A is used.
3.4 We add an entry point to the graph; its node function executes first, regardless of the input prompt.
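This call isn't shown in the excerpt; a minimal sketch, with classify_input as the entry node (consistent with the flow described below):

workflow.set_entry_point("classify_input")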
An edge between nodes A and B specifies that node B executes after node A. In this case, whenever handle_greeting or bye is reached, the graph should move to END (a special node that terminates the workflow).
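Again a sketch, assuming both terminal nodes are wired directly to END:

workflow.add_edge('handle_greeting', END)
workflow.add_edge('bye', END)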
3.5 Conditional edges choose between two nodes based on a condition (like an if-else). Two conditional edges are created (sketched after this list):
First conditional edge: after classify_input, choose handle_greeting or handle_RAG based on the output of the decide_next_node function.
Second conditional edge: after handle_RAG, choose handle_RAG or bye based on the check_RAG_length condition.
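The routing code itself isn't included in the excerpt; below is a minimal sketch consistent with the behavior described. The names decide_next_node and check_RAG_length come from the text, but their bodies are assumptions inferred from the flow.

def decide_next_node(state):
    # Route greetings to handle_greeting, everything else to the RAG node
    return "handle_greeting" if state.get('classification') == "greeting" else "handle_RAG"

def check_RAG_length(state):
    # Loop back into handle_RAG while the answer is still longer than 30 characters
    return "handle_RAG" if state.get("length") > 30 else "bye"

workflow.add_conditional_edges(
    "classify_input",
    decide_next_node,
    {"handle_greeting": "handle_greeting", "handle_RAG": "handle_RAG"}
)
workflow.add_conditional_edges(
    "handle_RAG",
    check_RAG_length,
    {"handle_RAG": "handle_RAG", "bye": "bye"}
)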
3.6 Compile the graph and invoke it with a prompt. Initially, keep the length variable set to 0.
app = workflow.compile()
app.invoke({'question':'Mehul developed which projects?','length':0})
# Output
{'question': 'Mehul developed which projects?',
'classification': 'not_greeting',
'response': ' 4',
'length': 2,
'greeting': 'The graph has finished'}
For the above prompt, the graphical flow is as follows:
classify_input: The classification will be not_greeting.
Due to the first conditional edge, move to handle_RAG.
Since length=0 (<30), use the first prompt and retrieve the answer (its total length will be greater than 30 characters).
Due to the second conditional edge, move again to handle_RAG.
Since length>30, use the second prompt.
Due to the second conditional edge, move to bye.
END.
For comparison, if LangGraph were not used:
rag_chain.run("Mehul developed which projects?")
# Output
"Mehul developed projects like ABC, XYZ, QWERTY. Not only these, he has major contribution in many other projects as well at OOO organization"
3.7 Now try another input.
app.invoke({'question':'Hello bot','length':0})
# Output
{'question': 'Hello bot',
'classification': 'greeting',
'response': None,
'length': 0,
'greeting': 'Hello! How can I help you today?'}
The flow here will be simpler.
classify_input: The classification will be greeting.
Due to the first conditional edge, move to handle_greeting.
END.
While the conditions applied here are quite simple, this framework can easily be extended with more complex conditions to improve your RAG results.