Building a Memory Chatbot Using LangGraph

This article shows how to build a chatbot with memory using LangChain and LangGraph. LangGraph is a Python library developed by the LangChain team for building stateful, complex AI workflows and multi-agent systems.

Its core goal is to address key pain points in traditional AI orchestration:

  • Inability to handle complex decision logic
  • Difficulty in achieving interaction between agents
  • Lack of contextual memory and state management

LangGraph solves these issues through a directed-graph approach.

1. Install the Libraries

The code below uses langgraph together with langchain-openai (which brings in langchain-core):

pip install langgraph langchain-openai

2. State Definition

class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]  # Message history sequence
    language: str  # Conversation language setting

The State class maintains state within the LangGraph graph. It is defined with TypedDict to describe the conversation state:

  • messages: stores the conversation history; the add_messages reducer in the Annotated type appends new messages instead of overwriting them (demonstrated in the sketch below)
  • language: specifies the response language, enabling multilingual switching
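
To see what the add_messages reducer does, here is a minimal, self-contained sketch; the message ids are arbitrary illustration values:

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

# add_messages merges the update into the existing history instead of replacing it
history = [HumanMessage("Hi, I'm qqq.", id="1")]
updated = add_messages(history, [AIMessage("Hello, qqq!", id="2")])
print([m.content for m in updated])  # ["Hi, I'm qqq.", "Hello, qqq!"]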

3. Core Component Initialization

class AIChat():
    def __init__(self):
        # Initialize language model
        self.model = ChatOpenAI(model="gpt-4o-mini")  # Lightweight GPT-4o-series model
        
        # Set prompt template
        self.prompt = ChatPromptTemplate.from_messages([
            (
                "system",  # System role prompt
                "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
            ),
            MessagesPlaceholder(variable_name="messages"),  # User message placeholder
        ])
        
        # Initialize memory system
        self.memory = MemorySaver()  # For saving conversation history
        
        # Configure message trimmer
        self.trimmer = trim_messages(
            max_tokens=65,  # Maximum token budget for the kept history
            strategy="last",  # Keep the most recent messages that fit
            token_counter=self.model,  # Use the model's own token counter
            include_system=True,  # Always retain the system message
            allow_partial=False,  # Never split a message to fit
            start_on="human",  # Trimmed history must start with a human message
        )

The initialization method sets up four key components:

  1. Language model: gpt-4o-mini serves as the core of the conversation
  2. Prompt template: defines the role and behavior of the AI assistant
  3. Memory system: manages conversation history across turns
  4. Message trimmer: keeps the context within the token budget so the conversation stays efficient (see the trimming sketch after this list)
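
To make the trimming behavior concrete, the following sketch runs without any model: token_counter=len counts each message as one "token", which is only a stand-in for the model-based counter used above.

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

demo_history = [
    SystemMessage("You are a helpful assistant."),
    HumanMessage("Hi, I'm qqq."),
    AIMessage("Hello, qqq!"),
    HumanMessage("Who am I?"),
]

# Stand-in counter: each message counts as one "token" so the example
# runs offline; the chatbot itself uses the model's real token counter.
trimmed = trim_messages(
    demo_history,
    max_tokens=3,
    strategy="last",
    token_counter=len,
    include_system=True,
    start_on="human",
)
print([m.content for m in trimmed])
# ['You are a helpful assistant.', 'Who am I?'] -- older turns were dropped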

4. Model Invocation Process

def call_model(self, state: State):
    # Combine prompt template and model
    chain = self.prompt | self.model
    
    # Trim message history
    trimmed_messages = self.trimmer.invoke(state["messages"])
    
    # Call model to get response
    response = chain.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    
    # Return a state update; add_messages appends the response to the history
    return {"messages": [response]}

This method implements the complete model invocation process:

  1. Create processing chain: connect the prompt template to the model
  2. Message trimming: ensure the history does not exceed the token limit
  3. Model invocation: pass the trimmed messages and the language setting
  4. Response handling: wrap the model response as a state update for return (a standalone check follows this list)
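
call_model can also be exercised on its own with a hand-built state. This is a hypothetical usage sketch: it assumes the AIChat class from this article and an OPENAI_API_KEY in the environment.

from langchain_core.messages import HumanMessage

bot = AIChat()  # assumes OPENAI_API_KEY is set
state = {"messages": [HumanMessage("Briefly, what is LangGraph?")], "language": "English"}
update = bot.call_model(state)  # returns a partial state update
print(update["messages"][0].content)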

5. Conversation Graph Construction

def create_graph(self):
    # Create state graph
    workflow = StateGraph(state_schema=State)
    
    # Add edges from start to model
    workflow.add_edge(START, "model")
    
    # Add model node
    workflow.add_node("model", self.call_model)
    
    # Compile graph and add memory functionality
    app = workflow.compile(checkpointer=self.memory)
    return app

This method constructs the workflow for the conversation system:

  1. Create graph: use the State class as the state schema
  2. Define flow: route execution from START to the model node
  3. Add functionality: register the call_model method as the processing node
  4. Complete construction: compile the graph with the checkpointer to integrate memory (see the invocation sketch after this list)
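
For a quick single-turn test of the compiled graph, this sketch uses invoke instead of streaming; the thread_id "demo-1" is an arbitrary example value, and an API key is assumed.

from langchain_core.messages import HumanMessage

app = AIChat().create_graph()
config = {"configurable": {"thread_id": "demo-1"}}  # hypothetical thread id
out = app.invoke({"messages": [HumanMessage("Hello!")], "language": "English"}, config)
print(out["messages"][-1].content)  # the assistant's reply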

6. Interaction Interface

def chat(self):
    # Configure conversation settings
    config = {"configurable": {"thread_id": "abc123"}}  # Conversation thread ID
    language = "Chinese"  # Set response language
    
    # Create an instance of the conversation graph
    app = self.create_graph()
    
    # Start interaction loop
    while True:
        # Get user input
        query = input("Please enter your question:")
        input_messages = [HumanMessage(query)]
        
        # Stream response processing
        for chunk, metadata in app.stream(
            {"messages": input_messages, "language": language},
            config,
            stream_mode="messages",
        ):
            # Only output the AI response part
            if isinstance(chunk, AIMessage):
                print(chunk.content, end="")
        print('\n')

The interaction interface implements the following functionalities:

  1. Configuration initialization: set the conversation thread ID and response language
  2. Graph instantiation: create the conversation processing graph
  3. Interaction loop: continuously receive user input
  4. Streaming output: display AI responses in real time (a thread-isolation sketch follows this list)
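
Because the checkpointer keys memory by thread_id, different IDs keep fully independent histories. A minimal sketch, assuming an API key; the thread ids "alice" and "bob" are made-up example values:

from langchain_core.messages import HumanMessage

app = AIChat().create_graph()

# Each thread_id owns its own checkpointed history
for thread in ("alice", "bob"):  # hypothetical thread ids
    cfg = {"configurable": {"thread_id": thread}}
    out = app.invoke(
        {"messages": [HumanMessage(f"Hi, I'm {thread}.")], "language": "English"},
        cfg,
    )
    print(thread, "->", out["messages"][-1].content)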

7. Chat Examples

Please enter your question: Who am I?
You are a questioner wanting to know more information. If you are willing, you can tell me more about yourself, and I will do my best to help you!

Please enter your question: I am qqq
Hello, QQQ! Nice to meet you. How can I assist you?

Please enter your question: Who am I?
You are QQQ! If you have any other questions or want to chat about something, feel free to let me know!

Please enter your question: Who am I?
You are the one asking the question; I don’t know who exactly you are. If you are willing, you can tell me more about yourself!

At first the model does not know who I am. After I introduce myself, the memory lets it answer correctly. Later, once the trimmer has dropped that earlier exchange to stay within the token limit, it can no longer answer. This behavior matches expectations.

8. Complete Code

from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, trim_messages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, StateGraph
from langgraph.graph.message import add_messages
from typing import Sequence
from typing_extensions import Annotated, TypedDict


# Define the state in the graph
class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str


class AIChat():
    def __init__(self):
        # Define model
        self.model = ChatOpenAI(model="gpt-4o-mini")

        # Define prompt template
        self.prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
                ),
                MessagesPlaceholder(variable_name="messages"),
            ]
        )

        # Define memory
        self.memory = MemorySaver()

        # Define trimmer
        self.trimmer = trim_messages(
            max_tokens=65,
            strategy="last",
            token_counter=self.model,
            include_system=True,
            allow_partial=False,
            start_on="human",
        )

    # Define function to call the model
    def call_model(self, state: State):
        chain = self.prompt | self.model
        trimmed_messages = self.trimmer.invoke(state["messages"])
        response = chain.invoke(
            {"messages": trimmed_messages, "language": state["language"]}
        )
        return {"messages": [response]}

    # Create graph
    def create_graph(self):
        # Define graph
        workflow = StateGraph(state_schema=State)
        # Define nodes and edges
        workflow.add_edge(START, "model")
        workflow.add_node("model", self.call_model)
        # Add memory
        app = workflow.compile(checkpointer=self.memory)
        return app

    # Chat
    def chat(self):
        # Each thread_id keeps its own history; changing it starts a fresh conversation without the previous memory.
        config = {"configurable": {"thread_id": "abc123"}}
        language = "Chinese"

        app = self.create_graph()

        while True:
            query = input("Please enter your question:")
            input_messages = [HumanMessage(query)]
            for chunk, metadata in app.stream(
                {"messages": input_messages, "language": language},
                config,
                stream_mode="messages",
            ):
                if isinstance(chunk, AIMessage):  # Filter to just model responses
                    print(chunk.content, end="")
            print('\n')


if __name__ == "__main__":
    AIChat().chat()

