This article shows how to build a chatbot with memory using LangChain and LangGraph. LangGraph is a Python library developed by the LangChain team specifically for creating complex, stateful AI workflows and multi-agent systems.
Its core goal is to address key pain points in traditional AI orchestration:
• Inability to handle complex decision logic
• Difficulty in coordinating interactions between multiple agents
• Lack of contextual memory and state management
LangGraph addresses these issues with a directed-graph approach: each step of a workflow becomes a node, and edges define how state flows between nodes.
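To make the directed-graph idea concrete, here is a minimal sketch of a LangGraph workflow that is unrelated to the chatbot: a single node that increments a counter. The state schema, node name, and function are made up purely for illustration.

from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END

class CounterState(TypedDict):
    count: int

def increment(state: CounterState):
    # A node receives the current state and returns a partial update
    return {"count": state["count"] + 1}

graph = StateGraph(CounterState)
graph.add_node("increment", increment)
graph.add_edge(START, "increment")
graph.add_edge("increment", END)
app = graph.compile()

print(app.invoke({"count": 0}))  # {'count': 1}

The chatbot below follows exactly the same pattern, just with a message history as the state and a language model call as the node.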
1. Install the Library
pip install langgraph
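The code in this article also imports from langchain-core and langchain-openai, so depending on what is already in your environment you may need to install those packages too, and ChatOpenAI expects an OpenAI API key in the OPENAI_API_KEY environment variable:
pip install langchain-core langchain-openai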
2. State Definition
class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]  # Message history sequence
    language: str  # Conversation language setting
The State class, defined with TypedDict, describes the conversation state that flows through the LangGraph graph:
• messages: stores the conversation history; the add_messages annotation makes updates append to the history rather than overwrite it (a short sketch of this follows the list)
• language: specifies the response language, supporting multilingual switching
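The add_messages reducer is what makes the messages field append-only: when a node returns new messages, they are added to the existing history instead of replacing it. A minimal sketch of that behavior (the message contents here are made up):

from langchain_core.messages import HumanMessage, AIMessage
from langgraph.graph.message import add_messages

history = [HumanMessage(content="I am qqq")]
update = [AIMessage(content="Hello, qqq!")]

merged = add_messages(history, update)
print([m.content for m in merged])  # ['I am qqq', 'Hello, qqq!']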
3. Core Component Initialization
class AIChat():
    def __init__(self):
        # Initialize the language model
        self.model = ChatOpenAI(model="gpt-4o-mini")  # GPT-4o mini model
        # Set up the prompt template
        self.prompt = ChatPromptTemplate.from_messages([
            (
                "system",  # System role prompt
                "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
            ),
            MessagesPlaceholder(variable_name="messages"),  # Placeholder for the message history
        ])
        # Initialize the memory system
        self.memory = MemorySaver()  # Saves conversation history per thread
        # Configure the message trimmer
        self.trimmer = trim_messages(
            max_tokens=65,  # Maximum token limit
            strategy="last",  # Keep the most recent messages
            token_counter=self.model,  # Use the model's token counter
            include_system=True,  # Always keep the system message
            allow_partial=False,  # No partial messages allowed
            start_on="human",  # The kept history must start with a human message
        )
The initialization method sets up four key components:
1. Language model: gpt-4o-mini serves as the core of the conversation
2. Prompt template: defines the role and behavior of the AI assistant
3. Memory system: manages conversation history across turns
4. Message trimmer: prevents the context from growing too long, keeping the conversation within token limits (see the sketch after this list)
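To get a feel for what the trimmer does, the sketch below applies trim_messages directly to a short made-up history and counts one token per message (token_counter=len is for illustration only; the chatbot itself counts real model tokens, so the numbers differ):

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, trim_messages

demo_history = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="I am qqq"),
    AIMessage(content="Hello, qqq!"),
    HumanMessage(content="Who am I?"),
]

trimmed = trim_messages(
    demo_history,
    max_tokens=3,        # budget of 3 "tokens" (here, 3 messages)
    strategy="last",     # keep the most recent messages
    token_counter=len,   # 1 token per message, illustration only
    include_system=True, # always keep the system message
    allow_partial=False,
    start_on="human",    # the kept history must start with a human message
)
print([m.content for m in trimmed])  # older messages are dropped first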
4. Model Invocation Process
def call_model(self, state: State):
    # Combine the prompt template and the model into a chain
    chain = self.prompt | self.model
    # Trim the message history
    trimmed_messages = self.trimmer.invoke(state["messages"])
    # Call the model to get a response
    response = chain.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    # Return the model response
    return {"messages": [response]}
This method implements the complete model invocation process:
1. Create the processing chain: connect the prompt template to the model
2. Trim the messages: make sure the history stays within the token limit
3. Invoke the model: pass in the trimmed messages and the language setting
4. Handle the response: wrap the model response in a dictionary so it can be merged back into the state (a usage sketch follows this list)
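Because the messages field uses the add_messages reducer, the returned dictionary only needs to contain the new response; LangGraph appends it to the stored history. A rough single-call sketch using the AIChat class above (requires a valid OPENAI_API_KEY; the reply text will vary):

from langchain_core.messages import HumanMessage

bot = AIChat()
state = {"messages": [HumanMessage(content="Hi, I am qqq")], "language": "English"}
result = bot.call_model(state)
print(result["messages"][0].content)  # the assistant's reply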
5. Conversation Graph Construction
def create_graph(self):
    # Create the state graph
    workflow = StateGraph(state_schema=State)
    # Add an edge from the start to the model node
    workflow.add_edge(START, "model")
    # Add the model node
    workflow.add_node("model", self.call_model)
    # Compile the graph and attach the memory checkpointer
    app = workflow.compile(checkpointer=self.memory)
    return app
This method constructs the workflow for the conversation system:
1. Create the graph: use the State class as the state schema
2. Define the flow: an edge from START to the "model" node
3. Add functionality: register the call_model method as the processing node
4. Finish construction: compile the graph and attach the memory system (a single-turn sketch follows this list)
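Once compiled, the graph can also be invoked for a single turn; the thread_id in the config tells the checkpointer which conversation to load and update. A minimal sketch (the thread ID is made up, OPENAI_API_KEY required):

from langchain_core.messages import HumanMessage

bot = AIChat()
app = bot.create_graph()
config = {"configurable": {"thread_id": "demo-thread"}}

output = app.invoke(
    {"messages": [HumanMessage(content="I am qqq")], "language": "English"},
    config,
)
print(output["messages"][-1].content)  # the assistant's reply for this thread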
6. Interaction Interface
def chat(self):
    # Configure conversation settings
    config = {"configurable": {"thread_id": "abc123"}}  # Conversation thread ID
    language = "Chinese"  # Set the response language
    # Create an instance of the conversation graph
    app = self.create_graph()
    # Start the interaction loop
    while True:
        # Get user input
        query = input("Please enter your question:")
        input_messages = [HumanMessage(query)]
        # Stream the response
        for chunk, metadata in app.stream(
            {"messages": input_messages, "language": language},
            config,
            stream_mode="messages",
        ):
            # Only output the AI response part
            if isinstance(chunk, AIMessage):
                print(chunk.content, end="")
        print('\n')
The interaction interface implements the following functionality:
1. Configuration initialization: set the conversation thread ID and response language
2. Graph instantiation: create the conversation processing graph
3. Interaction loop: continuously receive user input
4. Streaming output: display the AI response in real time as it is generated (a note on thread IDs follows this list)
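Because memory is keyed by thread_id, different IDs give fully separate conversations. Continuing from the single-turn sketch above (the IDs are made up):

config_a = {"configurable": {"thread_id": "user-a"}}
config_b = {"configurable": {"thread_id": "user-b"}}

# Introduce yourself on thread "user-a"
app.invoke({"messages": [HumanMessage(content="I am qqq")], "language": "English"}, config_a)

# Thread "user-b" has no record of that introduction
reply = app.invoke({"messages": [HumanMessage(content="Who am I?")], "language": "English"}, config_b)
print(reply["messages"][-1].content)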
7. Chat Examples
Please enter your question: Who am I?
You are a questioner wanting to know more information. If you are willing, you can tell me more about yourself, and I will do my best to help you!
Please enter your question: I am qqq
Hello, QQQ! Nice to meet you. How can I assist you?
Please enter your question: Who am I?
You are QQQ! If you have any other questions or want to chat about something, feel free to let me know!
Please enter your question: Who am I?
You are the one asking the question; I don’t know who exactly you are. If you are willing, you can tell me more about yourself!
At first, the model does not know who I am. After I introduce myself, the memory lets it answer correctly. Later, because the trimmer drops older messages once the history exceeds the token limit, the model can no longer recall the introduction. This behavior matches expectations.
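To check what the checkpointer has actually stored for a thread, a compiled graph exposes get_state; note that the trimmer only limits what is sent to the model at call time, while the checkpointer keeps the full history. A small sketch, reusing the app and config from the single-turn sketch above:

snapshot = app.get_state(config)
for message in snapshot.values["messages"]:
    print(type(message).__name__, ":", message.content)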
8. Complete Code
from langchain_openai import ChatOpenAI
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, trim_messages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, StateGraph
from langgraph.graph.message import add_messages
from typing import Sequence
from typing_extensions import Annotated, TypedDict


# Define the state carried through the graph
class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str


class AIChat():
    def __init__(self):
        # Define the model
        self.model = ChatOpenAI(model="gpt-4o-mini")
        # Define the prompt template
        self.prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
                ),
                MessagesPlaceholder(variable_name="messages"),
            ]
        )
        # Define the memory checkpointer
        self.memory = MemorySaver()
        # Define the message trimmer
        self.trimmer = trim_messages(
            max_tokens=65,
            strategy="last",
            token_counter=self.model,
            include_system=True,
            allow_partial=False,
            start_on="human",
        )

    # Call the model with the trimmed history
    def call_model(self, state: State):
        chain = self.prompt | self.model
        trimmed_messages = self.trimmer.invoke(state["messages"])
        response = chain.invoke(
            {"messages": trimmed_messages, "language": state["language"]}
        )
        return {"messages": [response]}

    # Build the conversation graph
    def create_graph(self):
        # Define the graph
        workflow = StateGraph(state_schema=State)
        # Define nodes and edges
        workflow.add_edge(START, "model")
        workflow.add_node("model", self.call_model)
        # Compile with the memory checkpointer
        app = workflow.compile(checkpointer=self.memory)
        return app

    # Interactive chat loop
    def chat(self):
        # Changing the thread_id starts a new conversation, losing previous memory
        config = {"configurable": {"thread_id": "abc123"}}
        language = "Chinese"
        app = self.create_graph()
        while True:
            query = input("Please enter your question:")
            input_messages = [HumanMessage(query)]
            for chunk, metadata in app.stream(
                {"messages": input_messages, "language": language},
                config,
                stream_mode="messages",
            ):
                if isinstance(chunk, AIMessage):  # Filter to just model responses
                    print(chunk.content, end="")
            print('\n')


if __name__ == "__main__":
    AIChat().chat()
Recommended Reading
• FastAPI Introduction Series Collection
• Django Introduction Series Collection
• Flask Tutorial Series Collection
• tkinter Tutorial Series Collection
• Flet Tutorial Series Collection