Mastering LangGraph-Memory: A Comprehensive Guide
LangGraph makes it easy to manage conversation memory within graphs. This how-to guide demonstrates several strategies for doing so.
Managing Conversation History
One of the most common use cases for persistence is using it to track conversation history. It makes continuing conversations easier. However, as conversations get longer, this conversation history accumulates and takes up more and more of the context window.
This is often undesirable as it can lead to more expensive and longer calls to the LLM, and may introduce errors. To prevent this, you may need to manage the conversation history.
Note: This guide focuses on how to do this in LangGraph, where you can fully customize the behavior. If you want a more out-of-the-box solution, see the message-handling utilities LangChain provides (linked at the end of the Filtering Messages section below).
First, prepare a simple graph agent.
from typing import Literal
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import MessagesState, StateGraph, START, END
from langgraph.prebuilt import ToolNode

memory = MemorySaver()

@tool
def search(query: str):
    """Call to surf the web."""
    return "It's sunny in San Francisco, but you better look out if you're a Gemini 😈."

tools = [search]
tool_node = ToolNode(tools)
model = ChatAnthropic(model="claude-3-haiku-20240307")
bound_model = model.bind_tools(tools)

def should_continue(state: MessagesState):
    last_message = state["messages"][-1]
    if not last_message.tool_calls:
        return END
    return "action"

# Define the function that calls the model
def call_model(state: MessagesState):
    response = bound_model.invoke(state["messages"])
    return {"messages": response}

workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.add_node("action", tool_node)
workflow.add_edge(START, "agent")
workflow.add_conditional_edges(
    "agent",
    should_continue,
    ["action", END],
)
workflow.add_edge("action", "agent")
app = workflow.compile(checkpointer=memory)
Then call it.
from langchain_core.messages import HumanMessage
config = {"configurable": {"thread_id": "2"}}
input_message = HumanMessage(content="Hello, I am Kirito")
for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()
input_message = HumanMessage(content="What is my name?")
for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()
================================ Human Message =================================

Hello, I am Kirito

================================== Ai Message ==================================

Hello! Is Kirito a character name? From some game or anime? Nice to meet you, how can I assist you?

================================ Human Message =================================

What is my name?

================================== Ai Message ==================================

You just told me your name is "Kirito". Is there anything else you need to know or confirm?
Filtering Messages
To prevent the conversation history from exploding, the most straightforward approach is to filter the message list before passing it to the LLM.
This involves two parts: defining a function to filter messages, and then adding it to the graph. See the example below, where a very simple filter_messages function is defined and then used.
The code above remains mostly unchanged; the only modifications are adding the filter_messages helper and calling it inside call_model.
def filter_messages(messages: list):
    # This is a very simple helper function that only uses the last message
    return messages[-1:]

def call_model(state: MessagesState):
    # Note here: we filter messages before calling the model
    messages = filter_messages(state["messages"])
    response = bound_model.invoke(messages)
    return {"messages": response}
================================ Human Message =================================

Hello, I am Kirito

================================== Ai Message ==================================

Hello! Which work is Kirito from? If you have any questions or topics to discuss about him, feel free to share with me.

================================ Human Message =================================

What is my name?

================================== Ai Message ==================================

You are Qwen, an assistant developed by Alibaba Cloud. But what should I call you now? Please tell me your name!
As you can see, because we keep only the last message when filtering, the LLM loses the earlier context and no longer knows the name.
LangChain also provides out-of-the-box message filtering and trimming utilities, which you can try out:
https://python.langchain.com/docs/how_to/filter_messages/
https://python.langchain.com/docs/how_to/trim_messages/
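For example, trim_messages from langchain_core could serve as a drop-in replacement for our filter_messages helper. A minimal sketch (the token budget here is an arbitrary assumption; model is the chat model defined earlier):

from langchain_core.messages import trim_messages

def filter_messages(messages: list):
    # Keep as many of the most recent messages as fit in the token budget,
    # preserving a leading system message and starting the window on a human turn
    return trim_messages(
        messages,
        max_tokens=1024,        # illustrative budget, tune for your model
        strategy="last",
        token_counter=model,    # a chat model can act as the token counter
        include_system=True,
        start_on="human",
    )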
Deleting Messages
A common graph state is a list of messages, and usually you only append to it. Sometimes, however, you may want to delete messages, either by directly modifying the state or from within the graph. For this, you can use the RemoveMessage modifier.
The key idea is that each state key has a reducer, which specifies how updates are combined into the state. The default MessagesState has a messages key whose reducer accepts RemoveMessage modifiers and uses them to remove messages from the list.
So note: just because your graph state has a key holding a list of messages does not mean RemoveMessage will work. The key must also be annotated with a reducer that knows how to handle it.
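Concretely, the prebuilt MessagesState annotates its messages key with the add_messages reducer, which is what interprets RemoveMessage. If you define your own state, you need the same annotation; a minimal sketch:

from typing import Annotated
from typing_extensions import TypedDict
from langchain_core.messages import AnyMessage
from langgraph.graph.message import add_messages

class CustomState(TypedDict):
    # add_messages appends new messages and applies RemoveMessage deletions
    messages: Annotated[list[AnyMessage], add_messages]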
Note: Many models require the message list to follow certain rules. For example, some models require messages to start with a user message, while others require that all messages with tool calls are followed by a tool message. When deleting messages, you need to ensure you do not violate these rules.
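For example, naively keeping only the last N messages could leave the list starting with a tool result whose triggering tool call was deleted. A minimal guard might look like this (safe_tail is a hypothetical helper, not part of LangGraph):

from langchain_core.messages import HumanMessage

def safe_tail(messages: list, n: int = 3) -> list:
    # Start from the last n messages, then walk backwards until the window
    # begins with a human turn, so tool-call/tool-result pairs stay together
    start = max(len(messages) - n, 0)
    while start > 0 and not isinstance(messages[start], HumanMessage):
        start -= 1
    return messages[start:]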
The base code remains unchanged; we are still working with the graph defined above.
First, we will introduce how to manually delete messages. Let’s take a look at the current state of the thread:
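The messages can be fetched with get_state (the same call appears again after the deletion below):

messages = app.get_state(config).values["messages"]
print(messages)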
[HumanMessage(content='Hello, I am Kirito', additional_kwargs={}, response_metadata={}, id='7054cbf0-e714-4b4c-b065-6a9eb75b7a2d'),
 AIMessage(content='Hello! Kirito is a character from the light novel "Sword Art Online", right? Do you have any topics or questions you want to discuss about this work or character?', additional_kwargs={}, response_metadata={'model': 'qwen2.5:7b', 'created_at': '2025-01-07T03:40:50.799446Z', 'done': True, 'done_reason': 'stop', 'total_duration': 2063392667, 'load_duration': 24295500, 'prompt_eval_count': 155, 'prompt_eval_duration': 247000000, 'eval_count': 40, 'eval_duration': 1788000000, 'message': {'role': 'assistant', 'content': 'Hello! Kirito is a character from the light novel "Sword Art Online", right? Do you have any topics or questions you want to discuss about this work or character?', 'images': None, 'tool_calls': None}}, id='run-d68ff793-1a32-49b1-a769-5b6e59c94357-0', usage_metadata={'input_tokens': 155, 'output_tokens': 40, 'total_tokens': 195}),
 HumanMessage(content='What is my name?', additional_kwargs={}, response_metadata={}, id='0a6eef9b-ddde-4a4d-90d3-414ddeda2ca9'),
 AIMessage(content='You just mentioned your name is "Kirito". What would you like me to call you now?', additional_kwargs={}, response_metadata={'model': 'qwen2.5:7b', 'created_at': '2025-01-07T03:40:52.339773Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1438884083, 'load_duration': 27642042, 'prompt_eval_count': 207, 'prompt_eval_duration': 163000000, 'eval_count': 28, 'eval_duration': 1242000000, 'message': {'role': 'assistant', 'content': 'You just mentioned your name is "Kirito". What would you like me to call you now?', 'images': None, 'tool_calls': None}}, id='run-e066f578-da75-4ad2-805e-b6fcbf5a7e2c-0', usage_metadata={'input_tokens': 207, 'output_tokens': 28, 'total_tokens': 235})]
We can call update_state and pass in the ID of the first message. This will delete that message.
from langchain_core.messages import RemoveMessage
app.update_state(config, {"messages": RemoveMessage(id=messages[0].id)})
messages = app.get_state(config).values["messages"]
print(messages)
[AIMessage(content='Hello! Kirito is a character from the light novel "Sword Art Online", right? Do you have any topics or questions you want to discuss about this work or character?', additional_kwargs={}, response_metadata={'model': 'qwen2.5:7b', 'created_at': '2025-01-07T03:43:39.974483Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1445857708, 'load_duration': 27660333, 'prompt_eval_count': 155, 'prompt_eval_duration': 138000000, 'eval_count': 29, 'eval_duration': 1276000000, 'message': {'role': 'assistant', 'content': 'Hello! Kirito is a character from the light novel "Sword Art Online", right? Do you have any topics or questions you want to discuss about this work or character?', 'images': None, 'tool_calls': None}}, id='run-cbbaeb2b-9bcb-43a7-9e02-fa86aea86924-0', usage_metadata={'input_tokens': 155, 'output_tokens': 29, 'total_tokens': 184}),
 HumanMessage(content='What is my name?', additional_kwargs={}, response_metadata={}, id='d3f38c66-0c7e-4f8e-97ce-108cc57ea47a'),
 AIMessage(content='You just mentioned your name is "Kirito". If you have any other questions or need assistance, please let me know.', additional_kwargs={}, response_metadata={'model': 'qwen2.5:7b', 'created_at': '2025-01-07T03:43:41.18703Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1142763958, 'load_duration': 12822958, 'prompt_eval_count': 196, 'prompt_eval_duration': 161000000, 'eval_count': 22, 'eval_duration': 964000000, 'message': {'role': 'assistant', 'content': 'You just mentioned your name is "Kirito". If you have any other questions or need assistance, please let me know.', 'images': None, 'tool_calls': None}}, id='run-4b322f02-3689-45a8-92d8-11783e7d7869-0', usage_metadata={'input_tokens': 196, 'output_tokens': 22, 'total_tokens': 218})]
We can verify that the first message has been deleted.
Deleting Messages Programmatically
We can also delete messages programmatically from within the graph. Here, we modify the graph to delete all but the three most recent messages when the graph run ends.
from langchain_core.messages import RemoveMessage
from langgraph.graph import END

def delete_messages(state):
    messages = state["messages"]
    if len(messages) > 3:
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:-3]]}

# We will modify the logic to call delete_messages instead of immediately ending
def should_continue(state: MessagesState) -> Literal["action", "delete_messages"]:
    last_message = state["messages"][-1]
    if not last_message.tool_calls:
        return "delete_messages"
    return "action"

# Define a new graph
workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.add_node("action", tool_node)
# New defined node
workflow.add_node(delete_messages)

workflow.add_edge(START, "agent")
workflow.add_conditional_edges(
    "agent",
    should_continue,
)
workflow.add_edge("action", "agent")
# Here we added a new edge
workflow.add_edge("delete_messages", END)
app = workflow.compile(checkpointer=memory)
This is equivalent to running delete_messages just before the graph reaches END; you can of course add other delete nodes and trigger them at different points. The output is omitted here, but a quick check is shown below.
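One quick sanity check (reusing config and the HumanMessage import from earlier):

input_message = HumanMessage(content="Hello, I am Kirito")
for event in app.stream({"messages": [input_message]}, config, stream_mode="values"):
    event["messages"][-1].pretty_print()

# delete_messages keeps at most the last three messages in state after each run
print(len(app.get_state(config).values["messages"]))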
Building Conversation Summaries
As noted at the start of this guide, a growing conversation history makes LLM calls slower and more expensive, and can introduce errors. Another solution, slightly different from the filtering and deleting approaches above, is to create a summary of the conversation so far and use it alongside the most recent N messages.
This involves several steps:
  1. Check whether the conversation is too long (for example, by counting the messages or measuring their total length)
  2. If so, create a summary (which requires a prompt)
  3. Delete all but the last N messages
A large part of this is deleting old messages.
from typing import Literal
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import SystemMessage, RemoveMessage, HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import MessagesState, StateGraph, START, END

memory = MemorySaver()

# We will add a `summary` attribute (in addition to the `messages` key that MessagesState already has)
class State(MessagesState):
    summary: str

model = ChatAnthropic(model_name="claude-3-haiku-20240307")

# Define the logic to call the model
def call_model(state: State):
    # If a summary exists, we will add it as a system message
    summary = state.get("summary", "")
    if summary:
        system_message = f"Summary of conversation earlier: {summary}"
        messages = [SystemMessage(content=system_message)] + state["messages"]
    else:
        messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response]}

# Now we define the logic to determine whether to end or summarize the conversation
def should_continue(state: State) -> Literal["summarize_conversation", END]:
    """Return the next node to execute."""
    messages = state["messages"]
    # If there are more than six messages, we summarize the conversation
    if len(messages) > 6:
        return "summarize_conversation"
    # Otherwise, we end
    return END

def summarize_conversation(state: State):
    # First, we summarize the conversation
    summary = state.get("summary", "")
    if summary:
        # If a summary already exists, we use a different system prompt to summarize it
        summary_message = (
            f"This is summary of the conversation to date: {summary}\n\n"
            "Extend the summary by taking into account the new messages above:"
        )
    else:
        summary_message = "Create a summary of the conversation above:"
    messages = state["messages"] + [HumanMessage(content=summary_message)]
    response = model.invoke(messages)
    # We now need to delete the messages we no longer want to keep in the state
    # Here we delete all but the last two messages, but you can change this
    delete_messages = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"summary": response.content, "messages": delete_messages}

# Define a new graph
workflow = StateGraph(State)
# Define conversation node and summary node
workflow.add_node("conversation", call_model)
workflow.add_node(summarize_conversation)
# Set the entry point to the conversation
workflow.add_edge(START, "conversation")
# We now add a conditional edge
workflow.add_conditional_edges(
    # First, we define the starting node. We use `conversation`.
    # This means these are the edges taken after calling the `conversation` node.
    "conversation",
    # Next, we pass in the function that will determine the next call.
    should_continue,
)
# We now add a regular edge from `summarize_conversation` to END.
# This means after calling `summarize_conversation`, we go directly to END
workflow.add_edge("summarize_conversation", END)
app = workflow.compile(checkpointer=memory)
This code is relatively clear, and we will not run it here; you can try it out yourself!
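If you do, a loop like the following (the thread id is arbitrary) will trigger the summary node once more than six messages have accumulated:

config = {"configurable": {"thread_id": "4"}}
for text in [
    "Hello, I am Kirito",
    "What is my name?",
    "Tell me about Sword Art Online",
    "Who is the author?",
]:
    for event in app.stream(
        {"messages": [HumanMessage(content=text)]}, config, stream_mode="values"
    ):
        event["messages"][-1].pretty_print()

# After the fourth exchange there are eight messages (> 6), so a summary exists
print(app.get_state(config).values.get("summary"))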
Additionally, cross-thread storage was already covered in the earlier persistence section, so we will not repeat it here; in short, you pass a store when compiling the graph.
If you want semantic search over memories, you can pair the store with an embedding-backed index, again as described in the persistence section; a minimal sketch follows.
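For reference, here is one way to attach a store with semantic search (the embedding model, namespace, and keys are illustrative assumptions):

from langchain_openai import OpenAIEmbeddings
from langgraph.store.memory import InMemoryStore

# Indexing the store with an embedding model makes `search` semantic
store = InMemoryStore(
    index={"embed": OpenAIEmbeddings(model="text-embedding-3-small"), "dims": 1536}
)

# Memories live under a namespace (e.g., per user) and are retrieved by query
store.put(("users", "kirito"), "memory-1", {"text": "Kirito plays VR games"})
hits = store.search(("users", "kirito"), query="what does the user like?")

app = workflow.compile(checkpointer=memory, store=store)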
