Understanding the MemGPT Design of aZent Framework

Before getting into the main topic, a quick aside. Many friends have seen me coding in my spare time. Sometimes I wonder why I do it; honestly, my love of code has become something of an obsession.

That may sound like bragging. Anyway, everything has revolved around agents lately, and today is no exception: we will talk about agents again. When I learn a new concept or technology, I like to try a simple implementation of it myself. For me, that is a good way to learn.

Today’s subject is MemGPT, which I like to call thoughtful. Why thoughtful? There are two main reasons. First, the design goal of the MemGPT framework is to give the LLM memory: it pays attention to key information in the user’s conversations and remembers it, so the conversation feels smooth and continuous. Second, MemGPT keeps a heartbeat going throughout the exchange and keeps thinking actively, unlike traditional agents that remain passive.

The image above is drawn from the portion of the MemGPT source code that I have read, and I will explain it briefly here.

Initialization Process

Each conversation comes with personas, and information extracted from the dialogue is stored as memory. Where and how this information is stored is the part of MemGPT most worth studying: how to use limited resources to remember effectively. Memory is divided into two levels, and each level is stored in a different place. Interestingly, this design is inspired by main memory and virtual memory in computer operating systems.

First, we initialize the user and the persona by reading previously prepared text that describes each of them; after that the conversation can start. The interface is the connection between the user and the agent, currently just input and output in the terminal, while persistent refers to the medium set up to store the user’s memory.
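
The initialization step can be sketched roughly as follows. This is a minimal sketch of the idea, not MemGPT’s actual code: the names TerminalInterface, InMemoryPersistence, Agent, and init_agent are all made up for illustration, and the persona and user descriptions are assumed to live in two plain text files.

```python
from dataclasses import dataclass, field
from pathlib import Path


class TerminalInterface:
    """Hypothetical user <-> agent channel backed by terminal input and output."""

    def read_user(self) -> str:
        return input("user> ")

    def show_agent(self, text: str) -> None:
        print(f"agent> {text}")


@dataclass
class InMemoryPersistence:
    """Hypothetical storage medium; a real one might be a file or a database."""

    messages: list = field(default_factory=list)

    def save(self, role: str, text: str) -> None:
        self.messages.append((role, text))


@dataclass
class Agent:
    persona: str  # text describing the agent's persona
    human: str    # text describing the user
    interface: TerminalInterface
    persistence: InMemoryPersistence


def init_agent(persona_file: Path, human_file: Path) -> Agent:
    """Read the two prepared description files and wire up the agent."""
    persona = persona_file.read_text(encoding="utf-8").strip()
    human = human_file.read_text(encoding="utf-8").strip()
    return Agent(persona, human, TerminalInterface(), InMemoryPersistence())
```

Calling init_agent(Path("persona.txt"), Path("human.txt")) would return an agent that is ready to talk; the file names here are only placeholders.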

In the code, memory appears as several classes: coreMemory, DummyRecallMemory, and DummyArchivalMemory. They are layered from shallow to deep: coreMemory is the memory most closely tied to the current conversation, while DummyArchivalMemory is deep memory. None of these memories is static; they can be updated and retrieved during a conversation, which means the memory is editable.
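
To make the idea of layered, editable memory concrete, here is a small sketch. The class names follow the ones mentioned above, but the methods, the character limit, and the substring search are my own simplifying assumptions, not the real MemGPT implementation.

```python
class CoreMemory:
    """Shallow memory: small text blocks tied to the current conversation."""

    def __init__(self, persona: str, human: str, limit: int = 200):
        self.blocks = {"persona": persona, "human": human}
        self.limit = limit  # assumed per-block character budget

    def append(self, name: str, text: str) -> None:
        updated = (self.blocks[name] + "\n" + text).strip()
        if len(updated) > self.limit:
            raise ValueError(f"block '{name}' would exceed {self.limit} characters")
        self.blocks[name] = updated

    def replace(self, name: str, old: str, new: str) -> None:
        if old not in self.blocks[name]:
            raise ValueError(f"'{old}' not found in block '{name}'")
        self.blocks[name] = self.blocks[name].replace(old, new)


class DummyRecallMemory:
    """Middle layer: the conversation history, searched by plain substring."""

    def __init__(self):
        self.messages = []

    def insert(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def search(self, query: str) -> list:
        q = query.lower()
        return [m for m in self.messages if q in m[1].lower()]


class DummyArchivalMemory:
    """Deep layer: free-form notes kept for the long term."""

    def __init__(self):
        self.passages = []

    def insert(self, text: str) -> None:
        self.passages.append(text)

    def search(self, query: str) -> list:
        q = query.lower()
        return [p for p in self.passages if q in p.lower()]
```

In this sketch, the agent would call something like core.append("human", "Likes hiking") when it learns a new fact about the user, and fall back to the recall or archival search when the answer is no longer in core memory.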
