Hello everyone! This is a channel focused on AI agents~
Have you ever wondered how those AI “agents” that can autonomously perform tasks and call tools actually work? In 2024, we witnessed AI technology evolve from simple chatbots into more complex agents. However, as we dig into these agents, we find that their underlying technology stack is very different from the LLM stack we are familiar with.
Today, we will unveil the mystery of AI agent technology, outlining this rapidly evolving field so that you are no longer in the dark.
From LLM to Agent: A Profound Evolution
In 2022 and 2023, we witnessed an explosion of LLM frameworks and SDKs, such as LangChain and LlamaIndex. At the same time, using LLMs became more convenient, whether through API calls or self-hosting with tools like vLLM and Ollama.
However, by 2024, everyone’s focus began to shift towards more advanced AI “agents”. This concept has existed in the AI field for a long time, but in the era of ChatGPT, it has taken on new meaning: LLMs that can act autonomously, execute tasks, and interact with external tools.
This shift means we need a completely new technology stack to support the development of agents.
Agent Technology Stack: What Are the Core Differences?
Agents are not just large models that can chat; they are more like intelligent entities with a degree of autonomy. They need to manage their own state (such as conversation history and memory), call a variety of tools, and execute those tools safely. This makes the agent technology stack significantly different from the traditional LLM stack.
Let’s analyze the key components of the agent technology stack from the ground up:

1. Model Services: The Brain of AI

- Core: The LLM. This is the driving force of the AI agent.
- Service Method: Provided through inference engines, usually via paid APIs or self-hosted deployments (a minimal client sketch follows this list).
- Main Players:
  - Closed-source Models: OpenAI and Anthropic lead the way.
  - Open-source Models: Providers like Together.AI, Fireworks, and Groq are emerging, offering services built on models like Llama 3.
  - Local Deployment: vLLM has become a mainstream choice for production-grade GPU serving, while Ollama and LM Studio are favored by individual enthusiasts.
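To make this concrete, here is a minimal sketch of calling a model service. Both vLLM and Ollama expose an OpenAI-compatible endpoint, so the same client code covers hosted APIs and local deployments alike; the base_url, dummy key, and model name below are assumptions to swap for your own setup.

```python
# Minimal sketch: chat with a model behind an OpenAI-compatible endpoint.
# Works against vLLM's or Ollama's local server as well as hosted APIs.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # e.g. a local vLLM server (assumed port)
    api_key="not-needed-locally",         # the client requires a key; local servers ignore it
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model name
    messages=[{"role": "user", "content": "In one sentence, what is an AI agent?"}],
)
print(response.choices[0].message.content)
```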
2. Storage: The Foundation of Memory

- Core: Persistent state such as conversation history, memory, and external data.
- Key Technologies:
  - Vector Databases: Chroma, Weaviate, Pinecone, Qdrant, and Milvus store the agent’s “external memory”, letting it draw on far more data than fits in a context window (a minimal sketch follows this list).
  - Traditional Databases: Postgres now supports vector search as well, through the pgvector extension.
- Why Important? Agents are stateful and need to store and retrieve information over the long term.
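As an illustration, here is a tiny sketch of “external memory” built on Chroma; the other vector databases follow the same embed, store, retrieve pattern. The collection name and stored documents are invented for the example.

```python
# Minimal sketch: a vector store as the agent's external memory.
import chromadb

client = chromadb.Client()  # in-memory; use a persistent client in production
memory = client.create_collection("agent_memory")  # assumed collection name

# Store past observations; Chroma embeds the text with its default model.
memory.add(
    ids=["m1", "m2"],
    documents=[
        "The user prefers concise answers.",
        "The user is building a travel-booking agent.",
    ],
)

# Later, retrieve the memories most relevant to the current task.
results = memory.query(query_texts=["What do we know about the user?"], n_results=2)
print(results["documents"])
```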
3. Tools and Libraries: Expanding Capabilities

- Core: Tools (or “functions”) that enable agents to perform a wide range of tasks.
- Invocation Method: The LLM emits a structured output (e.g., a JSON object) that names the function to call and the arguments to pass it (a minimal sketch of this loop follows this list).
- Safe Execution: Sandboxes (like Modal and E2B) keep tool execution safe.
- Tool Ecosystem:
  - General Tool Libraries: Composio, etc.
  - Specialized Tools: Browserbase (web browsing), Exa (web search), etc.
- Why Important? Tools expand the capabilities of agents, allowing them to complete more complex tasks.
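Here is a framework-agnostic sketch of that loop: the LLM emits a JSON object naming a tool and its arguments, and the host program looks the tool up in a registry and runs it. The tool, its name, and the example LLM output are all invented for illustration; a real system would execute the call inside a sandbox such as Modal or E2B.

```python
# Minimal sketch: dispatching a tool call from the LLM's structured output.
import json

def get_weather(city: str) -> str:
    """A stand-in tool; a real one would call an external API."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}  # registry of tools the agent may call

# Pretend this JSON came back from the LLM as a structured output.
llm_output = '{"tool": "get_weather", "arguments": {"city": "Beijing"}}'

call = json.loads(llm_output)
tool = TOOLS.get(call["tool"])
if tool is None:
    raise ValueError(f"LLM requested an unknown tool: {call['tool']}")
result = tool(**call["arguments"])
print(result)  # the result is fed back into the LLM's next turn
```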
4. Agent Framework: The Command Center for Intelligent Orchestration

- Core: Orchestrating LLM calls and managing agent state.
- Key Features:
  - State Management: How to save and load agent state, such as conversation history and memory.
  - Context Window: How to “compile” state information into the LLM’s context window (a toy sketch follows this list).
  - Cross-Agent Communication: How to enable collaboration between multiple agents.
  - Memory Management: How to work around the LLM’s limited context window and manage long-term memory.
  - Open-source Model Support: How to make the most of open-source models within agents.
- Popular Frameworks: LlamaIndex, CrewAI, AutoGen, Letta, LangGraph, etc.
- Why Important? The framework determines how an agent runs and how efficiently it does so.
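To show what “compiling state into the context window” means, here is a toy sketch of the loop every framework implements in some form. The MiniAgent class, the message cap, and the llm callable are all invented for illustration, not any particular framework’s API.

```python
# Toy sketch: keep state, compile it into a context, call the LLM.
MAX_CONTEXT_MESSAGES = 20  # crude stand-in for a real token budget

class MiniAgent:
    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.history: list[dict] = []  # persistent agent state

    def compile_context(self) -> list[dict]:
        # Keep the system prompt plus only the most recent turns so the
        # compiled context fits inside the model's window.
        recent = self.history[-MAX_CONTEXT_MESSAGES:]
        return [{"role": "system", "content": self.system_prompt}, *recent]

    def step(self, user_message: str, llm) -> str:
        # llm is any callable mapping a message list to a reply string,
        # e.g. a wrapper around the model services from section 1.
        self.history.append({"role": "user", "content": user_message})
        reply = llm(self.compile_context())
        self.history.append({"role": "assistant", "content": reply})
        return reply
```

Real frameworks add smarter trimming (summarization, or retrieval from the vector store in section 2), but the shape of the loop is the same.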
5. Agent Hosting and Services: Trends for the Future

- Core: Deploying agents as services, accessible via APIs (a minimal sketch follows this list).
- Current Pain Points: State management, safe tool execution, and scalable deployment remain hard.
- Future Outlook: Standardized agent APIs will emerge, making agents far easier to deploy.
- Why Important? This is what takes agents from prototypes to real applications.
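As a taste of what agent hosting looks like, here is a minimal sketch that wraps an agent behind an HTTP API with FastAPI, so clients address it by ID instead of importing it as a library. The route shape, session store, and stub agent step are assumptions, not a standardized Agents API.

```python
# Minimal sketch: an agent exposed as a service (run with uvicorn).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
SESSIONS: dict[str, list[dict]] = {}  # per-agent history; a real service would persist this

class Message(BaseModel):
    content: str

def run_agent_step(history: list[dict]) -> str:
    """Stub for a real agent step (LLM call, tool use, memory update)."""
    return f"(stub) I have seen {len(history)} messages in this session."

@app.post("/agents/{agent_id}/messages")
def send_message(agent_id: str, message: Message):
    history = SESSIONS.setdefault(agent_id, [])
    history.append({"role": "user", "content": message.content})
    reply = run_agent_step(history)
    history.append({"role": "assistant", "content": reply})
    return {"reply": reply}
```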
The Future Is Here: The AI Agent Technology Stack Is Rapidly Evolving
The overall agent technology stack is still very young, but it is developing at an astonishing speed. Future agents will be smarter and more autonomous, playing important roles across various industries.
Alright, that’s all I wanted to share today. If you are interested in building AI agents, don’t forget to like and follow~