Best Practices for AI Agents in 2024: Anthropic Insights

The previous article introduced the basic concepts, mainstream architectures, and application frameworks of AI Agents. In this article, we begin our review of AI Agents in 2024 with a well-known set of best practices from Anthropic: Building Effective Agents.
For the original text, see [1]. This article summarizes the main content.
Over the past year, Anthropic has worked with dozens of teams across different industries to build Agents based on large language models (LLMs), and shared the lessons learned in that article. Their key takeaway: the most successful implementations do not rely on complex frameworks or libraries, but on simple, composable patterns.
What Are Agents?
Different clients define Agents differently: some describe fully autonomous systems that operate independently for extended periods, using various tools to complete complex tasks, while others mean systems that strictly follow predefined workflows. Anthropic draws an important architectural distinction between workflows and Agents:
  • Workflows: Systems that coordinate LLMs and tools through predefined code paths. In other words, predefined processes are established, and then LLMs and a series of tools are used to implement them.
  • Agents: Systems that allow LLMs to dynamically guide their own processes and tool usage, controlling how tasks are completed. The core distinction is that tasks are dynamically decomposed and arranged by the LLM.
When to Use Agents
Always look for the simplest solution first and add complexity only when necessary. This may mean not building an Agent system at all: Agent systems typically trade higher latency and cost for better task performance.
When more complexity is warranted, workflows provide predictability and consistency for clearly defined tasks, while Agents are the better choice when flexibility and model-driven decisions are required. For many applications, however, optimizing a single LLM call with knowledge retrieval and in-context examples is usually sufficient.
When and How to Use Frameworks
Many frameworks make Agent systems easier to build. Examples include LangGraph from LangChain, Amazon Bedrock's AI Agent framework, and two GUI tools for building workflows, Rivet and Vellum.
Frameworks can simplify underlying implementations, such as LLM calls and tool invocations, but they create more layers of abstraction, making debugging more difficult and turning originally simple problems into complex ones.
It is recommended to start with direct LLM API calls, which may require only a few lines of code. If you do use a framework, make sure you understand the underlying code.
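For example, a direct call through the Anthropic Python SDK takes only a few lines. The sketch below assumes the `anthropic` package is installed and an API key is set in the environment; the model name and prompt are illustrative:

```python
# A minimal direct LLM API call -- a sketch assuming the Anthropic Python SDK
# (pip install anthropic) and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY automatically

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this ticket in one sentence: ..."}],
)
print(response.content[0].text)
```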
Building Blocks, Workflows, and Agents
This section explores common patterns for Agent systems, from the basic building block through progressively more complex workflows, and finally to full Agent systems.
Building Block: The Augmented LLM
The basic building block of an Agent system is an LLM augmented with knowledge retrieval, tool use, and memory it can read from and write to.
It is recommended to focus on two aspects of implementation: 1) customizing features based on your use case, and 2) ensuring an easy-to-use, well-documented interface.
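A minimal sketch of this building block is shown below. The `search_docs` and `llm_call` helpers are hypothetical stubs standing in for whatever retrieval store and LLM API you actually use:

```python
# Augmented LLM: a single call enriched with retrieval, tool use, and memory.
# search_docs and llm_call are hypothetical stubs -- swap in your own vector
# store and model provider's API in a real system.

def search_docs(query: str, top_k: int = 3) -> str:
    """Stub retrieval step; a real version would query a vector store."""
    return f"(top {top_k} passages relevant to: {query})"

def llm_call(prompt: str) -> str:
    """Stub LLM call; a real version would call your model provider's API."""
    return "(model answer based on the prompt)"

def augmented_llm(user_query: str, memory: dict) -> str:
    context = search_docs(user_query)            # 1) knowledge retrieval
    notes = memory.get("notes", "")              # 2) read memory
    prompt = f"Context:\n{context}\n\nNotes:\n{notes}\n\nQuestion: {user_query}"
    answer = llm_call(prompt)                    # could also expose tool schemas here
    memory["notes"] = notes + f"\nQ: {user_query} -> A: {answer}"  # 3) write memory
    return answer
```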
Workflow: Prompt Chaining
Prompt chaining decomposes a task into a sequence of sub-steps, where each step consumes the output of the previous one, and checks can be added between steps. The goal of this pattern is to trade some latency for higher accuracy by making each individual LLM call an easier task.
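A sketch of a prompt chain with one check ("gate") between steps, using the marketing-copy-then-translate example from the original article; `llm_call` is a hypothetical stub for your LLM API:

```python
# Prompt chaining: each step consumes the previous step's output, with a
# programmatic gate between steps. llm_call is a hypothetical stub.

def llm_call(prompt: str) -> str:
    return "(model output)"  # stub; replace with a real API call

def marketing_copy_in_french(product_notes: str) -> str:
    # Step 1: draft the marketing copy.
    draft = llm_call(f"Write marketing copy for this product:\n{product_notes}")

    # Gate: a cheap programmatic check between steps.
    if len(draft.split()) > 200:
        draft = llm_call(f"Shorten this copy to under 200 words:\n{draft}")

    # Step 2: translate the approved draft.
    return llm_call(f"Translate this marketing copy into French:\n{draft}")
```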
Workflow: Routing
Routing classifies an input and directs it to a specialized follow-up task, each handled with its own prompt. It suits complex tasks with clearly distinguishable categories: handling each category separately lets prompts be specialized, which yields higher accuracy and can improve performance.
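A sketch of routing for customer-support queries; the category set is illustrative and `llm_call` is a hypothetical stub for your LLM API:

```python
# Routing: classify the input first, then dispatch it to a prompt specialized
# for that category. llm_call is a hypothetical stub; categories are illustrative.

def llm_call(prompt: str) -> str:
    return "general"  # stub; replace with a real API call

SPECIALIZED_PROMPTS = {
    "refund":    "You handle refund requests. Reply to:\n{q}",
    "technical": "You are a technical support expert. Reply to:\n{q}",
    "general":   "You answer general product questions. Reply to:\n{q}",
}

def route_and_answer(question: str) -> str:
    # A smaller, cheaper model is often sufficient for the classification step.
    label = llm_call(
        "Classify this customer question as refund, technical, or general. "
        f"Answer with one word only.\n\n{question}"
    ).strip().lower()
    prompt = SPECIALIZED_PROMPTS.get(label, SPECIALIZED_PROMPTS["general"])
    return llm_call(prompt.format(q=question))
```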
Workflow: Parallelization
Parallelization divides a task into sub-tasks that run at the same time. It suits work that naturally decomposes into parallel pieces, either to speed up processing or to judge a single task from multiple perspectives and increase confidence in the result. Note the difference from routing: in routing only one branch is executed (shown with dashed arrows in the original article's diagrams), whereas in parallelization all branches execute simultaneously (solid arrows).
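A sketch of the voting variant of parallelization, where the same check runs several times in parallel and the results are aggregated; `llm_call` is a hypothetical stub for your LLM API:

```python
# Parallelization (voting variant): run the same check in parallel branches
# and aggregate by majority vote. llm_call is a hypothetical stub.
from concurrent.futures import ThreadPoolExecutor

def llm_call(prompt: str) -> str:
    return "SAFE"  # stub; replace with a real API call

def is_content_safe(text: str, n_votes: int = 3) -> bool:
    prompt = f"Answer SAFE or UNSAFE only. Is the following content safe?\n\n{text}"
    with ThreadPoolExecutor(max_workers=n_votes) as pool:
        votes = list(pool.map(llm_call, [prompt] * n_votes))
    # Unlike routing, every branch executes; a majority of SAFE votes decides.
    return sum(v.strip().upper() == "SAFE" for v in votes) > n_votes / 2
```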
Workflow: Orchestrator-Workers
An orchestrator LLM dynamically breaks a task into sub-tasks, worker LLMs execute them, and the orchestrator synthesizes the results. The dynamic decomposition is essential because the sub-tasks cannot be predicted in advance, which is the key difference from parallelization.
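A sketch of the orchestrator-workers loop; `llm_call` is a hypothetical stub for your LLM API, and in practice the plan would be parsed from structured output rather than raw lines:

```python
# Orchestrator-workers: one LLM call plans the sub-tasks at run time, worker
# calls execute them, and a final call synthesizes the results.

def llm_call(prompt: str) -> str:
    return "subtask A\nsubtask B"  # stub; replace with a real API call

def orchestrate(task: str) -> str:
    # The orchestrator decides the sub-tasks dynamically -- they are not known
    # before the task arrives, which is the key difference from parallelization.
    plan = llm_call(f"Break this task into independent sub-tasks, one per line:\n{task}")
    subtasks = [line.strip() for line in plan.splitlines() if line.strip()]

    results = [
        llm_call(f"Overall task: {task}\nComplete this sub-task:\n{sub}")
        for sub in subtasks
    ]

    return llm_call(
        f"Task: {task}\nSub-task results:\n" + "\n---\n".join(results)
        + "\nSynthesize these into one final answer."
    )
```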
Workflow: Evaluator-Optimizer
One LLM generates a result while another evaluates it, and the two iterate until the evaluator accepts the output. This suits situations with clear evaluation criteria.
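A sketch of the generate-evaluate loop with a round limit as a safety valve; `llm_call` is a hypothetical stub for your LLM API:

```python
# Evaluator-optimizer: a generator LLM proposes, an evaluator LLM critiques,
# and the loop repeats until the evaluator accepts or a round limit is hit.

def llm_call(prompt: str) -> str:
    return "ACCEPT"  # stub; replace with a real API call

def generate_with_review(task: str, max_rounds: int = 3) -> str:
    draft = llm_call(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        verdict = llm_call(
            f"Task: {task}\nDraft:\n{draft}\n"
            "Reply ACCEPT if it fully meets the requirements, "
            "otherwise list concrete improvements."
        )
        if verdict.strip().upper().startswith("ACCEPT"):
            break
        draft = llm_call(f"Task: {task}\nDraft:\n{draft}\nRevise it using this feedback:\n{verdict}")
    return draft
```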
Agents
Agents are being used in production primarily because LLMs have matured in key capabilities: understanding complex inputs, reasoning and planning, using tools, and recovering from errors. Given a clear task, an Agent can plan independently, execute, and return results. During execution it is crucial that the Agent obtains feedback from the environment at each step and evaluates it; it can stop based on human feedback or preset conditions. Designing a good toolset and providing thorough documentation is important.
Agents are suitable for open-ended questions and situations where fixed steps are hard to predict. LLMs may go through many iterations, and you need to trust the decisions made by the LLM to a certain extent.
Using Agents can incur higher costs due to frequent LLM calls. It is recommended to conduct tests in a sandbox environment first.
The original article cites programming as a scenario where an Agent is a good fit.
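The skeleton below is a rough illustration of that loop, not Anthropic's implementation: the model chooses tools, tool results are fed back as environment feedback, and the loop stops on a completion signal or a step cap. `llm_call` and the tools are hypothetical stubs:

```python
# Agent loop sketch: the LLM plans, picks a tool, observes the result from the
# environment, and repeats until it signals completion or hits a step cap.
# llm_call and run_tests are hypothetical stubs for your own stack.

def llm_call(prompt: str) -> str:
    return "DONE all tests pass"  # stub; replace with a real API call

def run_tests(args: str) -> str:
    return "2 tests failed: test_parse, test_io"  # stub environment feedback

TOOLS = {"run_tests": run_tests}

def coding_agent(task: str, max_steps: int = 10) -> str:
    history = f"Task: {task}"
    for _ in range(max_steps):
        action = llm_call(
            history + "\nEither call a tool as 'TOOL <name> <args>' "
                      "or finish as 'DONE <answer>'."
        )
        if action.startswith("DONE"):                       # stopping condition
            return action[len("DONE"):].strip()
        _, name, *args = action.split(maxsplit=2)
        tool = TOOLS.get(name, lambda a: f"unknown tool: {name}")
        feedback = tool(" ".join(args))                     # ground truth from the environment
        history += f"\nAction: {action}\nObservation: {feedback}"
    return "Stopped: step limit reached"
```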
Summary
Achieving success in the LLM field is not about building the most complex system, but about constructing the right system that fits your needs. Start with simple prompts, use comprehensive evaluations to optimize, and only add multi-step Agent systems when simpler methods are insufficient.
Follow three core principles when implementing Agent systems:
  • Keep Agent design simple.
  • Prioritize transparency by clearly displaying the Agent’s planned steps.
  • Carefully design the agent-computer interface (ACI) with comprehensive tool documentation and testing.
References:
[1] Anthropic: Building Effective Agents https://www.anthropic.com/research/building-effective-agents
