Embracing a New Era of Agentic Systems in 2025

Anthropic recently published a blog post stating:

“2025 will be the year of Agentic systems.”

1. What Are Agentic Systems?

“Agentic systems” refer to artificial intelligence systems that possess autonomous decision-making and action capabilities. These systems can adapt to complex environments and pursue intricate goals with limited human supervision. Although the concepts of autonomy and agency trace back to early philosophical discussions, in the field of artificial intelligence OpenAI clearly defined and elaborated on the concept in its December 2023 white paper, “Practices for Governing Agentic AI Systems.” On December 20, 2024, Anthropic published its own in-depth exploration and practical summary of Agentic systems, the research post this article summarizes.

2. What Is an Agent? The Difference Between Workflow and Agent

“Agent” is a polysemous term. Some define it as a fully autonomous system capable of running independently for extended periods and using various tools to accomplish complex tasks. Others use it for more prescriptive implementations that follow predefined workflows. In Anthropic’s definition, all of these fall under the category of Agentic systems. An important architectural distinction is then drawn between Agents and workflows, with autonomy as the core difference:

  • Workflow: Coordinates LLMs (large language models) and tools through predefined code paths. Suitable for clear and fixed tasks.

  • Agent: Dynamically guides its own processes, flexibly uses tools, and autonomously decides how to complete tasks. Suitable for scenarios requiring large-scale flexibility and model-driven decision-making.

In other words, an Agent can adjust its path based on environmental feedback during task execution, while a workflow resembles a fixed highway.

3. What Are Frameworks? What Are the Mainstream Frameworks?

Many frameworks can simplify the implementation of Agentic systems. For example, the following are frameworks for building and managing AI agents and workflows, each with unique functions and advantages.

  • LangChain’s LangGraph

Website: https://langchain-ai.github.io/langgraph/

  • Amazon Bedrock’s AI Agent Framework

Website: https://aws.amazon.com/bedrock/agents/

  • Rivet, a drag-and-drop graphical user interface (GUI) LLM Workflow Builder

Website: https://rivet.ironcladapp.com/

  • Vellum, another GUI tool for building and testing complex workflows

Website: https://www.vellum.ai/

These frameworks help developers get started quickly by simplifying routine low-level tasks (like calling LLMs, defining and parsing tools, and chaining calls). However, they often add additional layers of abstraction, which may obscure the underlying prompts and responses, making debugging more challenging. They may also tempt developers to add unnecessary complexity when simple setups would suffice.

Anthropic recommends that developers start by using LLM APIs directly: many patterns can be implemented in just a few lines of code. If you do use a framework, make sure you understand the underlying code. Incorrect assumptions about what the framework is doing under the hood are a common source of error.
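For reference, here is a minimal sketch of what “using the LLM API directly” can look like with Anthropic’s official Python SDK; the model alias and prompt are placeholders to adjust for your own setup.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def llm(prompt: str, model: str = "claude-3-5-sonnet-latest") -> str:
    """One direct LLM call with no framework layer: the exact prompt and
    response stay visible, which keeps debugging straightforward."""
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text

print(llm("Explain the difference between a workflow and an agent in two sentences."))
```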

4. What Are Common Agentic Systems Patterns? How Should Building Blocks, Workflows, and Agents Be Combined?

In the blog post, Anthropic explored common patterns of Agentic systems in production environments. Anthropic begins with the basic building blocks—enhanced LLMs—and gradually increases complexity, from simple composite workflows to autonomous Agents.

4.1 Building Blocks: Enhanced LLM

The basic building block of Agentic systems is the enhanced LLM, which utilizes features such as retrieval, tools, and memory. Anthropic’s current models can proactively use these features—generating their own search queries, selecting appropriate tools, and determining what information to retain.

[Diagram: the enhanced LLM building block, with retrieval, tools, and memory]

Anthropic suggests focusing on two key aspects of implementation: tailoring these features to the specific use case, and ensuring they provide a simple, well-documented interface for your LLM. While there are many ways to implement these enhancements, one approach is Anthropic’s recently released Model Context Protocol (MCP), which allows developers to integrate with a growing ecosystem of third-party tools via a simple client implementation.
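To make the building block concrete, below is a hedged sketch of a tool-augmented call through the Messages API; the `get_weather` tool and its schema are invented for illustration, and a production system would also wire in retrieval and persistent memory.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()

# A hypothetical tool definition: the model decides on its own whether to call it.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Should I bring an umbrella in Tokyo today?"}],
)

# If the model chose to use the tool, the response contains a tool_use block
# with the arguments the model generated itself.
for block in response.content:
    if block.type == "tool_use":
        print("Model wants to call:", block.name, block.input)
```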

4.2 Workflows: Prompt Chaining

Prompt chaining breaks tasks down into a series of steps, with each LLM call processing the output of the previous call. Programmatic checks (see “gate” in the diagram below) can be added at any intermediate step to ensure the process remains on track.

[Diagram: the prompt chaining workflow, with a programmatic “gate” between steps]

When to use this workflow: This workflow is ideal for situations that can be easily and cleanly decomposed into fixed subtasks. The main goal is to trade off latency for higher accuracy by making each LLM call a simpler task.

Useful examples of prompt chaining:

  • Generating marketing copy and then translating it into another language.

  • Writing a document outline, checking if the outline meets certain criteria, and then writing the document based on the outline.
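A minimal sketch of prompt chaining with a programmatic gate between steps, assuming the Anthropic Python SDK, might look like this; the word-count check stands in for whatever validation your task actually needs.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()

def llm(prompt: str) -> str:
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

# Step 1: generate marketing copy.
copy = llm("Write a 100-word marketing blurb for a reusable water bottle.")

# Gate: a simple programmatic check before the next LLM call.
if len(copy.split()) > 150:
    raise ValueError("Copy too long; stopping the chain for review.")

# Step 2: translate the validated output of step 1.
translation = llm(f"Translate the following marketing copy into French:\n\n{copy}")
print(translation)
```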

4.3 Workflows: Routing

Routing classifies inputs and directs them to specialized downstream tasks. This workflow allows for separation of concerns and builds more specialized prompts. Without this workflow, optimizing for one input may harm the performance of others.

[Diagram: the routing workflow]

When to use this workflow: Routing is suitable for cases where different categories are best handled separately and can be accurately processed by LLMs or more traditional classification models/algorithms.

Useful examples of routing:

  • Directing different types of customer service inquiries (general questions, refund requests, technical support) to different downstream processes, prompts, and tools.

  • Routing simple/common questions to smaller models like Claude 3.5 Haiku and routing difficult/unusual questions to more powerful models like Claude 3.5 Sonnet to optimize cost and speed.
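A routing sketch under the same assumptions: a cheap classification call first, then dispatch to a category-specific model and prompt. The categories, system prompts, and model aliases below are illustrative, not prescriptive.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()

ROUTES = {
    # Illustrative mapping: a fast model for general questions, a stronger
    # model with a specialized prompt for refunds and technical support.
    "general": ("claude-3-5-haiku-latest", "You answer general product questions."),
    "refund": ("claude-3-5-sonnet-latest", "You handle refund requests according to policy."),
    "technical": ("claude-3-5-sonnet-latest", "You are a technical support engineer."),
}

def classify(ticket: str) -> str:
    msg = client.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=10,
        messages=[{"role": "user", "content":
            f"Classify this ticket as exactly one word: general, refund, or technical.\n\n{ticket}"}],
    )
    label = msg.content[0].text.strip().lower()
    return label if label in ROUTES else "general"

def handle(ticket: str) -> str:
    model, system = ROUTES[classify(ticket)]
    msg = client.messages.create(
        model=model, max_tokens=1024, system=system,
        messages=[{"role": "user", "content": ticket}],
    )
    return msg.content[0].text

print(handle("My keyboard stopped connecting over Bluetooth after the update."))
```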

4.4 Workflows: Parallelization

Multiple LLM calls can sometimes work on a task simultaneously, with their outputs aggregated programmatically. This workflow (parallelization) comes in two main variants:

  • Segmentation: Breaking a task down into independently running subtasks.

  • Voting: Running the same task multiple times to obtain different outputs.

[Diagram: the parallelization workflow, with segmentation and voting variants]

When to use this workflow: Parallelization is effective when subtasks can be decomposed for speed or when multiple perspectives or attempts are needed for higher confidence results. For complex tasks with multiple considerations, LLMs typically perform better when each consideration is handled by a separate LLM call, allowing focus on each specific aspect.

Useful examples of parallelization:

  • Segmentation: Implementing guardrails where one model instance handles user queries while another filters whether they contain inappropriate content or requests. This often performs better than having the same LLM call handle both the guardrail and core response simultaneously.

  • Automated evaluations to assess LLM performance, where each LLM call evaluates different aspects of the model’s performance under a given prompt.

  • Voting: Reviewing a piece of code for vulnerabilities, where several different prompts review the code and flag it when issues are found.

  • Evaluating whether given content is inappropriate, where multiple prompts assess different aspects or require different voting thresholds to balance false positives and negatives.
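As one example of the voting variant, the sketch below runs the same vulnerability-review prompt several times in parallel and flags the code on a simple majority; the code snippet, thread-pool size, and threshold are all illustrative choices.

```python
# pip install anthropic
from concurrent.futures import ThreadPoolExecutor
import anthropic

client = anthropic.Anthropic()

# Illustrative snippet under review (contains an obvious command-injection risk).
CODE_SNIPPET = "os.system('rm -rf ' + user_supplied_path)"

def review_once(_: int) -> bool:
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=5,
        messages=[{"role": "user", "content":
            f"Does this code contain a security vulnerability? Answer YES or NO only.\n\n{CODE_SNIPPET}"}],
    )
    return msg.content[0].text.strip().upper().startswith("YES")

# Run the same task several times in parallel and vote on the results.
with ThreadPoolExecutor(max_workers=3) as pool:
    votes = list(pool.map(review_once, range(3)))

if sum(votes) >= 2:  # simple majority threshold
    print("Flagged: likely vulnerability", votes)
else:
    print("Not flagged", votes)
```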

4.5 Workflows: Orchestrator-Worker

In the orchestrator-worker workflow, a central LLM dynamically decomposes tasks, delegates them to worker LLMs, and synthesizes their results.

[Diagram: the orchestrator-worker workflow]

When to use this workflow: This workflow is particularly suitable for complex tasks where the required subtasks cannot be predicted (for example, in coding, the number of files needing changes and the nature of changes in each file may depend on the task). While structurally similar, its main distinction from parallelization is its flexibility—subtasks are not predefined but determined by the orchestrator based on specific inputs.

Useful examples of orchestrator-worker:

  • Coding products that require complex changes to multiple files each time.

  • Search tasks that involve gathering and analyzing information from multiple sources for possibly relevant information.
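A compact sketch of the orchestrator-worker pattern, assuming the Anthropic Python SDK: the orchestrator decomposes the task at run time, workers handle the subtasks, and a final call synthesizes the results. The planning and synthesis prompts are invented for illustration.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()

def llm(prompt: str) -> str:
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

task = "Add basic input validation to the signup form of a small web app."

# Orchestrator: decompose the task into subtasks at run time (not predefined).
plan = llm(f"Break this coding task into 2-4 concrete subtasks, one per line, no numbering:\n\n{task}")
subtasks = [line.strip().lstrip("-* ") for line in plan.splitlines() if line.strip()]

# Workers: one LLM call per subtask chosen by the orchestrator.
results = [llm(f"Complete this subtask and describe the change:\n\n{s}") for s in subtasks]

# Synthesizer: merge the worker outputs into a single answer.
summary = llm("Combine these subtask results into one change summary:\n\n" + "\n\n".join(results))
print(summary)
```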

4.6 Workflows: Evaluator-Optimizer

In the evaluator-optimizer workflow, one LLM call generates a response while another LLM call provides evaluations and feedback in a loop.

[Diagram: the evaluator-optimizer workflow]

When to use this workflow: This workflow is particularly effective when there are clear evaluation criteria and when iterative refinement provides measurable value. Two signs of a good fit are: first, that LLM responses can be demonstrably improved when a human articulates their feedback; and second, that the LLM itself can provide such feedback. This is analogous to the iterative revision a human writer goes through when polishing a document.

Useful examples of evaluator-optimizer:

  • Literary translation, where an evaluator LLM can provide useful critiques of nuances the translator LLM may initially miss.

  • Complex search tasks requiring multiple rounds of searching and analysis to gather comprehensive information, where the evaluator decides if further searches are needed.
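A minimal evaluator-optimizer loop might look like the sketch below, with one call drafting a translation and another critiquing it until it approves or a round limit is hit; the APPROVED convention and the three-round cap are arbitrary choices for illustration.

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()

def llm(prompt: str) -> str:
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

source = "Qui vivra verra."  # illustrative source text

# Optimizer: produce an initial draft.
draft = llm(f"Translate this French proverb into natural English, preserving its tone:\n\n{source}")

# Evaluator loop: critique and refine, with a bounded number of rounds.
for _ in range(3):
    feedback = llm(
        "You are a literary translation critic. If the translation below is faithful "
        "and natural, reply with exactly APPROVED; otherwise give one specific "
        f"improvement.\n\nSource: {source}\nTranslation: {draft}"
    )
    if feedback.strip().upper().startswith("APPROVED"):
        break
    draft = llm(
        f"Revise the translation using this feedback.\n\nSource: {source}\n"
        f"Translation: {draft}\nFeedback: {feedback}"
    )

print(draft)
```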

4.7 Agents

As LLMs mature in key capabilities (understanding complex inputs, reasoning and planning, using tools reliably, and recovering from errors), Agents are emerging in production. Agents begin their work either on a command from a human user or through an interactive discussion with one. Once the task is clear, the Agent plans and operates independently, returning to the human when it needs more information or judgment. During execution, the Agent must obtain “ground truth” feedback from the environment at each step (e.g., results of tool calls or code execution) to assess its progress. The Agent can pause for human feedback at checkpoints or when it hits an obstacle. Tasks typically terminate upon completion, but stopping conditions (e.g., a maximum number of iterations) are often included to maintain control.

Agents can handle complex tasks, but their implementation is often straightforward: typically they are just LLMs using tools in a loop based on environmental feedback. It is therefore crucial to design the toolset and its documentation clearly and thoughtfully. Anthropic expands on best practices for tool development in Appendix 2 of the original post (“Prompt Engineering Your Tools”).

[Diagram: the autonomous Agent loop]

When to use Agents: Agents can be used for open-ended questions where the number of required steps is difficult or impossible to predict, and you cannot hard-code a fixed path. LLMs may run many iterations, and you must have a certain level of trust in their decision-making. The autonomy of Agents makes them particularly suitable for scaling tasks in trusted environments.

Useful examples of Agents: The following examples come from Anthropic’s own implementations:

  • A coding Agent for solving SWE-bench tasks, which involve editing multiple files based on a task description;

  • Anthropic’s “computer use” reference implementation, where Claude uses a computer to complete tasks.
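For orientation, here is a stripped-down sketch of the loop described above: the model calls tools based on environmental feedback and stops on completion or after a maximum number of iterations. The single `list_files` tool is invented for illustration; a real agent would expose a carefully designed, well-documented toolset.

```python
# pip install anthropic
import os
import anthropic

client = anthropic.Anthropic()

# A single illustrative tool; real agents expose a richer, documented toolset.
tools = [{
    "name": "list_files",
    "description": "List the files in a directory.",
    "input_schema": {"type": "object",
                     "properties": {"path": {"type": "string"}},
                     "required": ["path"]},
}]

def run_tool(name: str, args: dict) -> str:
    if name == "list_files":
        return "\n".join(os.listdir(args["path"]))
    return f"Unknown tool: {name}"

messages = [{"role": "user", "content": "Which Python files are in the current directory?"}]

for _ in range(5):  # stopping condition: maximum number of iterations
    response = client.messages.create(
        model="claude-3-5-sonnet-latest", max_tokens=1024,
        tools=tools, messages=messages,
    )
    messages.append({"role": "assistant", "content": response.content})
    if response.stop_reason != "tool_use":
        print(next((b.text for b in response.content if b.type == "text"), ""))
        break
    # Feed tool results ("ground truth" from the environment) back to the model.
    results = [{"type": "tool_result", "tool_use_id": b.id,
                "content": run_tool(b.name, b.input)}
               for b in response.content if b.type == "tool_use"]
    messages.append({"role": "user", "content": results})
```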


4.8 Combining and Customizing These Patterns

These building blocks are not prescriptive. They are common patterns that developers can shape and combine to fit different use cases. As with any LLM functionality, the key to success is measuring performance and iterating on implementation. Once again, it should be emphasized that complexity should only be considered when it can significantly improve outcomes.

5. Analogy: How Do COZE and the Above Concepts Correspond?

Coze is an AI chatbot and application development platform launched by ByteDance, aimed at helping users quickly create and deploy AI applications. Coze offers a range of features, including plugins, knowledge bases, long-term memory, scheduled tasks, and workflows, which map onto the building blocks, workflows, and Agent patterns discussed above. The mapping below reflects Shallow Autumn’s understanding; if you see it differently, feel free to discuss on WeChat:

Building Blocks:

  • Plugins: Coze integrates a wealth of plugin tools, allowing users to extend the capabilities of the bot by adding plugins for information reading, travel, efficiency, and more.

  • Knowledge Base: Coze provides knowledge base functionality, supporting various content formats and upload methods, allowing users to store and manage data so that AI can answer questions based on specific data.

  • Long-term Memory: Through database memory capabilities, Coze allows AI Bots to persistently remember important parameters or content from conversations, providing more personalized services.

Workflows:

  • Workflow Design: Coze’s workflow functionality provides numerous flexible and combinable nodes, including large language models (LLMs), custom code, and judgment logic. Users can quickly build workflows by dragging and dropping to handle logically complex tasks with high stability requirements.

Agents:

  • Agents: Coze allows users to create chatbots or agents focused on specific functions and domains, which can perform specific tasks based on user needs and share with other users.

Through these features, Coze enables users to build AI applications ranging from simple to complex, meeting various business needs. Whether enhancing AI capabilities through building blocks like plugins and knowledge bases, implementing complex task flows through workflow design, or creating autonomous agents, Coze provides the corresponding support.

Reference Article: https://www.anthropic.com/research/building-effective-agents

Recommended Reading: “A Guide to Understanding What LLM-based AI Agents Are”

—— Thank you for reading this far ——

The future will not be replaced by AI, but by those who can use AI. Feel free to scan the QR code to add me on WeChat. Let’s discuss the consensus and understanding of the AI era and keep up with this wave of AI.
