Three Major Failure Modes of AI Agents: Planning, Tool, and Efficiency Issues

The unprecedented capabilities of foundation models have opened the door to AI agent applications that were previously unimaginable. These new capabilities finally make it possible to develop autonomous, intelligent agents that can serve as our assistants, colleagues, and coaches. They can help us create websites, collect data, plan trips, conduct market research, manage customer accounts, automate data entry, prepare for interviews, interview candidates, negotiate deals, and more. The possibilities seem endless, and the potential economic value of these agents is immense.

https://huyenchip.com/2025/01/07/agents.html


The most worthwhile sections of the article are its discussion of designing AI-friendly computational interfaces and, above all, the section on agent failure modes and evaluation.

Agent Failure Modes and Evaluation

Evaluation is about detecting failures. The more complex the tasks executed by the agents, the more potential failure points there are. In addition to the common failure modes shared by all AI applications, AI agent applications also have unique failures due to planning, tool execution, and efficiency issues. Some failures are easier to detect than others.

To evaluate an AI agent, identify its failure modes and measure how often each failure mode occurs.

1. Planning Failures

Planning is hard and can fail in many ways. The most common planning failure mode is tool use failure: the agent generates a plan that contains one or more of the following errors (a minimal validation sketch follows the examples).

Invalid tool. For example, it generates a plan that includes bing_search, which is not in the tool list.

Valid tool, invalid parameters. For example, it calls lbs_to_kg with two parameters, but this function requires only one parameter, lbs.

Valid tool, incorrect parameter values. For example, it calls lbs_to_kg with the correct parameter, lbs, but uses the value 100 when it should be 120.
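As a concrete illustration, the sketch below shows how the first two error types can be caught by validating each generated tool call against a tool registry before execution; the third type, incorrect parameter values, usually cannot be caught syntactically and needs task-specific checks or a judge. The TOOLS registry, the validate_tool_call helper, and the example calls are hypothetical.

```python
# Minimal sketch: validate a generated tool call against the registered tools.
# The registry format and helper are hypothetical; real agent frameworks expose
# richer schemas (types, ranges, descriptions).
import inspect


def lbs_to_kg(lbs: float) -> float:
    """Convert pounds to kilograms."""
    return lbs * 0.45359237


TOOLS = {"lbs_to_kg": lbs_to_kg}  # bing_search is deliberately not registered


def validate_tool_call(name: str, kwargs: dict) -> list[str]:
    """Return a list of problems with a single tool call (empty list = valid)."""
    if name not in TOOLS:
        return [f"invalid tool: {name}"]
    expected = set(inspect.signature(TOOLS[name]).parameters)
    errors = []
    if set(kwargs) - expected:
        errors.append(f"unexpected parameters: {sorted(set(kwargs) - expected)}")
    if expected - set(kwargs):
        errors.append(f"missing parameters: {sorted(expected - set(kwargs))}")
    # Note: incorrect parameter *values* (e.g., 100 instead of 120) require
    # task-specific checks or a human/AI judge; they are not caught here.
    return errors


print(validate_tool_call("bing_search", {"query": "agents"}))       # invalid tool
print(validate_tool_call("lbs_to_kg", {"lbs": 120, "unit": "lb"}))  # invalid parameters
print(validate_tool_call("lbs_to_kg", {"lbs": 120}))                # valid call
```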

Another planning failure mode is goal failure: the agent fails to achieve its goal, either because the plan does not solve the task or because it solves the task without respecting the constraints. To illustrate, imagine you ask the model to plan a two-week trip from San Francisco to India with a budget of $5,000. The agent might plan a trip from San Francisco to Vietnam, or plan a two-week trip from San Francisco to India that far exceeds the budget.

A commonly overlooked constraint in agent evaluation is time. In many cases, time matters less for an agent, because you can hand it a task and check back only when the task is done. In many other cases, however, the agent becomes less useful as time passes. For example, if you ask the agent to prepare a grant proposal and it finishes only after the grant deadline, the agent is not very helpful.

An interesting pattern of planning failure is caused by errors in reflection: the agent is convinced that it has completed a task when it has not. For example, you ask the agent to assign 50 people to 30 hotel rooms. The agent may assign only 40 people and yet insist that the task is complete.
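One way to catch this kind of reflection error is to verify the agent’s completion claim programmatically rather than trusting its self-assessment. Below is a minimal sketch for the hotel-room example; the assignment format (a dict mapping person to room number) is a hypothetical stand-in for your agent’s output.

```python
# Minimal sketch: check the agent's "task complete" claim against the actual
# requirements (all 50 people assigned, room numbers within 1..30).
def check_room_assignment(assignment: dict[str, int],
                          people: list[str],
                          num_rooms: int) -> list[str]:
    """Return unmet conditions; an empty list means the completion claim holds."""
    issues = []
    unassigned = [p for p in people if p not in assignment]
    if unassigned:
        issues.append(f"{len(unassigned)} people left unassigned")
    bad_rooms = {r for r in assignment.values() if not 1 <= r <= num_rooms}
    if bad_rooms:
        issues.append(f"invalid room numbers: {sorted(bad_rooms)}")
    return issues


people = [f"person_{i}" for i in range(50)]
# The agent claims success but has only assigned the first 40 people.
agent_assignment = {p: (i % 30) + 1 for i, p in enumerate(people[:40])}
print(check_room_assignment(agent_assignment, people, num_rooms=30))
# -> ['10 people left unassigned']
```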

To evaluate an agent's planning failures, one option is to create a planning dataset where each example is a tuple (task, tool list). For each task, use the agent to generate K plans, then calculate the following metrics (a scoring sketch follows the list):

How many of the generated plans are valid?

For a given task, how many plans does the agent need to generate to obtain a valid plan?

How many of the tool calls are valid?

What is the frequency of invalid tools being called?

What is the frequency of valid tools being called with invalid parameters?

What is the frequency of valid tools being called with incorrect parameter values?
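To make this concrete, here is a minimal sketch of how these metrics might be aggregated over K generated plans per task. The generate_plans callable and the plan format (a list of tool-call dicts) are hypothetical stand-ins for your agent and its output schema, and is_valid_plan reuses the validate_tool_call helper sketched earlier; the incorrect-parameter-value rate is omitted because it typically needs task-specific checks or an AI judge.

```python
# Minimal sketch: score K generated plans per (task, tool list) example.
def is_valid_plan(plan: list[dict]) -> bool:
    """A plan is valid here if every tool call in it passes validation."""
    return all(not validate_tool_call(step["tool"], step["kwargs"]) for step in plan)


def planning_metrics(dataset, generate_plans, k: int = 5) -> dict:
    total_plans = valid_plans = total_calls = invalid_tool = invalid_params = 0
    attempts_until_valid = []  # per task: how many plans until the first valid one
    for task, tool_list in dataset:
        plans = generate_plans(task, tool_list, k)  # k candidate plans from the agent
        attempts_until_valid.append(
            next((i + 1 for i, p in enumerate(plans) if is_valid_plan(p)), None)
        )
        for plan in plans:
            total_plans += 1
            valid_plans += is_valid_plan(plan)
            for step in plan:
                total_calls += 1
                errors = validate_tool_call(step["tool"], step["kwargs"])
                invalid_tool += any("invalid tool" in e for e in errors)
                invalid_params += any("parameters" in e for e in errors)
    return {
        "valid_plan_rate": valid_plans / max(total_plans, 1),
        "invalid_tool_rate": invalid_tool / max(total_calls, 1),
        "invalid_param_rate": invalid_params / max(total_calls, 1),
        "attempts_until_valid": attempts_until_valid,
    }
```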

Analyze the agent’s outputs to look for patterns: On which types of tasks does the agent fail more often? Can you hypothesize why? Which tools does the model most often get wrong?

Some tools may be more difficult for the agent to use. You can improve the agent’s ability to use difficult tools through better prompts, more examples, or fine-tuning. If all of this fails, you may need to consider replacing the tool with one that is easier to use.

2. Tool Failures

Tool failures occur when the correct tool is used but the tool returns the wrong output. For example, an image caption generator returns an incorrect description, or an SQL query generator returns a faulty SQL query.

If the agent generates only high-level plans and a separate translation module converts each planned action into executable commands, failures can also occur because of translation errors.

Tool failures depend on the tools. Each tool needs to be tested independently. Always print out each tool call and its output so you can check and evaluate them. If you have a translator, create benchmarks to evaluate it.
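One lightweight way to make every tool call inspectable is to wrap each tool in a logging decorator, as in the sketch below. The decorator is illustrative (lbs_to_kg is reused from the earlier examples); in practice you would persist these records to a file or tracing system rather than print them.

```python
# Minimal sketch: log every tool call and its output as one JSON line.
import functools
import json
import time


def logged_tool(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = fn(*args, **kwargs)
        print(json.dumps({
            "tool": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "output": result,
            "latency_s": round(time.time() - start, 4),
        }, default=str))
        return result
    return wrapper


@logged_tool
def lbs_to_kg(lbs: float) -> float:
    return lbs * 0.45359237


lbs_to_kg(120)  # prints the call, its output, and its latency
```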

Detecting missing-tool failures requires knowing which tools should have been used. If your agent frequently fails in a specific domain, it may lack the right tools for that domain. Work with human domain experts to observe which tools they would use.

3. Efficiency Issues (Timeliness and Cost of Task Completion)

The agent may generate a valid plan that completes a task using the correct tools and still be inefficient. Here are some metrics you might want to track to assess the agent’s efficiency (a tracking sketch appears at the end of this section):

How many steps, on average, does the agent take to complete a task?

How much, on average, does it cost the agent to complete a task?

How long does each action typically take?

Are there any particularly time-consuming or costly actions?

You can compare these metrics with your baseline, which could be another agent or a human operator. When comparing AI agents and human agents, remember that the operating modes of humans and AI are very different, so what is efficient for humans may be inefficient for AI, and vice versa. For example, visiting 100 web pages may be inefficient for a human agent who can only visit one page at a time but may be trivial for an AI agent that can visit all pages simultaneously.
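As an illustration, here is a minimal sketch of instrumentation that records these numbers per task. The run_step callable, the token accounting, and the per-token price are hypothetical placeholders for your own agent loop and provider pricing; aggregated over a benchmark of tasks, the resulting stats give the averages above, which you can then compare against another agent or a human baseline.

```python
# Minimal sketch: track steps, per-action latency, and token cost for one task.
import time
from dataclasses import dataclass, field

PRICE_PER_1K_TOKENS = 0.01  # hypothetical pricing


@dataclass
class RunStats:
    steps: int = 0
    tokens: int = 0
    action_latencies: list[float] = field(default_factory=list)

    @property
    def cost(self) -> float:
        return self.tokens / 1000 * PRICE_PER_1K_TOKENS


def run_with_tracking(task, run_step, max_steps: int = 20) -> RunStats:
    """run_step(task, state) -> (state, tokens_used, done) stands in for your agent."""
    stats, state = RunStats(), None
    for _ in range(max_steps):
        start = time.time()
        state, tokens, done = run_step(task, state)
        stats.action_latencies.append(time.time() - start)
        stats.steps += 1
        stats.tokens += tokens
        if done:
            break
    return stats
```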


The trajectory of AI agent development inevitably leads toward greater autonomy, and with it the likelihood of the three failure modes above increases. Improvements in foundation model capabilities can mitigate some of these failures; the rest will have to be addressed through application-level engineering.
