In October 2024, Georgetown University's Center for Security and Emerging Technology (CSET) released a risk-assessment report on artificial intelligence titled "Through the Chat Window and Into the Real World: Preparing for AI Agents." A large number of startups and major tech companies are building, deploying, and selling AI agents, which could bring far-reaching changes to society and the economy. Prior to the report, CSET held a workshop to discuss the latest advances in building AI agents and the potential policy implications if the technology continues to develop. The report summarizes the key themes and conclusions of that workshop, explores the definition of AI agents, analyzes the opportunities and challenges they present, and proposes three categories of response measures.
1. Definition of AI Agents
Drawing on past research, the report argues that more agentic artificial intelligence systems share four characteristics. First is goal complexity: more agentic systems pursue complex, long-term goals, sometimes several different goals at once, while less agentic systems execute individual, more narrowly defined tasks. Second is environmental complexity: more agentic systems can operate effectively in more open, complex environments, where they can affect more states and take more kinds of actions, while less agentic systems operate effectively only in simpler, more predictable environments. Third is independent planning and adaptation: more agentic systems can generate their own plans or pathways to achieve a given goal and adapt to changing circumstances as needed, while less agentic systems follow pre-specified, step-by-step instructions. Fourth is direct action: more agentic systems can act directly on their environment, whether real or virtual, while less agentic systems provide information or suggestions to human users, who must carry out any actions themselves.
2. Current Status of AI Agents
The new wave of interest in AI agents began in 2023. As large language models advanced rapidly, open-source projects such as AutoGPT and BabyAGI began exploring how to build software wrappers around these models, transforming them from chatbots into AI agents. These agents differ from chatbots: rather than merely conversing with users and suggesting steps toward a goal, they can be assigned a specific goal, such as "Create a pitch document for a new startup idea and send it to five relevant investors," and then generate and execute a plan to achieve it, taking actions such as browsing the internet or running code. The agent keeps executing until it achieves its goal or gets stuck. These early large language model-based agents have not proved practical, owing to high error rates, but they demonstrate an important point: the gap between generating text and taking action may be much smaller than previously imagined.
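To make this loop concrete, below is a minimal sketch, in Python, of how such an agent loop might be structured. The stubbed call_model function, the tool names, and the prompt format are illustrative assumptions for this summary, not the actual design of AutoGPT, BabyAGI, or any other specific system:

```python
# Minimal sketch of an AutoGPT-style agent loop (illustrative only; the
# model call and tools here are stubbed placeholders, not any real API).

def call_model(prompt: str) -> dict:
    """Stand-in for a large language model call that returns the next action."""
    # A real agent would send `prompt` to an LLM and parse its reply.
    return {"tool": "finish", "argument": "done"}

def browse(url: str) -> str:
    return f"<contents of {url}>"          # placeholder web-browsing tool

def run_code(source: str) -> str:
    return "<execution output>"            # placeholder code-execution tool

TOOLS = {"browse": browse, "run_code": run_code}

def run_agent(goal: str, max_steps: int = 10) -> None:
    history: list[str] = []
    for _ in range(max_steps):             # loop until done or stuck
        prompt = f"Goal: {goal}\nHistory: {history}\nChoose the next action."
        action = call_model(prompt)
        if action["tool"] == "finish":     # model judges the goal achieved
            return
        result = TOOLS[action["tool"]](action["argument"])
        history.append(f"{action['tool']} -> {result}")  # feed observations back

run_agent("Create a pitch document and send it to five investors")
```

In a real system, call_model would query an actual language model and the tools would perform real web requests or sandboxed code execution; the plan-act-observe loop, however, is the essential pattern.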
The basic architecture of these large language model-based agents pairs the model itself (a large language or multimodal model) with what is called "scaffolding" software built around it. The scaffolding is typically a relatively simple piece of software that acts as an interface between the model and the external world: it automatically assembles the prompts the model receives, including instructions for operating a web browser or running code, and converts the model's output into a format that can interact with external systems. In this way, the scaffolding enables large language models to interact with a wide range of online tools and services. As of mid-2024, many large AI companies were clearly committed to transforming their chatbots into AI agents. At its developer conference in May 2024, Microsoft committed to launching AI tools "that perform tasks as independent agents with more autonomy and less human intervention," while Google showcased an upcoming AI agent capable of executing tasks autonomously across multiple applications. Startups such as Adept, MultiOn, and Lindy have also raised hundreds of millions of dollars in funding, promising to build AI agents capable of flexibly executing complex tasks. For now, however, these agents remain very limited: because each step carries a meaningful chance of failure, the probability of completing a task falls as the number of required steps grows. An agent that succeeds at each step 90 percent of the time, for example, completes a 20-step task only about 12 percent of the time (0.9^20 ≈ 0.12). With such odds, AI agents may not be particularly useful at present, and widespread adoption without further progress seems unlikely.
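The scaffolding's interface role can be sketched as follows, again as a hedged illustration: it assembles the prompt the model sees and translates the model's textual reply into an action on an external system. The JSON action format and function names here are assumptions invented for this example, not taken from the report or any particular framework:

```python
# Sketch of the "scaffolding" layer described above: it wraps prompts with
# tool instructions and converts raw model text back into external actions.
import json

TOOL_SPEC = """You may reply with JSON: {"tool": "browse"|"run_code", "argument": ...}"""

def build_prompt(goal: str, observations: list[str]) -> str:
    # The scaffolding, not the model, assembles the full prompt each turn.
    return f"{TOOL_SPEC}\nGoal: {goal}\nObservations so far: {observations}"

def dispatch(model_output: str) -> str:
    """Translate the model's text into an actual call on an external system."""
    action = json.loads(model_output)      # scaffolding parses the reply
    if action["tool"] == "browse":
        return f"fetched {action['argument']}"   # placeholder for a real HTTP fetch
    if action["tool"] == "run_code":
        return "ran code in a sandbox"           # placeholder for real execution
    raise ValueError(f"unknown tool: {action['tool']}")

print(build_prompt("find the report", []))
print(dispatch('{"tool": "browse", "argument": "https://example.com"}'))
```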
Moreover, considerable uncertainty surrounds the development and adoption of AI agents. First, how quickly will their sophistication and practicality improve? Many researchers, corporate executives, and startup founders believe that the current unreliability of AI agents is only temporary, and several different research paths could significantly enhance their capabilities. Second, will AI agents be general-purpose or designed for specific use cases? Given agents' current performance limitations, it may be more feasible in the short term to develop smaller agents with well-defined goals for specific application environments. Third, which use cases are most likely to succeed first? Experts point out that in fields like software engineering it is generally easier to train and fine-tune agent behavior, because feedback or validation loops are relatively easy to build there. Beyond software engineering, experts also expect higher adoption in other purely software domains (such as cybersecurity) and in use cases with short feedback cycles (such as customer support); economic incentives may also influence which types of AI agents gain widespread application. Fourth, what kind of business model and market structure will dominate? One possibility is that, if large language model-based agents become widespread, a few large language model companies will sell licenses for agents developed entirely in-house. Another is a two-layer structure in which some companies develop advanced large language models while others build agents on top of them. Yet another is that, if today's leading large language model companies cannot maintain a competitive edge over open-source alternatives, individuals and companies may simply use free AI agents. And if future advances in AI agents do not rely on large language models, still other possibilities may arise.
3. Opportunities, Risks, and Other Impacts
Just as no clear, sharp boundary separates AI agents from other types of AI systems, the benefits and risks they bring are often continuations of broader AI issues. The opportunities presented by agent development are diverse, from improving operational efficiency in businesses to empowering individual users. Beyond the existing challenges of artificial intelligence, however, distinctive characteristics of AI agents, such as the ability to act without human intervention and the potential to form personal relationships with users, may create new harms: accidents, misuse of the technology, unclear allocation of liability, data-governance and privacy problems, degradation of human skills, technological dependence and the vulnerabilities it creates, illegal automated collusion between agents, and impacts on the workforce.
4. Protective and Intervention Measures
The rise of AI agents presents many questions for policymakers and regulators, and the uncertainty surrounding whether, how, and when AI agents will become mainstream complicates these issues further. Technological, legal, and other protective measures will play a key role in managing these issues.
(1) Assessing AI Agents and Their Impact
Given the complexity of AI agents and the uncertainty of their growth in use, experts expressed interest in improving methods for assessing agent performance. If agents' performance across different environments can be measured and monitored, tracking and predicting their progress becomes much easier, allowing preparation for their potential impact. A second, more direct way to assess AI agents is to collect ecosystem-level data on their effects, much as environmental monitoring of pollution guards against manufacturers failing to measure and manage their own factory emissions. Finally, benchmark tests, which are typically used to evaluate and compare the performance of different AI models, can serve as important tools for tracking the progress of AI agents.
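As a hedged illustration of what benchmark-style tracking might look like, the sketch below runs a hypothetical agent on each benchmark task repeatedly and records completion rates, numbers that could then be tracked over time. The tasks and the agent_attempt function are placeholders invented for this example, not any real benchmark:

```python
# Sketch of benchmark-style tracking of agent progress: run each task
# several times and record the completion rate (all names hypothetical).
import random

def agent_attempt(task: str) -> bool:
    """Placeholder for running an agent on a task and checking success."""
    return random.random() < 0.4           # pretend the agent succeeds 40% of the time

def benchmark(tasks: list[str], trials: int = 20) -> dict[str, float]:
    # Repeated trials smooth out the randomness of individual agent runs.
    return {
        task: sum(agent_attempt(task) for _ in range(trials)) / trials
        for task in tasks
    }

scores = benchmark(["book a flight", "file an expense report", "fix a failing test"])
for task, rate in scores.items():
    print(f"{task}: {rate:.0%} completion rate")   # track these numbers over time
```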
(2) Technical Protective Measures
Researchers might implement technical protective measures at three levels: the model level, the system level, and the ecosystem level. The model level concerns the underlying statistical model at the core of a large language model-based agent, which can be viewed as the agent's "engine." The system level encompasses the model plus the scaffolding and other components built around it that support interaction with users and external tools; safeguards here can specify what actions the agent can and cannot take and record its interactions. The ecosystem level covers the broader space in which AI agents interact with one another and with other systems, such as social media platforms.
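A system-level safeguard of the kind just described might look like the following sketch: the scaffolding checks each requested action against an allowlist and records every attempt in an audit log. All names here are illustrative assumptions, not a real framework's API:

```python
# Illustrative system-level safeguard in the scaffolding: an allowlist
# restricts which actions the agent may take, and every attempt is logged.
import datetime

ALLOWED_ACTIONS = {"browse", "summarize"}          # e.g., no payments or emails
AUDIT_LOG: list[str] = []

def guarded_execute(action: str, argument: str) -> str:
    timestamp = datetime.datetime.now().isoformat()
    if action not in ALLOWED_ACTIONS:              # system-level refusal
        AUDIT_LOG.append(f"{timestamp} DENIED {action}({argument})")
        return "action blocked by policy"
    AUDIT_LOG.append(f"{timestamp} ALLOWED {action}({argument})")
    return f"executed {action}"                    # placeholder for the real call

guarded_execute("browse", "https://example.com")
guarded_execute("send_payment", "$500")            # blocked and recorded
print("\n".join(AUDIT_LOG))
```

Placing the check and the log in the scaffolding rather than the model means the restriction holds even when the model misbehaves, though, as the conclusion below notes, such visibility can trade off against privacy.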
(3) Legal Protective Measures
Even without new legislation or regulation, a wide range of existing laws may bear on managing the impact of AI agents. First, many existing laws already apply to them. The U.S. Federal Trade Commission, for instance, has stated, "There is no exemption for artificial intelligence under existing laws concerning civil rights, fair competition, and consumer protection." Similarly, AI agents deployed in heavily regulated industries such as healthcare and finance will be subject to those industries' existing rules. In other areas of law, provisions of agency law, corporate law, contract law, criminal law, tort law, property law, and insurance law will also play significant roles in cases involving AI agent systems. However, even where a particular law clearly speaks to an issue, details of its application still need to be worked out. Within these existing frameworks, experts discussed factors specific to AI agents, including mental state and legal personhood. Overall, effective regulation of AI agents will require creatively applying existing legal principles while also developing new legal concepts.
5. Conclusion
Regarding the prospects of AI agents over the coming years, although many questions remain open, experts drew several key conclusions. First, AI agents have attracted significant attention and substantial investment: startups and major tech companies are striving to translate advances in large language models into more agentic systems that can flexibly and autonomously execute user goals, moving beyond their current primary use as chatbots. Second, the definition of AI agents is contested. No clear boundary separates AI systems that are agents from those that are not, but, roughly, more agentic systems can autonomously take direct actions in more open environments and pursue more complex goals. Third, existing large language model-based agents have limited capabilities and often make mistakes. Fourth, if highly sophisticated AI agents are widely deployed, they may exacerbate existing problems and introduce new ones. Fifth, the trajectory of AI agent development is difficult to assess, making it hard to know whether agents will rapidly grow in sophistication and deployment or remain primarily a research topic for the next few years. Sixth, many potential technical protective measures already exist, but implementing them sometimes requires trade-offs between conflicting goals, including visibility versus privacy, security versus performance, and trust versus practicality. Seventh, the legal status of AI agents and the applicability of existing laws remain unsettled and raise serious legal questions; existing concepts may need to be applied creatively and existing laws updated.