Source: ZHUAN ZHI
This article is approximately 6,000 words long, with an estimated reading time of 12 minutes.
This article provides a comprehensive overview of collaboration in multi-agent systems and proposes a scalable framework to guide future research.
With the latest advancements in large language models (LLMs), Agentic Artificial Intelligence (Agentic AI) has made significant progress in real-world applications, moving towards systems of multiple LLM-based agents that perceive, learn, reason, and act collaboratively. These LLM-based multi-agent systems (MASs) enable a group of agents to collaborate on complex tasks and act collectively at scale, shifting from isolated models to a collaboration-centric approach.
This article provides a comprehensive overview of collaboration in multi-agent systems and proposes a scalable framework to guide future research. Our framework characterizes collaboration mechanisms based on key dimensions: participants (the involved agents), types (e.g., cooperation, competition, or co-opetition), structures (e.g., peer-to-peer, centralized, or distributed), strategies (e.g., role-based or model-based), and coordination protocols. By reviewing existing methods, our findings lay the foundation for revealing and advancing the development of more intelligent and collaborative solutions for large language model-based multi-agent systems, especially in complex real-world applications.
Furthermore, this article explores various applications of multi-agent systems across different fields, including 5G/6G networks, Industry 5.0, question-answering systems, and socio-cultural environments, demonstrating their broad applications and far-reaching impacts. Finally, we summarize key lessons learned, analyze open challenges faced by multi-agent systems, and point out potential research directions towards artificial collective intelligence.
CCS Concepts:
- General and Reference → Surveys and Overviews;
- Computing Methodologies → Multi-Agent Systems; Natural Language Generation; Neural Networks.
Additional Keywords: Artificial Intelligence, Large Language Models, Multi-Agent Collaboration
1 Introduction
1.1 Motivation
The latest advancements in large language models (LLMs) have revolutionized artificial intelligence (AI), enabling it to perform complex tasks such as creative writing, reasoning, and decision-making at a human level [156]. While these models exhibit significant capabilities when used in isolation, they still face inherent limitations, such as the hallucination problem [57], autoregressive characteristics (e.g., inability to perform deep reasoning [49]), and scaling laws [55, 69]. To address these challenges, Agentic AI leverages LLMs as brains or coordinators, integrating them with external tools and tasks (such as planning), allowing LLM-based agents to take actions, solve complex problems, and interact with the external environment.
Moreover, researchers are increasingly exploring horizontal scaling—utilizing multiple LLM-based agents to achieve collective intelligence. This approach aligns with the research directions of multi-agent systems (MASs) and collaborative AI, which focus on enabling a group of agents to coordinate, share knowledge, and solve problems together. The convergence of these fields has given rise to LLM-based multi-agent systems, leveraging the collective intelligence of multiple LLMs to tackle complex multi-step challenges [118]. The inspiration for MASs comes not only from technological advancements but also from human collective intelligence (e.g., social mind [87], theory of mind [45]). Human societies excel at achieving shared goals through teamwork and specialization, which can be reflected in tasks ranging from everyday operations to scientific discoveries. Similarly, MASs aim to simulate these principles, enabling AI agents to collaborate effectively by combining their respective strengths and perspectives.
LLM-based MASs can collaborate through different channels that have distinct characteristics, as shown in Figure 1. MASs have achieved significant success across various domains by enhancing the capabilities of individual LLMs through collaboration and coordination among agents. These systems allocate tasks to agents, allowing them to share knowledge, execute subtasks, and align their efforts with shared goals. The potential advantages of MASs are transformative. They excel at knowledge retention, enabling distributed agents to maintain and share diverse knowledge bases without overloading a single system [51, 154]. They enhance long-term planning by distributing tasks among different agents, supporting continuous problem-solving in ongoing interactions [58]. Furthermore, MASs can achieve more effective generalization by aggregating expertise and prompts/roles from multiple models, solving a variety of problems more efficiently than a single model. Finally, MASs improve interaction efficiency by managing subtasks simultaneously through specialized agents, accelerating the resolution of complex multi-step tasks. MASs strive to achieve collective intelligence, where the combined capabilities of multiple agents exceed the sum of their individual contributions [24].
A major focus of effective MASs is the collaboration mechanisms [33, 74, 75, 97, 132] that facilitate the transition from traditional isolated models to interaction-focused approaches, enabling agents to connect, negotiate, make decisions, plan, and execute tasks together, thereby advancing AI capabilities in collective environments. A deep understanding of collaboration mechanisms in MASs is key to unlocking their full potential.
1.2 Current Status and Contributions
Due to the significance and urgent need for LLM-based multi-agent collaboration systems, some review articles have discussed this topic. However, these works often fail to adequately explore the collaborative aspects and mechanisms of LLM-based MASs, which are crucial for enabling agents to collaborate efficiently towards shared goals. A specific summary can be found in Table 1. For instance, [47, 107, 136] primarily focus on single-agent systems and only mention multi-agent collaboration at a superficial level.
[136] laid the groundwork for LLM agents by proposing a framework composed of three parts: brain, perception, and action. This work emphasizes the use of LLMs as the brain of agents, utilizing techniques such as input modality integration, prompting, retrieval, and tool use. However, their discussion of multi-agent collaboration is limited to agent behavior and personality, without delving into the mechanisms that enable agents to work together. [47] reviewed the fields and scenarios of successful LLM-based MAS applications, outlining the communication structures of these systems (hierarchical, decentralized, centralized, and shared message pools), but did not address other features of collaboration, such as types, strategies, or coordination architectures.
Other works, such as [82], focus on collaboration strategies, categorizing them as merging, integrating, and cooperating. Although this review discusses how these strategies apply to LLMs and extends cooperation beyond traditional fusion techniques, it overlooks other important collaboration mechanisms such as competition and co-opetition, as well as dimensions that go beyond popular collaboration types. In contrast, [120] proposed a general framework for enhancing LLM capabilities, demonstrating how tools like Auto-GPT interface with this framework. However, its collaboration mechanisms remain at a conceptual level, lacking detailed implementation and characterization. The research in [50] focuses on configuring LLMs to leverage diverse capabilities and roles, such as integrating memory and information retrieval components. Its exploration of multi-agent collaboration mainly centers on planning and coordination architectures, emphasizing global and local task planning based on agent roles and specialization. Similarly, the work in [46] narrowed its focus to the application of LLM-based MAS in agent-based modeling and simulation, discussing challenges related to environmental perception, human alignment, action generation, and evaluation. While these works provide insights for simulation-specific applications, they lack a broad perspective on deep collaboration mechanisms. Likewise, [68] reviewed the use of these systems in digital twin applications, while [52, 70] focused on the software engineering domain.
From the above summary and explanation, it is evident that there are significant gaps in thoroughly exploring the collaborative aspects and mechanisms of LLM-based MASs, which are crucial for enabling agents to collaborate effectively towards shared goals. This work aims to provide a comprehensive perspective on the foundations of inter-agent collaboration in LLM-based multi-agent collaboration systems. With collaboration as the main focus, our research characterizes inter-agent collaboration in LLM-based agents as: participants (the involved agents), types (such as cooperation, competition, or co-opetition), structures (such as peer-to-peer, centralized, or distributed), strategies (such as role-based, rule-based, or model-based), and coordination layers. We emphasize the mechanisms and “tricks” that facilitate effective collaboration, identifying key features and trends in MAS design. By reviewing existing methods and identifying open challenges, we synthesize these findings into a coherent framework. This framework lays the foundation for future research, promoting the integration of LLMs in MASs, paving the way for more adaptive, intelligent, and collaborative AI systems capable of tackling complex real-world applications.
Our main contributions are as follows:
- Collaboration aspects and mechanisms in LLM-based MAS: We focus on the operational mechanisms of multi-agent collaboration based on LLMs, emphasizing the “tricks” necessary for achieving effective collaboration, including collaboration types, strategies, communication structures, and coordination architectures.
- A general framework for LLM-based MAS: We propose a comprehensive framework that integrates various features of MAS, helping researchers understand, design, and develop multi-agent collaboration systems.
- A review of real-world applications: We examine how LLM-based MAS are applied in practice across different fields, highlighting success stories and limitations.
- Discussion of lessons learned and open issues: We identify key challenges in the development of MASs, such as collective reasoning and decision-making, and outline potential research directions to address these challenges.
1.3 Structure of the Paper
The organization of this paper is as follows. Section 2 provides background information necessary for understanding this research, including an overview of LLMs, MASs, and collaborative AI. Section 3 introduces the foundational concepts of LLM-based multi-agent collaboration systems, emphasizing the key role of collaboration through mathematical notation. Next, Section 4 provides a comprehensive review of LLM-based multi-agent collaboration systems, categorizing them based on key collaborative features, including types, strategies, structures, and coordination and orchestration. Section 5 reviews the key applications of LLM-based multi-agent collaboration systems in industrial and academic fields. Section 6 discusses open issues and potential future research directions in this emerging research area. Finally, Section 7 concludes this paper.
2 Background Knowledge
2.1 Multi-Agent Systems (MAS)
Multi-Agent Systems (MAS) are computerized systems composed of multiple interacting agents. These agents possess autonomy, allowing them to perceive the environment, interact with other agents, and collaboratively solve complex tasks. The key components of MAS include:
- Agents: The core participants in the system, possessing roles, capabilities, behavioral patterns, and knowledge models. The capabilities of agents include learning, planning, reasoning, and decision-making, which endow the agents and the entire system with intelligence.
- Environment: The external world in which agents operate, which they can perceive and act upon. The environment can be a simulated or physical space, such as a factory, a road, or a power grid.
- Interaction: Agents communicate through a standard agent communication language. Agent interactions include cooperation, coordination, negotiation, etc., depending on the system’s requirements.
- Organization: Agents can be controlled hierarchically or organized based on emergent behaviors.
The notable features of MAS include flexibility, reliability, self-organization, and real-time operation, making them effective solutions for solving complex tasks. By breaking down complex tasks into multiple subtasks, each handled by different agents, MAS can complete tasks with lower costs and higher reliability.
2.2 Large Language Models (LLMs)
Large Language Models (LLMs) are deep learning models based on the Transformer architecture, such as OpenAI’s GPT, Meta’s LLaMA, and Google’s Gemini series. These models acquire strong language understanding and generation capabilities by being trained on large-scale text corpora. The scale of LLMs enables them to tackle tasks they have not been explicitly trained for, such as analogical reasoning and zero-shot learning.
Although LLMs perform exceptionally well in single-agent tasks, the complexity of coordination, communication, and decision-making increases in multi-agent environments, making the limitations of LLMs more apparent. For example, the cascading hallucination problem is particularly pronounced in multi-agent interactions, where one erroneous output can lead to a chain of erroneous reactions. However, researchers are exploring how to overcome these challenges by introducing structured workflows and consensus mechanisms, enabling LLMs to play a more significant role in multi-agent environments.
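As an illustration of the consensus mechanisms mentioned above, the following minimal sketch accepts an answer only when a strict majority of agents agree on it, so a single hallucinated output cannot propagate on its own. The function name `consensus`, the `quorum` parameter, and the exact-string comparison of answers are assumptions made for this example, not something specified in the survey.

```python
from collections import Counter
from typing import List, Optional


def consensus(answers: List[str], quorum: float = 0.5) -> Optional[str]:
    """Return the most common answer if its vote share exceeds `quorum`.

    Returns None when no answer has enough support, which signals that the
    agents should retry or escalate instead of passing on a doubtful result.
    """
    if not answers:
        return None
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / len(answers) > quorum else None


# Three hypothetical agents answer the same question; one of them is wrong.
agreed = consensus(["Paris", "Paris", "Lyon"])
# Two agents disagree, so no answer reaches a strict majority.
undecided = consensus(["Paris", "Lyon"])
```

In practice, free-form LLM outputs rarely match character for character, so a real system would likely need to normalize or semantically compare answers before voting.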
2.3 Collaborative AI
Collaborative AI refers to AI systems designed to collaborate with other AI agents or humans. Research directions in collaborative AI include multi-agent systems, human-computer interaction, game theory, and natural language processing. By integrating these technologies, collaborative AI has the potential to drive the development of new applications with profound economic and social impacts.
Collaboration is key to enabling AI agents to interact and cooperate with each other. Collaboration mechanisms are not limited to simple cooperation but also include advanced mechanisms such as competition and co-opetition. Collaborative AI promotes a shift from traditional isolated models to interaction-focused approaches, enabling agents to interact, negotiate, make decisions, plan, and act together, thereby enhancing AI capabilities in collective environments.
3 Concepts of Multi-Agent Collaboration
3.1 Agents and Collaborative Systems
In LLM multi-agent collaboration systems, an agent can be mathematically represented as $a = \{m, o, e, x, y\}$, where:
- Model $m$: The AI model, including its architecture (arch), specific memory (mem) of the agent, and optional adapters (adp). Adapters are adaptive intelligent modules that allow agents to acquire additional knowledge from other agents through mechanisms such as inference decoding and parameter-efficient adapters.
- Objective $o$: The goal or task of the agent, guiding its behavior within the system.
- Environment $e$: The environment or context in which the agent operates, typically limited by the number of tokens in the context window.
- Input $x$: The input perception of the agent, such as text or sensor data.
- Output $y$: The corresponding action or output of the agent, defined as $y = m(o, e, x)$.
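The agent components listed above (m, o, e, x, y) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Agent`, the `max_context` cap (counted in entries rather than tokens), and the stand-in `echo_model` are hypothetical, and a real system would call an LLM where the template string is built.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    """One agent with model m, objective o, and environment e."""
    model: Callable[[str, List[str], str], str]            # m: maps (o, e, x) to y
    objective: str                                         # o: goal or task
    environment: List[str] = field(default_factory=list)   # e: context so far
    max_context: int = 4   # hypothetical cap on stored context entries

    def act(self, x: str) -> str:
        """Compute y = m(o, e, x), then record the exchange in the context."""
        y = self.model(self.objective, self.environment, x)
        self.environment.extend([x, y])
        # Keep only the most recent entries, mimicking a bounded context window.
        self.environment = self.environment[-self.max_context:]
        return y


def echo_model(o: str, e: List[str], x: str) -> str:
    """Stand-in for an LLM call: a deterministic string template."""
    return f"[{o}] seen {len(e)} context items; reply to: {x}"


agent = Agent(model=echo_model, objective="summarize")
first = agent.act("hello")
second = agent.act("world")
```

The output is computed on demand rather than stored, which keeps the sketch close to the functional form of the definition.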
3.2 Problem Definition
In LLM-driven MAS, collaboration among agents is crucial. Each collaboration has a communication channel, and collaboration involves:
- Task allocation: Assigning tasks to multiple agents based on their unique expertise and resources.
- Collaboration mechanisms: Defining the collaboration mechanisms between agents to enable them to work together.
- Decision-making: Reaching decisions among agents to achieve the final goal.
Collaboration channels are characterized by specific attributes, including participants (the involved agents), types (such as cooperation, competition, or co-opetition), structures (such as peer-to-peer, centralized, or distributed), and strategies (such as role-based, rule-based, or model-based). Differences in any attribute can lead to different collaboration channels.
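One way to picture the attributes above is as fields of an immutable record, so that two channels are equal only when every attribute matches. The class and enum names below are hypothetical and chosen for this sketch; the survey does not prescribe a data model.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto
from typing import FrozenSet


class CollabType(Enum):
    COOPERATION = auto()
    COMPETITION = auto()
    COOPETITION = auto()


class Structure(Enum):
    PEER_TO_PEER = auto()
    CENTRALIZED = auto()
    DISTRIBUTED = auto()


class Strategy(Enum):
    ROLE_BASED = auto()
    RULE_BASED = auto()
    MODEL_BASED = auto()


@dataclass(frozen=True)
class Channel:
    """A collaboration channel described by its four attributes."""
    participants: FrozenSet[str]
    collab_type: CollabType
    structure: Structure
    strategy: Strategy


base = Channel(
    participants=frozenset({"planner", "coder"}),
    collab_type=CollabType.COOPERATION,
    structure=Structure.CENTRALIZED,
    strategy=Strategy.ROLE_BASED,
)
# Changing a single attribute yields a different channel.
rival = replace(base, collab_type=CollabType.COMPETITION)
distinct_channels = {base, rival}
```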
4 Methodology
4.1 Types of Collaboration
4.1.1 Cooperation
Cooperation is the most common type of collaboration in LLM multi-agent systems. It occurs when agents align their individual goals with a shared collective goal and work together to achieve it.
4.1.2 Competition
Competition occurs when the individual goals of agents conflict with those of other agents.
4.1.3 Co-opetition
Co-opetition is a combination of cooperation and competition, where agents collaborate on certain tasks while competing on others. Co-opetition mechanisms are particularly effective in scenarios requiring trade-offs and compromises.
4.2 Collaboration Strategies
4.2.1 Rule-based Protocols
Rule-based protocols strictly control interactions between agents through predefined rules, ensuring that agents coordinate actions within the constraints of the system. This strategy is very effective when task procedures are clear and changes are limited but lacks adaptability in the face of unexpected situations.
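A rule-based protocol can be sketched as an ordered list of predicate and action pairs, where the first matching rule decides the response. The message fields (`kind`, `priority`) and the action names are hypothetical and exist only for this example.

```python
from typing import Callable, List, Optional, Tuple

# A rule pairs a predicate over a message with the action to take.
Rule = Tuple[Callable[[dict], bool], str]

RULES: List[Rule] = [
    (lambda m: m.get("kind") == "request" and m.get("priority", 0) >= 5, "escalate"),
    (lambda m: m.get("kind") == "request", "accept"),
    (lambda m: m.get("kind") == "result", "acknowledge"),
]


def decide(message: dict, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the action of the first rule that matches, or None if none does."""
    for predicate, action in rules:
        if predicate(message):
            return action
    # No rule covers this message: the protocol has nothing to say.
    return None


urgent = decide({"kind": "request", "priority": 7})
routine = decide({"kind": "request"})
unknown = decide({"kind": "complaint"})
```

The `None` result for the unforeseen message kind illustrates the limited adaptability noted above: behavior is predictable for covered cases and undefined for everything else.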
4.2.2 Role-based Protocols
Role-based protocols assign specific roles or divisions of labor to each agent, allowing them to focus on subtasks within their area of expertise. This strategy enhances the efficiency and structure of the system but may lead to conflicts or functional defects when role definitions are unclear.
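A minimal sketch of a role-based protocol, assuming a fixed pipeline in which each role transforms the artifact produced by the previous one. The role names and the string-wrapping functions are placeholders for LLM calls made with role-specific prompts.

```python
from typing import Callable, Dict, List

RoleFn = Callable[[str], str]

# Each role handles one subtask; real agents would call an LLM here.
ROLES: Dict[str, RoleFn] = {
    "planner": lambda task: f"plan({task})",
    "coder": lambda plan: f"code({plan})",
    "reviewer": lambda code: f"review({code})",
}


def run_pipeline(task: str, order: List[str]) -> str:
    """Pass the artifact through the roles in the given order."""
    artifact = task
    for role in order:
        if role not in ROLES:
            # An undefined role is a structural gap, not a runtime detail.
            raise KeyError(f"no agent holds role {role!r}")
        artifact = ROLES[role](artifact)
    return artifact


final = run_pipeline("sort a list", ["planner", "coder", "reviewer"])
```

Requesting a role that nobody holds fails immediately, which mirrors the functional defects that unclear role definitions can cause.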
4.2.3 Model-based Protocols
Model-based protocols provide decision-making flexibility in environments where input perceptions are uncertain. Agents make probabilistic decisions based on their perceptions of the environment, shared goals, and inherent uncertainties. This strategy performs excellently in dynamic environments but is complex to design and deploy, with higher computational costs.
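The probabilistic decision-making described above can be illustrated with a Bayesian belief update followed by expected-utility maximization. This is one possible reading, not the survey's own formulation; the states, observations, likelihoods, and utilities are all invented for the example.

```python
from typing import Dict

Belief = Dict[str, float]


def bayes_update(prior: Belief,
                 likelihood: Dict[str, Dict[str, float]],
                 observation: str) -> Belief:
    """Posterior over hidden states after seeing one observation."""
    unnormalized = {s: prior[s] * likelihood[s][observation] for s in prior}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def best_action(belief: Belief, utility: Dict[str, Dict[str, float]]) -> str:
    """Pick the action with the highest expected utility under the belief."""
    expected = {
        action: sum(belief[s] * payoff[s] for s in belief)
        for action, payoff in utility.items()
    }
    return max(expected, key=expected.get)


# Hypothetical scenario: should this agent delegate a subtask to a peer?
prior = {"peer_busy": 0.5, "peer_free": 0.5}
likelihood = {
    "peer_busy": {"slow_reply": 0.8, "fast_reply": 0.2},
    "peer_free": {"slow_reply": 0.2, "fast_reply": 0.8},
}
utility = {
    "delegate": {"peer_busy": -2.0, "peer_free": 5.0},
    "do_it_myself": {"peer_busy": 1.0, "peer_free": 1.0},
}

before = best_action(prior, utility)
posterior = bayes_update(prior, likelihood, "slow_reply")
after = best_action(posterior, utility)
```

With the uniform prior the agent prefers to delegate; after observing a slow reply, the belief that the peer is busy rises and the preferred action changes.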
4.3 Communication Structures
4.3.1 Centralized Structure
In a centralized structure, all agents are connected to a central agent, which is responsible for managing and coordinating interactions among agents. Centralized structures have advantages in resource allocation and task coordination, but the failure of the central node can lead to the collapse of the entire system.
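A centralized structure can be sketched as a hub that routes every task to a worker. The class name `CentralAgent`, the skill keys, and the `alive` flag are assumptions for illustration only.

```python
from typing import Callable, Dict


class CentralAgent:
    """Hub that routes every task to a worker agent by required skill."""

    def __init__(self, workers: Dict[str, Callable[[str], str]]) -> None:
        self.workers = workers
        self.alive = True

    def dispatch(self, skill: str, task: str) -> str:
        # All traffic passes through the hub, so a dead hub blocks everything.
        if not self.alive:
            raise RuntimeError("central agent is down; no task can be routed")
        if skill not in self.workers:
            raise KeyError(f"no worker registered for skill {skill!r}")
        return self.workers[skill](task)


hub = CentralAgent({
    "translate": lambda t: f"translated:{t}",
    "summarize": lambda t: f"summary:{t}",
})
result = hub.dispatch("summarize", "report.txt")
```

Setting `hub.alive = False` makes every later `dispatch` call fail even though the workers themselves are intact, which is the single point of failure described above.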
4.3.2 Distributed Structure
In a distributed structure, control and decision-making are distributed among multiple agents, with each agent operating based on local information and limited communication. Distributed structures have advantages in system fault tolerance and scalability, but resource allocation efficiency may be lower, and communication overhead may be higher.
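To make the idea of operating on local information concrete, the sketch below uses average consensus (a standard gossip-style technique, chosen here as an example rather than taken from the survey). Each agent repeatedly averages its estimate with those of its direct neighbors on a hypothetical four-agent ring, and a counter tracks how many messages this costs.

```python
from typing import Dict, List


def gossip_round(values: List[float], neighbors: Dict[int, List[int]]) -> List[float]:
    """Each agent replaces its value with the mean of itself and its neighbors."""
    updated = []
    for i, own in enumerate(values):
        local = [own] + [values[j] for j in neighbors[i]]
        updated.append(sum(local) / len(local))
    return updated


# Ring of four agents: each one only talks to its two adjacent peers.
ring = {0: [3, 1], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimates = [0.0, 4.0, 8.0, 12.0]
messages = 0
for _ in range(30):
    estimates = gossip_round(estimates, ring)
    # Every agent receives one message from each neighbor per round.
    messages += sum(len(peers) for peers in ring.values())
```

No agent ever sees the full state, yet all estimates approach the global mean of 6.0. The message count grows with both the number of rounds and the number of links, which illustrates the communication overhead mentioned above.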
4.3.3 Hierarchical Structure
In a hierarchical structure, agents are organized by layers, with agents at each layer having different functions and permissions. Hierarchical structures have advantages in task allocation and resource management, but the failure of edge devices may lead to system failure.
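A hierarchical structure can be sketched as a tree in which inner nodes split a task among their children and leaf nodes do the work. The `Node` class, the task-naming scheme, and the `healthy` flag are hypothetical, and this naive version has no retry or fallback logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An agent in the hierarchy; nodes without children are leaf workers."""
    name: str
    children: List["Node"] = field(default_factory=list)
    healthy: bool = True

    def execute(self, task: str) -> List[str]:
        if not self.healthy:
            raise RuntimeError(f"{self.name} failed")
        if not self.children:
            return [f"{self.name} did {task}"]
        results: List[str] = []
        # Inner nodes only delegate: child i receives subtask "<task>.<i>".
        for i, child in enumerate(self.children):
            results.extend(child.execute(f"{task}.{i}"))
        return results


w1, w2, w3 = Node("w1"), Node("w2"), Node("w3")
root = Node("root", [Node("lead_a", [w1, w2]), Node("lead_b", [w3])])
report = root.execute("T")
```

Because the sketch propagates exceptions upward unchanged, marking one leaf as unhealthy makes the whole `root.execute` call fail, matching the failure mode described above for this simple design.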
4.4 Coordination and Orchestration
4.4.1 Static Architecture
Static architectures rely on domain knowledge and predefined rules to establish collaboration channels. This architecture has advantages in consistency of task execution and utilization of domain knowledge but lacks flexibility in dynamic environments.
4.4.2 Dynamic Architecture
Dynamic architectures adapt to changing environments and task requirements by managing agents or using adaptive mechanisms to allocate roles and define collaboration channels in real-time. Dynamic architectures excel at handling complex and dynamic tasks but have higher resource usage, and dynamic adjustments may fail.
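Runtime role allocation can be sketched as a scoring function that is re-evaluated for every incoming task. The scoring rule used here (skill level divided by one plus current load) and the agent pool are assumptions for illustration; the survey does not specify an allocation formula.

```python
from typing import Dict, List, Optional

AgentPool = Dict[str, dict]


def assign(skill: str, agents: AgentPool) -> Optional[str]:
    """Pick the agent with the best load-adjusted score for the skill."""
    candidates = [name for name, a in agents.items() if skill in a["skills"]]
    if not candidates:
        # No agent can take the task: the dynamic adjustment fails.
        return None

    def score(name: str) -> float:
        a = agents[name]
        return a["skills"][skill] / (1 + a["load"])

    chosen = max(candidates, key=score)
    agents[chosen]["load"] += 1
    return chosen


pool: AgentPool = {
    "agent_a": {"skills": {"math": 0.9}, "load": 0},
    "agent_b": {"skills": {"math": 0.6, "code": 0.8}, "load": 0},
}
tasks: List[str] = ["math", "math", "math", "code", "art"]
schedule = [assign(t, pool) for t in tasks]
```

The stronger agent does not receive every "math" task, because its growing load lowers its score. The final "art" task has no capable agent and returns `None`, a small example of an adjustment that fails.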
5 Practical Applications
5.1 5G/6G Networks and Industry 5.0
The application of LLMs in 5G/6G networks and Industry 5.0 significantly enhances the performance of edge networks. For instance, the LLM-SC framework utilizes LLMs to model the semantic information of text, designing a semantic communication system based on LLMs that balances semantic-level and technical-level performance.
5.2 Question Answering and Natural Language Generation (QA/NLG)
The application of LLMs in question-answering and natural language generation tasks significantly enhances the capabilities of systems. For example, OpenAI’s Swarm framework introduces a “handoff” mechanism that allows multiple agents to collaborate seamlessly, improving system efficiency and adaptability.
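The handoff idea can be illustrated in plain Python: an agent either answers or returns a marker naming the agent that should take over. This sketch does not use the Swarm library or reproduce its API; `Handoff`, `AGENTS`, `run`, and the `max_hops` limit are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass
class Handoff:
    """Marker returned by an agent to pass the conversation to another agent."""
    target: str


Reply = Union[str, Handoff]


def triage(message: str) -> Reply:
    if "refund" in message.lower():
        return Handoff("billing")
    return "Triage: how can I help?"


def billing(message: str) -> Reply:
    return "Billing: refund started."


AGENTS: Dict[str, Callable[[str], Reply]] = {"triage": triage, "billing": billing}


def run(message: str, start: str = "triage", max_hops: int = 3) -> Tuple[str, str]:
    """Follow handoffs until some agent produces a final text reply."""
    current = start
    for _ in range(max_hops):
        out = AGENTS[current](message)
        if isinstance(out, Handoff):
            current = out.target
            continue
        return current, out
    raise RuntimeError("too many handoffs")


handled_by, answer = run("I want a refund")
```

The hop limit guards against two agents handing a conversation back and forth indefinitely.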
5.3 Social and Cultural Domains
The application of LLMs and MASs in social and cultural domains demonstrates the potential of these systems to simulate human behavior, social dynamics, and cultural interactions. For instance, the CulturePark framework promotes cross-cultural understanding and reduces bias by simulating cross-cultural interactions.
6 Open Issues and Discussion
6.1 Implementation of Collective Intelligence
Implementing collective intelligence requires addressing several open challenges, including unified governance, shared decision-making, designing agents as digital species, scalability and resource management, and discovering and exploring unexpected generalization capabilities.
6.2 Comprehensive Evaluation and Benchmarking
Evaluating the performance and behavior of MASs is more complex than evaluating individual LLMs. A unified and comprehensive benchmarking framework needs to be established to ensure the reproducibility and consistency of evaluation results.
6.3 Ethical Risks and Safety
The deployment of LLMs in multi-agent systems may amplify the risks of hallucinations and adversarial attacks. Ensuring that agents operate within ethical boundaries and preventing harmful behaviors is crucial for ensuring the safety and reliability of the system.
7 Conclusion
Through a comprehensive review of LLM multi-agent collaboration mechanisms, we propose a structured and scalable framework to guide future research. Our framework is based on five key dimensions: participants, types, structures, strategies, and coordination mechanisms, providing a systematic approach to analyze and design LLM multi-agent collaborative interactions. We believe that this work will inspire future research and promote the advancement of MASs towards more intelligent and collaborative solutions.