Multi-Agent Collaboration Mechanisms: A Review of Large Language Models

With the latest advancements in large language models (LLMs), agentic artificial intelligence (Agentic AI) has made significant progress in real-world applications, moving towards intelligent agents based on multiple large language models that achieve perception, learning, reasoning, and collaborative actions. These multi-agent systems (MASs) based on large language models enable a group of agents to collaborate to solve complex tasks and achieve collective action on a large scale, shifting from isolated models to a cooperation-centric approach.

This article provides a comprehensive overview of collaboration mechanisms in multi-agent systems and proposes a scalable framework to guide future research. Our framework characterizes collaboration mechanisms based on key dimensions: participants (the involved agents), types (e.g., cooperation, competition, or cooperative competition), structures (e.g., peer-to-peer, centralized, or distributed), strategies (e.g., role-based or model-based), and coordination protocols. By reviewing existing methods, our findings lay the groundwork for revealing and advancing the development of multi-agent systems based on large language models towards more intelligent and collaborative solutions, especially in complex real-world applications.

Furthermore, this article explores various applications of multi-agent systems in different fields, including 5G/6G networks, Industry 5.0, question-answering systems, and socio-cultural environments, demonstrating their wide-ranging applications and far-reaching impacts. Finally, we summarize key lessons learned, analyze the open challenges faced by multi-agent systems, and point out potential research directions towards the development of artificial collective intelligence.

CCS Concepts: • General and reference → Surveys and overviews; • Computing methodologies → Multi-agent systems; Natural language generation; Neural networks.

Additional Keywords: Artificial intelligence, large language models, multi-agent collaboration

1 Introduction

1.1 Motivation

Recent advances in large language models (LLMs) have fundamentally transformed artificial intelligence (AI), enabling it to perform complex tasks such as creative writing, reasoning, and decision-making at levels comparable to humans [156]. Although these models exhibit significant capabilities when used individually, they still face inherent limitations, such as hallucination [57], the constraints of autoregressive generation (e.g., an inability to perform deep reasoning [49]), and scalability issues [55, 69]. To address these challenges, agentic artificial intelligence (Agentic AI) uses LLMs as brains or coordinators and combines them with external tools and tasks such as planning, enabling LLM-based agents to take actions, solve complex problems, and interact with the external environment.

Moreover, researchers are increasingly exploring horizontal scaling, in which multiple LLM-based agents collaborate to achieve collective intelligence. This approach aligns with research on multi-agent systems (MASs) and collaborative AI, which focuses on enabling a group of agents to coordinate, share knowledge, and jointly solve problems. The convergence of these fields has given rise to LLM-based multi-agent systems, which leverage the collective intelligence of multiple LLMs to tackle complex multi-step challenges [118]. The inspiration for MASs comes not only from technological advancements but also from human collective intelligence (e.g., Society of Mind [87], Theory of Mind [45]). Human societies excel at achieving shared goals through teamwork and specialization, as is evident in both everyday tasks and scientific discoveries. Similarly, MASs aim to emulate these principles, allowing AI agents to collaborate effectively by combining their respective strengths and perspectives.

LLM-based MASs can collaborate through different channels, each characterized by distinct features, as shown in Figure 1. MASs have achieved remarkable success across various domains by enhancing the capabilities of individual LLMs through collaboration and coordination among agents. These systems assign tasks to agents, allowing them to share knowledge, perform subtasks, and align their efforts with shared goals. The potential advantages of MASs are transformative. They excel in knowledge retention, enabling distributed agents to maintain and share diverse knowledge bases without overloading a single system [51, 154]. They enhance long-term planning by allocating tasks to different agents, supporting continuous problem-solving in ongoing interactions [58]. Furthermore, MASs can achieve more effective generalization by aggregating expertise and prompts/roles from multiple models, solving various problems more efficiently than a single model. Finally, MASs aim to achieve collective intelligence, where the combined capabilities of multiple agents exceed the sum of their individual contributions [24].

A major focus of effective MASs is the collaboration mechanisms [33, 74, 75, 97, 132] that facilitate the transition from traditional isolated models to interaction-emphasizing approaches, enabling agents to connect, negotiate, make decisions, plan, and execute tasks together, thereby advancing AI capabilities in collective environments. A deep understanding of collaboration mechanisms in MASs is key to unlocking their full potential.

1.2 Current Status and Contributions

Given the importance and urgency of LLM-based multi-agent collaborative systems, several review articles have discussed this topic. However, these works often fail to adequately explore the collaborative aspects and mechanisms of LLM-based MASs, which are crucial for enabling agents to collaborate efficiently towards shared goals, as summarized in Table 1. For instance, [47, 107, 136] mainly focus on single-agent systems, mentioning multi-agent collaboration only at a superficial level.

[136] laid the groundwork for LLM agents by proposing a framework consisting of three parts: brain, perception, and action. This work emphasized using LLMs as the brain of agents, integrating techniques such as input modality fusion, prompting, retrieval, and tool usage. However, their discussion of multi-agent collaboration was limited to agent behavior and personality, without delving into the mechanisms that enable agents to work together. [47] reviewed the fields and scenarios of successful applications of LLM-based MASs, outlining the communication structures of these systems (hierarchical, decentralized, centralized, and shared message pools), but did not address other features of collaboration, such as types, strategies, or coordination architectures.

Other works, such as [82], focus on collaborative strategies, classifying them as merging, integration, and cooperation. Although their review discussed how these strategies apply to LLMs and extended collaboration beyond traditional fusion techniques, it overlooked other important collaboration types such as competition and cooperative competition, as well as dimensions beyond the commonly studied collaboration types. In contrast, [120] proposed a general framework for enhancing LLM capabilities, showing how tools like Auto-GPT fit into this framework. However, its collaboration mechanisms remained at a conceptual level, lacking detailed implementation and characterization. The research in [50] focused on configuring LLMs to leverage diverse capabilities and roles, such as integrating memory and information retrieval components. Its exploration of multi-agent collaboration centered primarily on planning and coordination architectures, emphasizing global and local task planning based on agent roles and specialization. Meanwhile, [46] narrowed the focus to the use of LLM-based MASs in agent-based modeling and simulation, discussing challenges such as environmental perception, human alignment, action generation, and evaluation. While these works are insightful for specific simulation applications, they lack a broad perspective on in-depth collaboration mechanisms. Similarly, [68] reviewed the use of these systems in digital twins, while [52, 70] focused on the software engineering domain.

The summary above reveals a significant gap: the collaborative aspects and mechanisms of LLM-based MASs, which are crucial for enabling agents to collaborate effectively towards shared goals, have not been fully explored. This work aims to provide a comprehensive perspective on the foundations of inter-agent collaboration in LLM-based multi-agent collaborative systems. With collaboration as the primary focus, we characterize inter-agent collaboration along the following dimensions: participants (the involved agents), types (such as cooperation, competition, or cooperative competition), structures (such as peer-to-peer, centralized, or distributed), strategies (such as role-based, rule-based, or model-based), and coordination protocols. We emphasize the mechanisms and “tricks” that facilitate effective collaboration, identifying key features and trends in MAS design. By reviewing existing methods and identifying open challenges, we synthesize these findings into a coherent framework. This framework provides a foundation for future research, promoting the integration of LLMs in MASs and paving the way for more adaptive, intelligent, and collaborative AI systems capable of addressing complex real-world applications.

Our main contributions are as follows:

Collaboration Aspects and Mechanisms in LLM-Based MASs: We focus on the operational mechanisms of LLM-based multi-agent collaboration, emphasizing the “tricks” necessary for achieving effective collaboration, including collaboration types, strategies, communication structures, and coordination architectures.
General Framework for LLM-Based MASs: We propose a comprehensive framework that integrates various features of MASs, helping researchers understand, design, and develop multi-agent collaborative systems.
Review of Real-World Applications: We examine the practical applications of LLM-based MASs in different fields, highlighting success stories and limitations.
Discussion of Lessons Learned and Open Issues: We identify key challenges in the development of MASs, such as collective reasoning and decision-making, and outline potential research directions for addressing these challenges.

1.3 Paper Structure

The organization of this paper is as follows. Section 2 provides background information necessary for understanding this research, including an overview of LLMs, MASs, and collaborative AI. Section 3 introduces the foundational concepts of LLM-based multi-agent collaborative systems, emphasizing the key role of collaboration through mathematical notation. Next, Section 4 provides a comprehensive review of LLM-based multi-agent collaborative systems, categorizing them based on key collaboration features, including types, strategies, structures, and coordination and orchestration. Section 5 reviews the key applications of LLM-based multi-agent collaborative systems in industrial and academic fields. Section 6 discusses open issues and potential future research directions in this emerging research area. Finally, Section 7 concludes the paper.

2 Background Knowledge

2.1 Multi-Agent Systems (MAS)

Multi-agent systems (MAS) are computerized systems composed of multiple interacting agents. These agents possess autonomy, enabling them to perceive the environment, interact with other agents, and collaboratively solve complex tasks. Key components of MAS include:

Agents: The core participants in the system, possessing roles, capabilities, behavioral patterns, and knowledge models. The capabilities of agents include learning, planning, reasoning, and decision-making, which endow the agents and the entire system with intelligence.
Environment: The external world in which agents operate, allowing agents to perceive and act upon it. The environment can be simulated or physical spaces, such as factories, roads, power grids, etc.
Interaction: Agents communicate through standard agent communication languages. Interactions among agents include collaboration, coordination, negotiation, etc., depending on the system’s requirements.
Organization: Agents can be controlled hierarchically or organized based on emergent behaviors.

Notable features of MAS include flexibility, reliability, self-organization, and real-time operation, making them effective solutions for solving complex tasks. By decomposing complex tasks into multiple subtasks, each handled by different agents, MAS can complete tasks at lower costs and with higher reliability.

2.2 Large Language Models (LLMs)

Large language models (LLMs) are deep learning models based on the transformer architecture, such as OpenAI’s GPT, Meta’s LLaMA, and Google’s Gemini series. Trained on large-scale text corpora, these models acquire strong language understanding and generation capabilities. Their scale enables them to handle tasks they were not explicitly trained on, for example through analogical reasoning and zero-shot learning.

Although LLMs perform excellently in single-agent tasks, the complexity of coordination, communication, and decision-making increases in multi-agent environments, making the limitations of LLMs more apparent. For example, the cascading hallucination problem is particularly prominent in multi-agent interactions, where an erroneous output may lead to a chain of erroneous responses. However, by introducing structured workflows and consensus mechanisms, researchers are exploring ways to overcome these challenges, enabling LLMs to play a larger role in multi-agent environments.
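As one possible illustration of a consensus mechanism, the sketch below applies a simple majority vote over the answers of several agents, so that a single erroneous output is less likely to propagate. The function name majority_consensus, the min_agreement threshold, and the sample answers are assumptions made for this example; they do not come from any specific framework discussed in the survey.

```python
from collections import Counter


def majority_consensus(answers, min_agreement=0.5):
    """Return the most common answer if its vote share exceeds min_agreement, else None."""
    if not answers:
        return None
    # Normalize lightly so that trivial formatting differences do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer if votes / len(normalized) > min_agreement else None


# Hypothetical outputs from three agents; one of them disagrees.
consensus = majority_consensus(["Paris", "paris ", "Lyon"])
# A tie carries no strict majority, so no answer is passed downstream.
split = majority_consensus(["A", "B"])
```

Returning None on a tie lets the caller decide what to do next (for example, ask for another round), rather than forwarding an unverified answer to other agents.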

2.3 Collaborative Artificial Intelligence (Collaborative AI)

Collaborative artificial intelligence (Collaborative AI) refers to AI systems designed to collaborate with other AI agents or humans. Research directions in collaborative AI include multi-agent systems, human-computer interaction, game theory, and natural language processing. By integrating these technologies, collaborative AI has the potential to drive new applications with profound economic and social impacts.

Collaboration is key to enabling AI agents to interact and cooperate with each other. Collaboration mechanisms extend beyond simple cooperation to include advanced mechanisms such as competition and cooperative competition. Collaborative AI promotes the transition from traditional isolated models to collaboration-emphasizing approaches, enabling agents to interact, negotiate, make decisions, plan, and act together, thereby advancing AI capabilities in collective environments.

3 Multi-Agent Collaboration Concepts

3.1 Definitions of Agents and Collaboration Systems

In LLM multi-agent collaboration systems, agents can be mathematically represented as a = {m, o, e, x, y}, where:

Model m: AI model, including its architecture (arch), specific memory (mem) of the agent, and optional adapters (adp). Adapters are adaptive intelligent modules that allow agents to acquire additional knowledge from other agents through mechanisms such as speculative decoding and parameter-efficient adapters.
Objective o: The goal or task of the agent, guiding its behavior in the system.
Environment e: The environment or context in which the agent operates, typically limited by the number of tokens in the context window.
Input x: The input perception of the agent, such as text or sensor data.
Output y: The corresponding action or output of the agent, defined as y = m(o, e, x).
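The agent tuple can be sketched as a small data structure. This is a minimal illustration only: the class name Agent, the echo_model stand-in for an LLM call, and the item-count limit used in place of a real token limit are all assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    model: Callable[[str, str, str], str]                  # m: stand-in for an LLM call
    objective: str                                         # o: goal guiding the agent
    environment: List[str] = field(default_factory=list)  # e: running context
    max_context_items: int = 4  # assumption: item count stands in for a token limit

    def act(self, x: str) -> str:
        """Compute y = m(o, e, x) and record the exchange in the context."""
        context = "\n".join(self.environment[-self.max_context_items:])
        y = self.model(self.objective, context, x)
        self.environment.append(f"in: {x} | out: {y}")
        return y


def echo_model(o: str, e: str, x: str) -> str:
    """Hypothetical model that tags the input with the objective."""
    return f"[{o}] {x.upper()}"


agent = Agent(model=echo_model, objective="summarize")
y = agent.act("hello")
```

Only the most recent context items are passed to the model, mirroring the bounded context window mentioned in the definition of e.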

3.2 Problem Definition

In LLM-driven MAS, collaboration among agents is crucial. Each collaboration has a communication channel c, and collaboration includes:

Task allocation: Assigning tasks to multiple agents based on their unique expertise and resources.
Collaboration mechanisms: Defining how agents interact so that they can work together.
Decision-making: Reaching decisions among agents to achieve the ultimate goal.

The collaboration channels are characterized by specific attributes, including participants (the involved agents), types (such as cooperation, competition, or cooperative competition), structures (such as peer-to-peer, centralized, or distributed), and strategies (such as role-based, rule-based, or model-based). Any differences in attributes lead to different collaboration channels.
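One way to read the last statement is that a channel is identified by the combination of its attributes. The sketch below models this with an immutable record whose equality is attribute-based; the class name CollaborationChannel, the field names, and the agent names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CollaborationChannel:
    participants: FrozenSet[str]  # the involved agents
    kind: str                     # cooperation, competition, or cooperative competition
    structure: str                # peer-to-peer, centralized, or distributed
    strategy: str                 # role-based, rule-based, or model-based


team = frozenset({"planner", "coder"})
c1 = CollaborationChannel(team, "cooperation", "centralized", "role-based")
# Same attributes (participant order is irrelevant): the same channel.
c2 = CollaborationChannel(frozenset({"coder", "planner"}), "cooperation", "centralized", "role-based")
# One attribute differs: a different channel.
c3 = CollaborationChannel(team, "competition", "centralized", "role-based")

distinct_channels = {c1, c2, c3}
```

Because the record is frozen, it is hashable, so a set keeps exactly one entry per distinct attribute combination.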

4 Methodology

4.1 Collaboration Types

4.1.1 Cooperation

Cooperation is the most common type of collaboration in LLM multi-agent systems. When agents align their individual goals o_i with a shared collective goal O_collab, they work together to achieve mutually beneficial outcomes. Cooperation is particularly important in tasks requiring collaborative problem-solving, collective decision-making, and complementary skills.

4.1.2 Competition

Competition occurs when individual goals o_i conflict with the goals of other agents or when resources are limited. Competition can drive agents to develop more advanced reasoning capabilities and creative problem-solving skills, enhancing the adaptability of the system.

4.1.3 Cooperative Competition

Cooperative competition is a combination of cooperation and competition, where agents collaborate on certain tasks while competing on others. Cooperative competition mechanisms are particularly effective in scenarios requiring trade-offs and compromises.

4.2 Collaboration Strategies

4.2.1 Rule-Based Protocols

Rule-based protocols strictly control interactions among agents through predefined rules, ensuring that agents coordinate actions within the constraints of the system. This strategy is highly effective in situations where task procedures are clear and changes are limited, but it lacks adaptability in the face of unexpected situations.
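A rule-based protocol can be sketched as a table of allowed message transitions that every interaction must follow. The message types (propose, review, revise, approve) and the function run_protocol are hypothetical names chosen for this example, not part of any system covered by the survey.

```python
# Predefined rules: which message type may follow which.
ALLOWED_TRANSITIONS = {
    "propose": {"review"},
    "review": {"revise", "approve"},
    "revise": {"review"},
    "approve": set(),
}


def run_protocol(messages, start="propose"):
    """Accept messages while they follow the rules; return (accepted, first_rejected)."""
    if not messages or messages[0] != start:
        return [], (messages[0] if messages else None)
    accepted = [messages[0]]
    for msg in messages[1:]:
        if msg not in ALLOWED_TRANSITIONS.get(accepted[-1], set()):
            return accepted, msg  # rule violation: stop here
        accepted.append(msg)
    return accepted, None


ok_trace, ok_rejected = run_protocol(["propose", "review", "revise", "review", "approve"])
# Skipping the review step is not covered by the rules and is rejected.
bad_trace, bad_rejected = run_protocol(["propose", "approve"])
```

The second call shows the trade-off described above: the rules make behavior predictable, but any sequence the designer did not anticipate is simply refused.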

4.2.2 Role-Based Protocols

Role-based protocols enable agents to focus on subtasks within their areas of expertise by assigning specific roles or divisions of labor to each agent. This strategy enhances the efficiency and structure of the system but may lead to conflicts or functional defects when role definitions are unclear.
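The following sketch illustrates role-based assignment under simple assumptions: each role declares the subtasks it covers, and a subtask is assigned only when exactly one role owns it. The role names, the skill labels, and the function assign_subtasks are hypothetical.

```python
# Hypothetical role definitions: role name -> subtasks the role is responsible for.
ROLE_SKILLS = {
    "planner": {"decompose"},
    "coder": {"implement", "debug"},
    "tester": {"test"},
}


def assign_subtasks(subtasks, role_skills=ROLE_SKILLS):
    """Map each subtask to its owning role; collect subtasks with no clear owner."""
    assignments, unassigned = {}, []
    for task in subtasks:
        owners = [role for role, skills in role_skills.items() if task in skills]
        if len(owners) == 1:
            assignments[task] = owners[0]
        else:
            unassigned.append(task)  # no owner, or overlapping role definitions
    return assignments, unassigned


assignments, unassigned = assign_subtasks(["decompose", "implement", "test", "deploy"])
```

The unassigned list makes the weakness noted above visible: a subtask that no role covers (here, deploy), or that several roles claim, falls through the division of labor.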

4.2.3 Model-Based Protocols

Model-based protocols provide flexibility in decision-making in environments where input perceptions are uncertain. Agents make probabilistic decisions based on their perceptions of the environment, shared goals, and inherent uncertainties. This strategy excels in dynamic environments but is complex to design and deploy, with higher computational costs.
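One simple way to make a probabilistic decision is to choose the action with the highest expected utility under the agent's current belief about an uncertain state. The sketch below assumes this formulation; the states, actions, and utility values are invented for illustration, and real model-based protocols may use quite different decision models.

```python
def expected_utility(action, belief, utility):
    """Average the utility of an action over the agent's belief about the state."""
    return sum(p * utility[(action, state)] for state, p in belief.items())


def choose_action(actions, belief, utility):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, belief, utility))


# Hypothetical scenario: should an agent wait for a peer or take over its subtask?
utility = {
    ("wait", "peer_finished"): 10, ("wait", "peer_stalled"): -5,
    ("take_over", "peer_finished"): 2, ("take_over", "peer_stalled"): 8,
}
actions = ["wait", "take_over"]

confident = {"peer_finished": 0.7, "peer_stalled": 0.3}
doubtful = {"peer_finished": 0.3, "peer_stalled": 0.7}

best_when_confident = choose_action(actions, confident, utility)
best_when_doubtful = choose_action(actions, doubtful, utility)
```

The same agent chooses differently as its belief shifts, which is the flexibility described above; the cost is that beliefs and utilities have to be modeled and evaluated for every decision.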

4.3 Communication Structures

4.3.1 Centralized Structure

In a centralized structure, all agents are connected to a central agent, which is responsible for managing and coordinating interactions among agents. Centralized structures have advantages in resource allocation and task coordination, but the failure of the central node can lead to the collapse of the entire system.

4.3.2 Distributed Structure

In a distributed structure, control and decision-making are distributed among multiple agents, each operating based on local information and limited communication. Distributed structures have advantages in system fault tolerance and scalability, but resource allocation efficiency is lower, and communication overhead is higher.
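The trade-off between the centralized and distributed structures can be made concrete with a small graph sketch. It assumes a star topology for the centralized case and a fully connected mesh for the distributed case, which is only one possible reading of "distributed"; all function and agent names are hypothetical.

```python
def centralized(agents):
    """Star topology: the first agent acts as the central coordinator."""
    hub, *others = agents
    return {(hub, a) for a in others}


def fully_distributed(agents):
    """Mesh topology: every pair of agents shares a direct link."""
    return {(a, b) for i, a in enumerate(agents) for b in agents[i + 1:]}


def still_connected(agents, edges, failed):
    """Check whether the surviving agents can still reach each other."""
    alive = [a for a in agents if a != failed]
    live_edges = [(a, b) for a, b in edges if failed not in (a, b)]
    if not alive:
        return True
    seen, stack = {alive[0]}, [alive[0]]
    while stack:
        node = stack.pop()
        for a, b in live_edges:
            nxt = b if a == node else a if b == node else None
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(alive)


agents = ["a0", "a1", "a2", "a3"]
star = centralized(agents)        # n - 1 links
mesh = fully_distributed(agents)  # n * (n - 1) / 2 links

star_survives_hub_failure = still_connected(agents, star, "a0")
mesh_survives_any_failure = all(still_connected(agents, mesh, a) for a in agents)
```

With four agents the star needs 3 links and the mesh 6, which hints at the higher communication overhead of the distributed case, while only the mesh stays connected when any single agent fails.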

4.3.3 Hierarchical Structure

In a hierarchical structure, agents are organized in layers, with each layer of agents having different functions and authorities. Hierarchical structures have advantages in task allocation and resource management, but failures of edge devices can lead to system failures.

4.4 Coordination and Orchestration

4.4.1 Static Architecture

Static architectures rely on domain knowledge and predefined rules to establish collaboration channels. This architecture has advantages in task execution consistency and leveraging domain knowledge, but lacks flexibility in the face of dynamic environments.

4.4.2 Dynamic Architecture

Dynamic architectures adapt to changing environments and task requirements by using management agents or adaptive mechanisms to allocate roles and define collaboration channels in real time. Dynamic architectures excel at handling complex and dynamic tasks but have higher resource usage, and dynamic adjustments may fail.
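A minimal sketch of runtime role allocation, assuming a greedy manager that gives each task to the least-loaded capable agent and re-plans when the set of agents changes. The function assign_roles, the agent names, and the skill labels are hypothetical and do not describe any specific orchestration framework.

```python
def assign_roles(tasks, agents):
    """Greedy manager: give each task to the capable agent with the lowest load."""
    load = {name: 0 for name in agents}
    plan = {}
    for task, skill in tasks:
        capable = [n for n, skills in agents.items() if skill in skills]
        if not capable:
            plan[task] = None  # the adjustment fails for this task
            continue
        chosen = min(capable, key=lambda n: (load[n], n))  # ties broken by name
        plan[task] = chosen
        load[chosen] += 1
    return plan


agents = {"a1": {"code", "test"}, "a2": {"code"}, "a3": {"write"}}
tasks = [("t1", "code"), ("t2", "code"), ("t3", "test"), ("t4", "translate")]

plan = assign_roles(tasks, agents)

# Re-plan at runtime after agent a1 becomes unavailable.
remaining = {name: skills for name, skills in agents.items() if name != "a1"}
replan = assign_roles(tasks, remaining)
```

Re-planning keeps the coding tasks running on a2, but t3 is left without an owner, which illustrates how a dynamic adjustment can fail when no remaining agent has the required capability.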

5 Practical Applications

5.1 5G/6G Networks and Industry 5.0

The application of LLMs in 5G/6G networks and Industry 5.0 significantly enhances the performance of edge networks. For example, the LLM-SC framework utilizes LLMs to model the semantic information of text, designing a semantic communication system based on LLMs that achieves a balance between semantic-level and technical-level performance.

5.2 Question Answering and Natural Language Generation (QA/NLG)

The application of LLMs in question answering and natural language generation tasks significantly enhances the capabilities of systems. For instance, OpenAI’s Swarm framework introduces a “handoff” mechanism that allows multiple agents to collaborate seamlessly, improving system efficiency and adaptability.

5.3 Social and Cultural Domains

The application of LLMs and MASs in social and cultural domains demonstrates the potential of these systems to simulate human behavior, social dynamics, and cultural interactions. For example, the CulturePark framework promotes cross-cultural understanding and reduces bias by simulating cross-cultural interactions.

6 Open Issues and Discussion

6.1 Achieving Collective Intelligence

Achieving collective intelligence requires addressing multiple open challenges, including unified governance, shared decision-making, the design of agents as digital species, scalability and resource management, and discovering and exploring unexpected generalization capabilities.

6.2 Comprehensive Evaluation and Benchmarking

Evaluating the performance and behavior of MASs is more complex than evaluating individual LLMs. A unified and comprehensive benchmarking framework needs to be established to ensure the repeatability and consistency of evaluation results.

6.3 Ethical Risks and Safety

The deployment of LLMs in multi-agent systems may amplify the risks of hallucinations and adversarial attacks. Ensuring that agents operate within ethical boundaries and preventing harmful behaviors is key to ensuring the safety and reliability of systems.

7 Conclusion

Through a comprehensive review of the collaboration mechanisms of LLM-based multi-agent systems, we propose a structured and scalable framework to guide future research. Our framework is based on five key dimensions: participants, types, structures, strategies, and coordination protocols, providing a systematic approach to analyzing and designing collaborative interactions among LLM-based agents. We believe this work will inspire future research and drive MASs towards more intelligent and collaborative solutions.
