Comprehensive Analysis of MetaGPT: An LLM Framework for Software Development

AI agents are widely regarded as the next major direction for large language models. Lilian Weng, head of OpenAI’s safety team, earlier published a detailed blog post on autonomous AI agents that attracted a lot of attention. MetaGPT, released in July, is a new AI agent project: an automated agent framework built on GPT-4 and focused on software development. It can be thought of as a small team of product managers, system designers, and programmers that generates a complete code project directly from an initial requirement. This article introduces the project and analyzes how it is implemented.


  • Introduction to MetaGPT

  • Implementation Principles of MetaGPT

  • Built-in Tasks and Skills of MetaGPT

  • Operation and Testing Results of MetaGPT

  • Conclusion of MetaGPT

Introduction to MetaGPT

At the end of June, a developer from Shenzhen created the MetaGPT project on GitHub. It is a new large-model framework based on AI agents. Unlike earlier agent projects, it focuses on software development and covers the entire process: starting from your initial requirement, it carries out requirement analysis, system design, code implementation, and code review. One month after its release, it has reached 10,000 stars on GitHub and ranks first on GitHub’s trending list.

According to the demonstration video, you only need to provide a simple initial requirement, and the project uses GPT-4 to produce the final implementation. The following image shows the complete software development team that MetaGPT builds internally through prompts:

[Image: the software development team roles that MetaGPT implements through prompts]

The roles largely match those found in a modern software development team. A team with this full set of roles and processes, however, is unlikely to be a small one. In a reasonably complete software team, an initial requirement passes through requirement analysis, requirement review, system requirement analysis, system design, code implementation, code review, and code testing. Such a process effectively safeguards code quality, but it is long and not very flexible. If a large model can carry out this process, it would clearly benefit both the quality and the efficiency of software development.

Implementation Principles of MetaGPT

Compared with the earlier AutoGPT, MetaGPT’s implementation is somewhat more complex. It defines several roles in the system and equips each with a goal and a prompt template that guide the role in solving its part of the problem. The main roles include:

[Table: the main roles defined in MetaGPT]

From the table above, we can see that MetaGPT turns an initial requirement into a finished result step by step by defining several roles. The entry point is the boss’s requirement, which each role then processes in turn, following the steps described above.

An interesting design choice in MetaGPT is that each role runs in its own “process”. While running, a role waits for input addressed to it. Once it observes such input, it immediately uses the large model to solve the problem according to its goal and returns the result to the system. Other roles then start their own tasks as soon as they observe the input they are listening for. This closely resembles how real development processes and organizations work.
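
The observe-then-act behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MetaGPT’s actual code: the `Message`, `Role`, and `Environment` classes, the message kinds, and the `fake_llm` stand-in are all hypothetical, and a simple single-threaded round loop replaces the separate per-role “process”.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    kind: str      # hypothetical message kind, e.g. "requirement" or "prd"
    content: str


@dataclass
class Role:
    name: str
    watches: str                  # message kind this role reacts to
    produces: str                 # message kind this role emits
    act: Callable[[str], str]     # stand-in for a call to the large model
    seen: int = 0                 # number of messages already observed

    def step(self, history: List[Message]) -> List[Message]:
        """Observe messages not seen yet and act on the relevant ones."""
        new = history[self.seen:]
        self.seen = len(history)
        return [Message(self.produces, self.act(m.content))
                for m in new if m.kind == self.watches]


class Environment:
    """Shared message history that every role can observe."""

    def __init__(self, roles: List[Role]):
        self.roles = roles
        self.history: List[Message] = []

    def publish(self, msg: Message) -> None:
        self.history.append(msg)

    def run(self, max_rounds: int = 5) -> None:
        for _ in range(max_rounds):
            produced: List[Message] = []
            for role in self.roles:
                produced.extend(role.step(self.history))
            if not produced:      # nobody had anything to do: stop early
                break
            for msg in produced:
                self.publish(msg)


def fake_llm(label: str) -> Callable[[str], str]:
    # Assumption: a real system would send a prompt to GPT-4 here.
    return lambda text: f"{label}({text})"


roles = [
    Role("ProductManager", "requirement", "prd", fake_llm("PRD")),
    Role("Architect", "prd", "design", fake_llm("Design")),
    Role("Engineer", "design", "code", fake_llm("Code")),
]
env = Environment(roles)
env.publish(Message("requirement", "snake game"))   # the boss's requirement
env.run()
kinds = [m.kind for m in env.history]
```

Each role only reacts to the kind of message it watches, so the requirement flows from one role to the next without any role calling another directly.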

In addition, to strengthen each role, MetaGPT defines auxiliary roles that help the system complete tasks, such as a search role and a prompt-decomposition role.

Built-in Tasks and Skills of MetaGPT

The analysis above shows that the essence of MetaGPT’s approach is to predefine multiple roles, each with its own goal, inputs, and outputs. Each role acts on the inputs it observes in the environment that are relevant to it.

So what abilities and skills do these roles have? Here is a summary. Note that the skills listed here are defined in separate files, and each role can extend its capabilities by incorporating them:

  • Analyze Codebase: analyze_dep_libs.py

  • Azure Text-to-Speech: azure_tts.py

  • Debug: debug_error.py

  • Design API: design_api.py

  • API Review: design_api_review.py

  • Design Filenames: design_filenames.py

  • Project Management: project_management.py

  • Run Code: run_code.py

  • Search and Summarize: search_and_summarize.py

  • Write Code: write_code.py

  • Write Code Review: write_code_review.py

  • Write Requirement Specification: write_prd.py

  • Write Requirement Specification Review: write_prd_review.py

  • Write Test Cases: write_test.py

In short, each skill above corresponds to one .py file, and each file defines the prompt template for that skill. Each role completes its tasks by processing inputs and producing outputs with the skills it has.
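
To make the idea of “one file, one prompt template” concrete, here is a minimal sketch of what such a skill could look like. It is an assumption for illustration only: the template text, the `WriteCode` class, its methods, and the `echo_llm` stand-in are hypothetical and are not taken from MetaGPT’s write_code.py.

```python
from string import Template

# Hypothetical prompt template; the real wording in MetaGPT will differ.
WRITE_CODE_PROMPT = Template(
    "You are a professional engineer.\n"
    "Design context:\n$design\n"
    "Write the complete contents of the file $filename."
)


class WriteCode:
    """Hypothetical skill: fills a prompt template, then calls an LLM."""

    def __init__(self, llm):
        self.llm = llm  # any callable that maps a prompt string to a reply

    def build_prompt(self, design: str, filename: str) -> str:
        return WRITE_CODE_PROMPT.substitute(design=design, filename=filename)

    def run(self, design: str, filename: str) -> str:
        return self.llm(self.build_prompt(design, filename))


def echo_llm(prompt: str) -> str:
    # Stand-in for a real model call, used so the sketch runs offline.
    return f"# generated from a {len(prompt)}-character prompt"


skill = WriteCode(echo_llm)
prompt = skill.build_prompt("A Game class with an update() method", "game.py")
result = skill.run("A Game class with an update() method", "game.py")
```

Keeping the template separate from the role means that several roles can share one skill, and a skill’s prompt can be tuned without touching the role logic.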

Operation and Testing Results of MetaGPT

MetaGPT has attracted many testers. In one representative case, a tester used it to develop a Flappy Bird game in 10 minutes:

[Image: the Flappy Bird game developed with MetaGPT]

This required only the following command:

python startup.py "write p5.js code for Flappy Bird where you control a yellow bird continuously flying between a series of green pipes. The bird flaps every time you left click the mouse. If the bird falls to the ground or hits a pipe, you lose. This game goes on infinitely until you lose and you get points the further you go" --code_review True

From there, the program runs fully automatically and eventually produces the result shown above. The generated output cannot be run as is, however: assets such as images and other resources still have to be added. The author of this test used Midjourney to help create them.

Running MetaGPT is very simple: after installing it as the official project recommends, run the startup.py script directly. The script accepts the following parameters:

NAME
    startup.py - We are a software startup comprised of AI. By investing in us, you are empowering a future filled with limitless possibilities.

SYNOPSIS
    startup.py IDEA <flags>

DESCRIPTION
    We are a software startup comprised of AI. By investing in us, you are empowering a future filled with limitless possibilities.

POSITIONAL ARGUMENTS
    IDEA
        Type: str
        Your innovative idea, such as "Creating a snake game."

FLAGS
    --investment=INVESTMENT
        Type: float
        Default: 3.0
        As an investor, you have the opportunity to contribute a certain dollar amount to this AI company.
    --n_round=N_ROUND
        Type: int
        Default: 5

NOTES
    You can also use flags syntax for POSITIONAL ARGUMENTS

Note that every run consumes quota on your OpenAI API key. A run ends either when the investment is exhausted or when the maximum number of rounds is reached.
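
The two stopping conditions can be illustrated with a small sketch. The default values 3.0 and 5 come from the help text above; everything else (the `CostTracker` class, the `BudgetExceeded` exception, the `run_company` function, and the fixed cost per round) is a hypothetical simplification, not MetaGPT’s actual accounting.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spending past the investment."""


class CostTracker:
    def __init__(self, investment: float):
        self.investment = investment   # total budget in dollars
        self.spent = 0.0

    def charge(self, amount: float) -> None:
        if self.spent + amount > self.investment:
            raise BudgetExceeded(
                f"spent {self.spent:.2f} of {self.investment:.2f}"
            )
        self.spent += amount


def run_company(investment: float = 3.0, n_round: int = 5,
                cost_per_round: float = 1.0):
    """Run until the budget is exhausted or n_round rounds are done.

    Assumption: each round costs a fixed amount. In a real run the cost
    would depend on the tokens sent to and received from the API.
    """
    tracker = CostTracker(investment)
    completed = 0
    for _ in range(n_round):
        try:
            tracker.charge(cost_per_round)
        except BudgetExceeded:
            return completed, "budget exhausted"
        completed += 1
    return completed, "round limit reached"


expensive = run_company(investment=3.0, n_round=5, cost_per_round=1.0)
cheap = run_company(investment=3.0, n_round=5, cost_per_round=0.25)
```

With a cost of 1.0 per round the budget of 3.0 runs out before the fourth round, while at 0.25 per round the limit of five rounds is what ends the run.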

Conclusion of MetaGPT

In fact, many projects already use an LLM as the core of an agent that completes tasks. The early AutoGPT was quite impressive (see: How AutoGPT Enables GPT-4 to Automatically Help You Complete Tasks – The Most Popular AutoGPT Principle Analysis!: https://www.datalearner.com/blog/1051681400812596), and Lilian Weng’s later blog post made the future prospects of AI agents clearer (see: AI Agent Driven by Large Models: A Way to Turn the Capabilities of Language Models into General Capabilities – Explanation and Views from the Head of OpenAI’s Safety Team: https://www.datalearner.com/blog/1051689842100145).

The advantage of MetaGPT is that it builds fine-grained roles targeted at a specific domain, so its output quality should be higher than that of general-purpose AI agents. Its organizational form, which mirrors a human software development company, is also worth reflecting on.

The project address for MetaGPT can be found at the end of the original text.
