One important capability of LLMs (Large Language Models) is their ability to generate code based on user requirements. This greatly improves engineers’ efficiency.
LLMs also understand and decompose tasks well. Given a topic, they can summarize the key points and give a satisfactory analysis of each one.
However, LLMs still have many limitations, such as a restricted context size and the inability to take on multiple roles and approach a problem from different perspectives. As a result, a single LLM cannot yet give a systematic answer to a vaguely specified problem, which is why many AI agents have emerged.
AI agents are software driven by AI (mainly large language models) that can iteratively perform finer-grained task decomposition, task creation, and ultimately complete tasks based on an input goal. AI agents have features like short and long-term memory, various tools, etc., to assist in achieving their goals.
Deep Intelligence provides an intuitive definition of an agent: Agent = LLM + Memory + Planning + Tools + Neural + Intuition.
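To make the definition more concrete, here is a minimal sketch of the loop such an agent might run. It is not how MetaGPT or any specific agent is implemented: `fake_planner`, `run_agent`, and the `tools` dictionary are hypothetical names, and the planner is a hard-coded lookup table standing in for a real LLM call.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def fake_planner(task: str) -> List[str]:
    """Stand-in for an LLM call: split a task into subtasks (hard-coded, hypothetical)."""
    plans = {
        "build a 2048 game": ["design the board", "implement moves"],
        "implement moves": ["merge tiles", "spawn a new tile"],
    }
    return plans.get(task, [])  # an empty list means the task is atomic


def run_agent(goal: str,
              planner: Callable[[str], List[str]],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 20) -> List[str]:
    """Decompose the goal iteratively and execute atomic tasks with a tool."""
    queue: Deque[str] = deque([goal])  # short-term memory: pending tasks
    memory: List[str] = []             # long-term memory: finished results
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        task = queue.popleft()
        subtasks = planner(task)
        if subtasks:
            # Planning: push subtasks to the front of the queue, keeping their order
            queue.extendleft(reversed(subtasks))
        else:
            # Tools: an atomic task is handed to the "execute" tool
            memory.append(tools["execute"](task))
    return memory


if __name__ == "__main__":
    tools = {"execute": lambda task: f"done: {task}"}
    for line in run_agent("build a 2048 game", fake_planner, tools):
        print(line)
```

The `max_steps` cap is a design choice in this sketch: a planner that keeps producing subtasks would otherwise loop forever.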
Currently, well-known agents include Stanford’s Generative Agents town (often called the “Westworld town”), BabyAGI, and AutoGPT.
MetaGPT is an outstanding multi-agent framework that currently includes the agent roles a software company needs: product manager, architect, project manager, programmer, and so on. As the boss, you enter a few simple requirements, and MetaGPT follows an established SOP (standard operating procedure) to assign tasks to the different roles, breaking the objective down step by step and ultimately generating complete project code.
Today, let’s try out MetaGPT, the multi-agent framework that lets one person run a software company.
Preparation
- A computer that can reliably reach the OpenAI API, with Anaconda installed.
- An OpenAI API key, preferably with GPT-4 access, since GPT-4 currently has the strongest coding ability; GPT-3.5 can be used instead.
Creating Environment
conda create -n metagpt python=3.9
Installing Node.js
Visit the Node.js official website at nodejs.org/en
Select version 18.16.1 LTS, download, and install it.
Installing Other Dependencies
conda activate metagpt
npm install -g @mermaid-js/mermaid-cli
pip install numpy
pip install pybind11
pip install anthropic
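Before moving on, it can help to confirm that the pieces installed so far are actually visible from the active environment. The following is a small sketch, not part of MetaGPT; `check_environment` is a hypothetical helper that only uses the Python standard library, and the module and command names passed to it are the ones installed above.

```python
import importlib.util
import shutil
from typing import Dict, Iterable


def check_environment(modules: Iterable[str], commands: Iterable[str]) -> Dict[str, bool]:
    """Report which Python modules are importable and which commands are on PATH."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        status[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for name in commands:
        # shutil.which returns None when the executable is not on PATH
        status[f"command:{name}"] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    report = check_environment(["numpy", "pybind11", "anthropic"], ["node", "npm"])
    for item, ok in report.items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated metagpt environment; anything reported as MISSING needs to be installed again.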
Installing Visual Studio
Download the Community edition of Visual Studio from Microsoft:
visualstudio.microsoft.com/zh-hans/downloads/
Select the installation options below, which will require about 8GB of disk space.
data:image/s3,"s3://crabby-images/10e9b/10e9bca32d436f22068bfcfa561b609cb5484b1f" alt="AI Workflow: Using MetaGPT for Solo Software Development"
Downloading and Installing the Project
cd c:\AiWorkFlow
git clone https://github.com/geekan/MetaGPT.git
cd MetaGPT
python setup.py install
The installation process should go smoothly; if not, install the corresponding packages manually according to the error messages.
Configuring the Project
Create a key.yaml file in the MetaGPT/config directory and fill in your OpenAI API key. The file also accepts an OPENAI_API_BASE parameter, intended for users who cannot reach the OpenAI servers directly (for example, some users in mainland China) and go through a reverse proxy URL instead. If your network can reach OpenAI directly, use the official OpenAI URL.
OPENAI_API_KEY: "sk-AB................EF"
OPENAI_API_BASE: "https://api.openai.com/v1"
Open the MetaGPT/config/config.yaml file, which uses the GPT-4 model by default. If your API key only covers GPT-3.5, change the configuration to the 3.5 16k model:
OPENAI_API_MODEL: "gpt-3.5-turbo-16k-0613"
MAX_TOKENS: 1500
RPM: 10
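To illustrate how the two files fit together, here is a rough sketch that merges them into one set of settings. It rests on two assumptions that are not stated above: that values in key.yaml take precedence over config.yaml, and that RPM means requests per minute. `parse_flat_yaml` and `merge_settings` are hypothetical helpers, and the parser only handles flat `KEY: value` lines, so it is not a replacement for a real YAML library.

```python
from typing import Dict


def parse_flat_yaml(text: str) -> Dict[str, str]:
    """Parse flat `KEY: value` lines; a simplification, not a full YAML parser."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so URLs in the value stay intact
        key, value = line.split(":", 1)
        result[key.strip()] = value.strip().strip('"')
    return result


def merge_settings(config_text: str, key_text: str) -> Dict[str, str]:
    """Assumed precedence: values in key.yaml override those in config.yaml."""
    settings = parse_flat_yaml(config_text)
    settings.update(parse_flat_yaml(key_text))
    return settings


if __name__ == "__main__":
    config_yaml = 'OPENAI_API_MODEL: "gpt-3.5-turbo-16k-0613"\nMAX_TOKENS: 1500\nRPM: 10\n'
    key_yaml = 'OPENAI_API_KEY: "sk-placeholder"\nOPENAI_API_BASE: "https://api.openai.com/v1"\n'
    settings = merge_settings(config_yaml, key_yaml)
    print(settings["OPENAI_API_MODEL"], settings["OPENAI_API_BASE"])
    # If RPM is a requests-per-minute cap, this is the minimum spacing between calls
    print(60 / int(settings["RPM"]), "seconds between requests")
```

Under that reading, RPM: 10 allows at most one request every 6 seconds, which is worth keeping in mind when a run seems slow.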
Usage
As the boss, you can now provide your requirements. Mimicking the official example, let’s try developing a 2048 game first by entering the following command:
python startup.py "Using go write a cli 2048 game"
Watch the video for the entire process.
MetaGPT first sets up a virtual software company staffed by AI agents, including product managers, architects, and software engineers, with you as the boss.
First, as the boss, you state a goal. A boss’s description is usually vague and incomplete.
At this point, the product manager takes the lead, defining and refining the product. They will generate a document named prd.md that clearly describes the software product to be completed, including the original goal, decomposed objectives, User Stories, etc.
Next, the architect steps in, using the project’s goals, user stories, and other information from the PRD to design the architecture: the division into modules, the data flows, the data structures, and the databases.
The design document clearly describes the data structures used, the calling relationships between modules, and so on, which makes it easier to judge whether the generated code is reasonable.
Then, the project manager designs the corresponding interfaces based on the functions of each module in the architecture and generates the API documentation.
Finally, the programmer generates the code and stores it in the code directory.
MetaGPT saves the complete program code, the PRD, the architecture design documents, the interface design documents, and related resource files in separate directories.
The MetaGPT documentation also mentions a QA role responsible for unit testing, but no corresponding output files appeared among the files actually generated.
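The hand-offs described in this walkthrough can be sketched as a simple pipeline in which each role reads the artifacts produced so far and adds one of its own. This only illustrates the idea of an SOP; it is not MetaGPT's actual code, and every function and artifact name below is hypothetical. The role functions return placeholder strings where a real system would call an LLM.

```python
from typing import Callable, Dict, List, Tuple

# Each role reads the shared artifacts and returns (artifact_name, content).
Role = Callable[[Dict[str, str]], Tuple[str, str]]


def product_manager(artifacts: Dict[str, str]) -> Tuple[str, str]:
    return "prd", f"PRD for: {artifacts['requirement']}"


def architect(artifacts: Dict[str, str]) -> Tuple[str, str]:
    return "design", f"Design based on [{artifacts['prd']}]"


def project_manager(artifacts: Dict[str, str]) -> Tuple[str, str]:
    return "api_spec", f"Interfaces derived from [{artifacts['design']}]"


def engineer(artifacts: Dict[str, str]) -> Tuple[str, str]:
    return "code", f"Code implementing [{artifacts['api_spec']}]"


# The SOP is just the fixed order in which the roles act.
SOP: List[Role] = [product_manager, architect, project_manager, engineer]


def run_sop(requirement: str) -> Dict[str, str]:
    """Pass the boss's requirement through every role in order."""
    artifacts = {"requirement": requirement}
    for role in SOP:
        name, content = role(artifacts)
        artifacts[name] = content
    return artifacts


if __name__ == "__main__":
    for name, content in run_sop("a cli 2048 game").items():
        print(f"{name}: {content}")
```

Because each role depends only on earlier artifacts, a further role such as QA could be appended to the list without changing the others.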
Running the Game
Unlike ChatGPT’s Code Interpreter, MetaGPT requires a Python environment on the local machine to run the result. Depending on the code actually generated, the model may have used Python libraries that you need to install manually.
The generated game is built with the pygame library, which must be installed before the game will run.
pip install pygame
python main.py
This will directly bring up the 2048 game interface.
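To find out which libraries a generated project needs before running it, one option is to scan its source for imports that the current environment cannot resolve. The sketch below is a hypothetical helper, not something MetaGPT provides. It has two known limits: it reports the project's own local modules as missing when they are not on the import path, and an import name does not always match the pip package name.

```python
import ast
import importlib.util
from typing import List


def missing_imports(source: str) -> List[str]:
    """Return top-level modules imported by `source` that cannot be found locally."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as `from . import x`
            names.add(node.module.split(".")[0])
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


if __name__ == "__main__":
    # Example source text; in practice, read the generated main.py instead
    sample = "import os\nimport pygame\nfrom random import choice\n"
    print(missing_imports(sample))
```

If pygame is not installed, it shows up in the printed list; once `pip install pygame` has run, the list for this sample is empty.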
After a brief play, the game logic appears to be mostly correct. The entire generation process took only 2 minutes, costing about $0.02.
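A cost on this scale is easy to sanity-check, since API usage is billed per token. The sketch below shows the arithmetic only. The token counts are made up for illustration, not measured from this run, and the per-1K-token prices are assumptions that should be checked against the current OpenAI price list.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price_per_1k: float, completion_price_per_1k: float) -> float:
    """Cost in USD for one run, given token counts and per-1K-token prices."""
    return (prompt_tokens / 1000 * prompt_price_per_1k
            + completion_tokens / 1000 * completion_price_per_1k)


if __name__ == "__main__":
    # Illustrative token counts and assumed prices; not measured values.
    cost = estimate_cost(4000, 2000,
                         prompt_price_per_1k=0.003,
                         completion_price_per_1k=0.004)
    print(f"${cost:.3f}")
```

With these assumed numbers the total comes to roughly $0.02, the same order of magnitude as the figure above; a run on GPT-4 would cost noticeably more for the same token counts.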
MetaGPT is already usable in practice for small software requirements. Users have successfully built a range of programs with it, including Snake, Breakout, a web version of 2048, Flappy Bird, and a student management system.
github.com/geekan/MetaGPT