Implementing a Technical Document Assistant with MetaGPT

If you have any questions, feel free to add WeChat: jasper_8017. Looking forward to discussing and progressing together with like-minded friends!
Follow 👇 the official account and learn AI technology together!

In the article 【The Future of AI – AI Agent Series】【MetaGPT】2. Implementing Your First Agent, we have already implemented a simple agent that prints numbers sequentially.

0. Content of This Article

Today we will implement a practical agent – a technical document assistant. The user only needs to enter the title of the technical document, for example “Git Tutorial”, and the agent automatically writes the tutorial as a document with a table of contents, clearly divided sections, and code examples.

First, let’s look at the expected outcome (the user only needs to input “Git Tutorial” throughout the process):

  • Markdown format

  • Structured with a table of contents, including first-level and second-level headings

  • Includes code examples

1. Implementation Approach

Because of the LLM’s token limit, a whole tutorial cannot be generated in a single response. We therefore first ask the large language model (LLM) to generate the tutorial’s table of contents, then split it into blocks by second-level heading. For each block, the LLM generates detailed content from its headings, and finally the headings and content are concatenated into the full document. This works around the LLM’s limit on long text.

The overall process is as follows (the image below is from “Introduction to MetaGPT Agent Development”):

[Figure: flowchart of the overall process]

Analyzing the above flowchart, we need to implement:

  • Action to generate the document outline: WriteDirectory

  • Subtask Action: WriteContent

  • After obtaining the document outline, we need to split the outline (in this case, by directory) and dynamically add subtask Actions to write the technical document content based on the outline

  • Finally, concatenate the content generated by the subtask Actions to form the final Markdown document
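The steps above can be sketched without any LLM call. In this minimal sketch, `fake_llm_section` is a hypothetical stand-in for the model call and `build_tutorial` is an illustrative helper; neither is part of MetaGPT. The outline dictionary follows the format that the prompt later in this article asks the model to return.

```python
from typing import Dict, List


def fake_llm_section(topic: str, section: Dict[str, List[str]]) -> str:
    """Hypothetical stand-in for an LLM call: writes one outline entry."""
    title = next(iter(section))  # each entry is {"dir": ["sub dir", ...]}
    lines = [f"## {title}"]
    for sub in section[title]:
        lines.append(f"### {sub}")
        lines.append(f"(content about {sub} for {topic})")
    return "\n".join(lines)


def build_tutorial(topic: str, outline: Dict) -> str:
    """Split the outline into entries, generate each one, then join them."""
    parts = [f"# {outline['title']}"]
    for section in outline["directory"]:  # one subtask per outline entry
        parts.append(fake_llm_section(topic, section))
    return "\n\n\n".join(parts)


outline = {
    "title": "Git Tutorial",
    "directory": [
        {"Getting Started": ["Installing Git", "First Commit"]},
        {"Branching": ["Creating Branches", "Merging"]},
    ],
}
document = build_tutorial("Git", outline)
print(document)
```

Because each entry is generated separately, no single request has to produce the whole tutorial; only the final join sees the full text.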

2. Complete Code and Detailed Comments

Here is the complete code, with some detailed comments added to help you understand. It uses MetaGPT version 0.5.2. It is recommended that you practice it yourself; without practice, you will never know how many pitfalls you will encounter…

The code is not complex:

  • Implementation of WriteDirectory: Essentially, we put our requirements into a prepared prompt template, ask the large model for a result, and then parse the returned content into structured data.

  • Implementation of WriteContent: Directly call the large model to generate a response based on the provided subheading content.
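To make the parsing step concrete, here is a minimal sketch of pulling a dictionary out of a model reply that may contain extra text around it. `extract_dict` is a hypothetical helper written for illustration. It is an assumption about how such parsing can be done, not MetaGPT's `OutputParser.extract_struct`.

```python
import ast
import re
from typing import Dict


def extract_dict(text: str) -> Dict:
    """Find the outermost {...} span in a reply and parse it as a dict.

    Illustrative stand-in only, not MetaGPT's OutputParser.extract_struct.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)  # greedy: first "{" to last "}"
    if match is None:
        raise ValueError("no dictionary found in the reply")
    value = ast.literal_eval(match.group(0))  # parses literals only, runs no code
    if not isinstance(value, dict):
        raise ValueError("parsed value is not a dictionary")
    return value


reply = (
    'Sure! {"title": "Git Tutorial", '
    '"directory": [{"Basics": ["Install", "Config"]}]} Hope this helps.'
)
outline = extract_dict(reply)
print(outline["title"])
```

Failing early when the reply is not a dictionary matters here, because the later steps iterate over `outline["directory"]` to create the subtasks.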

# Load .env into environment variables
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

from datetime import datetime
from typing import Dict
import asyncio
import time

# WriteDirectory, WriteContent and their prompts are defined below,
# so they are not imported from MetaGPT here
from metagpt.actions import Action
from metagpt.const import TUTORIAL_PATH
from metagpt.logs import logger
from metagpt.roles import Role
from metagpt.schema import Message
from metagpt.utils.common import OutputParser
from metagpt.utils.file import File

## 1. Generate document outline
class WriteDirectory(Action):
    """Action class for writing tutorial directories.

    Args:
        name: The name of the action.
        language: The language to output, default is "Chinese".
    """

    def __init__(self, name: str = "", language: str = "Chinese", *args, **kwargs):
        super().__init__(name, *args, **kwargs)
        self.language = language

    async def run(self, topic: str, *args, **kwargs) -> Dict:
        COMMON_PROMPT = """
        You are now a seasoned technical professional in the field of the internet. 
        We need you to write a technical tutorial with the topic "{topic}".
        """

        DIRECTORY_PROMPT = COMMON_PROMPT + """
        Please provide the specific table of contents for this tutorial, strictly following the following requirements:
        1. The output must be strictly in the specified language, {language}.
        2. Answer strictly in the dictionary format like {{"title": "xxx", "directory": [{{"dir 1": ["sub dir 1", "sub dir 2"]}}, {{"dir 2": ["sub dir 3", "sub dir 4"]}}]}}.
        3. The directory should be as specific and sufficient as possible, with a primary and secondary directory.The secondary directory is in the array.
        4. Do not have extra spaces or line breaks.
        5. Each directory title has practical significance.
        """
        prompt = DIRECTORY_PROMPT.format(topic=topic, language=self.language)
        resp = await self._aask(prompt=prompt)
        return OutputParser.extract_struct(resp, dict) ## 1.1 Validate the result, must conform to Dict structure, otherwise throw an error

## 2. Subtask Action, this is to write technical document content based on the split directory titles
class WriteContent(Action):
    """Action class for writing tutorial content.

    Args:
        name: The name of the action.
        directory: The content to write.
        language: The language to output, default is "Chinese".
    """

    def __init__(self, name: str = "", directory: str = "", language: str = "Chinese", *args, **kwargs):
        super().__init__(name, *args, **kwargs)
        self.language = language
        self.directory = directory

    async def run(self, topic: str, *args, **kwargs) -> str:
        COMMON_PROMPT = """
        You are now a seasoned technical professional in the field of the internet. 
        We need you to write a technical tutorial with the topic "{topic}".
        """
        CONTENT_PROMPT = COMMON_PROMPT + """
        Now I will give you the module directory titles for the topic. 
        Please output the detailed principle content of this title in detail. 
        If there are code examples, please provide them according to standard code specifications. 
        Without a code example, it is not necessary.

        The module directory titles for the topic is as follows:
        {directory}

        Strictly limit output according to the following requirements:
        1. Follow the Markdown syntax format for layout.
        2. If there are code examples, they must follow standard syntax specifications, have document annotations, and be displayed in code blocks.
        3. The output must be strictly in the specified language, {language}.
        4. Do not have redundant output, including concluding remarks.
        5. Strict requirement not to output the topic "{topic}".
        """
        prompt = CONTENT_PROMPT.format(
            topic=topic, language=self.language, directory=self.directory)
        return await self._aask(prompt=prompt)

## 3. Technical document role, used to execute Action
class TutorialAssistant(Role):
    def __init__(self,
        name: str = "Stitch",
        profile: str = "Tutorial Assistant",
        goal: str = "Generate tutorial documents",
        constraints: str = "Strictly follow Markdown's syntax, with neat and standardized layout",
        language: str = "Chinese",
    ):
        super().__init__(name, profile, goal, constraints)
        self._init_actions([WriteDirectory(language=language)]) ## 3.1 Start with only the WriteDirectory Action. WriteContent Actions are added dynamically later, because their number and content are unknown until the directory has been generated.
        self.topic = ""
        self.main_title = "" ## 3.2 Record the article title
        self.total_content = "" ## 3.3 All generated content will be concatenated here
        self.language = language

    async def _think(self) -> None:
        """Determine the next action to be taken by the role."""
        if self._rc.todo is None:
            self._set_state(0) ## 3.4 Move to the first Action execution
            return

        if self._rc.state + 1 < len(self._states):
            self._set_state(self._rc.state + 1) ## 3.5 Execute the next Action
        else:
            self._rc.todo = None

    ## 3.6 Dynamically add WriteContent Actions based on the generated directory. The input titles must be a Dict, so the output of WriteDirectory must be parsable as a Dict; otherwise an error is raised and the program cannot continue.
    async def _handle_directory(self, titles: Dict) -> Message:
        self.main_title = titles.get("title")
        directory = f"{self.main_title}\n"
        self.total_content += f"# {self.main_title}"
        actions = list()
        for first_dir in titles.get("directory"):
            actions.append(WriteContent(
                language=self.language, directory=first_dir)) ## 3.7 Dynamically add WriteContent Action, passing in first-level directory content
            key = list(first_dir.keys())[0]
            directory += f"- {key}\n"
            for second_dir in first_dir[key]:
                directory += f"  - {second_dir}\n"
        self._init_actions(actions) ## 3.8 After executing this line, the action list is all WriteContent
        self._rc.todo = None
        return Message(content=directory)

    async def _act(self) -> Message:
        """Perform an action as determined by the role.

        Returns:
            A message containing the result of the action.
        """
        await asyncio.sleep(20) ## 3.9 Crude workaround for the OpenAI API rate limit, not a good solution; asyncio.sleep is used so the wait does not block the event loop
        todo = self._rc.todo
        if isinstance(todo, WriteDirectory):
            msg = self._rc.memory.get(k=1)[0] ## 3.10 Read the latest message from memory. No Action has run yet, so it is the user input
            self.topic = msg.content
            resp = await todo.run(topic=self.topic) ## 3.11 Generate directory based on user input
            logger.info(resp)
            return await self._handle_directory(resp)
        resp = await todo.run(topic=self.topic) ## 3.12 From here on, every Action is a WriteContent. self.topic is still the user input, since nothing else updates it; passing it in keeps the content written by WriteContent within that topic
        logger.info(resp)
        if self.total_content != "":
            self.total_content += "\n\n\n"
        self.total_content += resp ## 3.13 Concatenate data
        return Message(content=resp, role=self.profile)

    async def _react(self) -> Message:
        """Execute the assistant's think and actions.

        Returns:
            A message containing the final result of the assistant's actions.
        """
        while True:
            await self._think()
            if self._rc.todo is None:
                break
            msg = await self._act()

        ## 3.14 All Actions executed, write to file
        root_path = TUTORIAL_PATH / datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        await File.write(root_path, f"{self.main_title}.md", self.total_content.encode('utf-8'))
        return msg

async def main():
    msg = "Git Tutorial"
    role = TutorialAssistant()
    logger.info(msg)
    result = await role.run(msg)
    logger.info(result)

asyncio.run(main())
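Comment 3.9 notes that a fixed 20-second wait is not a good way to stay under the API rate limit. One common alternative is to retry only when a call is actually rejected, waiting longer after each failure. The sketch below shows the idea under stated assumptions: `RateLimited`, `call_with_backoff` and `flaky` are hypothetical names made up for this example, not part of MetaGPT or the OpenAI client.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error standing in for a provider's rate-limit error."""


async def call_with_backoff(
    call: Callable[[], Awaitable[T]],
    retries: int = 5,
    base_delay: float = 1.0,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return await call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            delay = base_delay * (2 ** attempt)
            # asyncio.sleep yields to the event loop; small jitter spreads retries
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
    raise ValueError("retries must be at least 1")


attempts = 0


async def flaky() -> str:
    """Fails twice with RateLimited, then succeeds."""
    global attempts
    attempts += 1
    if attempts < 3:
        raise RateLimited()
    return "ok"


result = asyncio.run(call_with_backoff(flaky, base_delay=0.01))
print(result, attempts)
```

With this approach a run that never hits the limit pays no waiting cost, while the fixed sleep adds 20 seconds to every Action.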

If this article helps you, please give it a like and follow before you go~
