Is Preparing Agents More Reliable Than Expecting GPT-5?

Stanford professor Andrew Ng suggested in a speech that Agentic Workflow will drive significant advancements in artificial intelligence this year, possibly even surpassing the next generation of foundational models. This is a trend worth paying attention to.

Key Points

What is Agentic Workflow?

Why is Andrew Ng re-promoting Agentic Workflow?

Does Agentic Workflow write code better than GPT-4?

What practices are involved in Agentic Workflow?

Why Pay Attention to Agentic Workflow?

Andrew Ng shared his insights on the development of AI agents in a speech and expressed great excitement about their potential. He emphasized that all AI practitioners should closely monitor the concept of Agentic Workflow.

Ng stated that Agentic Workflow has great potential in enhancing the performance and output quality of AI applications. Under traditional zero-shot conditions, large language models (LLMs) are prompted to generate final outputs in one go, similar to asking a person to write an entire article without any opportunity for revisions. In contrast, Agentic Workflow allows LLMs to iterate multiple times, including steps like planning outlines, deciding whether to conduct web searches, drafting initial versions, reviewing drafts, and revising. This iterative process is crucial for human writers and can yield higher quality results for AI compared to single-shot writing.
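The contrast between the two modes can be sketched in a few lines of Python. This is a minimal illustration rather than Ng's implementation: `LLM` stands for any function that maps a prompt to text, the prompt wording is invented, and `make_recording_llm` is a hypothetical stand-in that records prompts instead of calling a real model. The web-search decision is left out to keep the sketch short.

```python
from typing import Callable, List, Tuple

# Assumption: an "LLM" is any callable that maps a prompt string to a text reply.
LLM = Callable[[str], str]


def zero_shot(llm: LLM, topic: str) -> str:
    """Single pass: one prompt, one final answer, no chance to revise."""
    return llm(f"Write an essay about {topic}.")


def agentic_write(llm: LLM, topic: str, max_revisions: int = 2) -> str:
    """Iterative pass: outline, draft, then repeated review-and-revise rounds."""
    outline = llm(f"Write an outline for an essay about {topic}.")
    draft = llm(f"Write a first draft that follows this outline:\n{outline}")
    for _ in range(max_revisions):
        review = llm(f"Review this draft and list its weaknesses:\n{draft}")
        draft = llm(
            "Revise the draft to address the review.\n"
            f"Draft:\n{draft}\nReview:\n{review}"
        )
    return draft


def make_recording_llm() -> Tuple[LLM, List[str]]:
    """Hypothetical stand-in for a model: records prompts, returns numbered replies."""
    prompts: List[str] = []

    def llm(prompt: str) -> str:
        prompts.append(prompt)
        return f"[model output {len(prompts)}]"

    return llm, prompts


if __name__ == "__main__":
    llm, prompts = make_recording_llm()
    zero_shot(llm, "agentic workflows")
    print(len(prompts))  # 1 model call

    llm, prompts = make_recording_llm()
    agentic_write(llm, "agentic workflows", max_revisions=2)
    print(len(prompts))  # 6 model calls: outline, draft, 2 x (review + revise)
```

The iterative version spends more model calls per task, which is the trade-off behind the quality gains Ng describes.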

Based on an analysis of multiple studies, Ng found that a weaker model wrapped in an iterative workflow can outperform a stronger foundational model used zero-shot. For instance, on the HumanEval coding benchmark, GPT-3.5 scored 48.1% zero-shot and GPT-4 scored 67.0%, but GPT-3.5 wrapped in an iterative agent workflow reached up to 95.1%, a much greater improvement than the one from GPT-3.5 to GPT-4.

Ng further pointed out that users who are waiting to run models still under development, such as GPT-5 or Claude 4, zero-shot on their applications may already be able to get similar or even better results by adopting agentic workflows with current models. He encouraged practitioners and researchers in the AI field to explore and utilize Agentic Workflow to advance the progress and application of AI technology.

What is Agentic Workflow?

1. According to Ng’s speech and blog, Agentic Workflow is a method for interacting with LLMs to complete tasks.

① The traditional method of interacting with LLMs involves directly inputting a prompt and having the LLM generate results based on that prompt.

② Agentic Workflow breaks tasks down into multiple steps, iterating at different stages to guide the final output.

③ The interaction process in Agentic Workflow is akin to breaking a task into multiple subtasks, guiding the LLM to complete each subtask step by step, using the output as input for the next step, and repeating this cycle.

2. The Agentic Workflow process allows models to adopt more complex and dynamic strategies while executing tasks, similar to how humans think and act when solving problems.

3. Ng summarizes the design patterns of Agentic Workflow into four categories: Reflection, Tool Use, Planning, and Multi-Agent Collaboration.

① Reflection: Agents evaluate their work and propose improvements. For example, an agent can generate a piece of code, then self-reflect on its correctness, style, and efficiency, proposing constructive suggestions for improvement.

② Tool Use: Agents utilize external tools, such as web searches and code execution, to help gather information, take action, or process data.

③ Planning: Agents propose and execute a multi-step plan to achieve goals, such as outlining a paper, conducting online research, and then drafting.

④ Multi-Agent Collaboration: Multiple AI agents work together, distributing tasks and discussing and debating ideas to propose better solutions than a single agent could.
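As a rough illustration of how Planning and Tool Use can combine with the step-by-step chaining described above, the sketch below asks a model for a plan, then runs each step in order and feeds every step's output into the next. All names here (`plan_and_execute`, `stub_llm`, `fake_search`) and the one-step-per-line plan format are assumptions made for this example, not part of Ng's talk or of any library.

```python
from typing import Callable, Dict, List

# Assumptions: an LLM is a prompt -> text callable; a tool is a text -> text callable.
LLM = Callable[[str], str]
Tool = Callable[[str], str]


def plan_and_execute(llm: LLM, tools: Dict[str, Tool], goal: str) -> str:
    """Planning: ask the model for steps. Then run them, chaining outputs."""
    plan_text = llm(f"List the steps, one per line, to achieve: {goal}")
    steps: List[str] = [s.strip() for s in plan_text.splitlines() if s.strip()]

    result = goal
    for step in steps:
        tool = tools.get(step)
        if tool is not None:
            # Tool Use: a step named after a registered tool is run by that tool.
            result = tool(result)
        else:
            # Otherwise the model handles the step, seeing the previous output.
            result = llm(f"Step: {step}\nInput so far:\n{result}")
    return result


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model, with fixed replies for the demo."""
    if prompt.startswith("List the steps"):
        return "outline\nsearch\ndraft"
    first_line = prompt.splitlines()[0]
    return f"{first_line} done"


def fake_search(query: str) -> str:
    """Hypothetical web-search tool: appends a placeholder result."""
    return query + " | [search results]"


if __name__ == "__main__":
    final = plan_and_execute(stub_llm, {"search": fake_search}, "write a report")
    print(final)  # Step: draft done
```

In a real system the plan would come from an actual model and the tool registry would wrap real search or code-execution services. Reflection and Multi-Agent Collaboration are not shown here.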

Table: Relevant papers on the four design patterns of Agentic Workflow recommended by Ng.

Is the Reflection Pattern Suitable for Coding?

Ng discusses the potential of the Reflection pattern in enhancing the performance of large language models (LLMs) in his article “Agentic Design Patterns Part 2, Reflection.”

1. The Reflection pattern automates the process of critical feedback, enabling LLMs to self-evaluate and improve their outputs. This approach mimics how humans improve their work after receiving criticism.

2. The core of the pattern is that the critique step itself is automated: the model reviews its own output and then improves its response, with no human needed to supply the feedback.

3. The article uses code generation as an example: first prompt an LLM to directly generate the code for a task, then prompt it to reflect on that output by checking the code and giving constructive criticism.

① Next, prompt the LLM with context that includes the previously generated code and the constructive feedback.

② Ask the LLM to use the feedback to rewrite the code, which tends to produce a better response.

③ This self-reflection process enables LLMs to identify gaps and improve their outputs across various tasks, including code generation, text writing, and answering questions.

4. To help agents achieve better reflection outcomes, LLMs can be given tools to evaluate their outputs, or the work can be divided among agents in a multi-agent framework. …
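The generate, critique, rewrite cycle above, with a tool for evaluating outputs, can be sketched as follows. This is a minimal sketch under stated assumptions, not the code from Ng's article: `stub_llm` is a hypothetical stand-in for a model with canned replies, `run_tests` is an invented evaluation tool that checks a single hard-coded case, and the prompt wording is illustrative.

```python
from typing import Callable, Optional

# Assumption: an "LLM" is any callable that maps a prompt string to a text reply.
LLM = Callable[[str], str]


def run_tests(code: str) -> Optional[str]:
    """Evaluation tool: run candidate code, return an error message or None.

    Note: exec() on model output is unsafe; a real system should sandbox it.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def reflect_and_rewrite(llm: LLM, task: str, max_rounds: int = 3) -> str:
    """Generate code, then loop: test, ask for criticism, rewrite."""
    code = llm(f"Write Python code for this task: {task}")
    for _ in range(max_rounds):
        error = run_tests(code)
        if error is None:
            break  # the tool found no problem, so stop reflecting
        feedback = llm(
            f"Here is code intended for: {task}\n{code}\n"
            f"Test result: {error}\n"
            "Check the code for correctness, style, and efficiency, "
            "and give constructive criticism."
        )
        code = llm(
            "Rewrite the code using the feedback.\n"
            f"Code:\n{code}\nFeedback:\n{feedback}"
        )
    return code


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: buggy first try, fixed on rewrite."""
    if prompt.startswith("Write Python code"):
        return "def add(a, b):\n    return a - b"
    if prompt.startswith("Rewrite"):
        return "def add(a, b):\n    return a + b"
    return "The function subtracts; it should use + instead of -."


if __name__ == "__main__":
    final_code = reflect_and_rewrite(stub_llm, "add two numbers")
    print(run_tests(final_code) is None)  # True
```

Feeding the test result into the critique prompt gives the model a concrete signal to reflect on, instead of relying only on its own reading of the code.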

Does Tool Use Make LLMs More Practical? What Surprises Does Planning Bring? What Opportunities Does Multi-Agent Collaboration Offer?

For the complete interpretation, please visit the “Machine Heart PRO” Industry Newsletter, 2024 Week 15.

This issue’s complete newsletter includes 3 thematic interpretations + 27 key events in the AI & Robotics sector:

1. Is using evolutionary algorithms for model merging more promising than mainstream MoE technology?

Why is model merging gaining more attention? What is Model Merging? Are Model Merging, Model Fusion, and MoE the same thing? Is evolutionary algorithm + Model Merging more promising?…

2. Is preparing agents more reliable than expecting GPT-5?

What is Agentic Workflow? Why is Andrew Ng re-promoting Agentic Workflow? Does Agentic Workflow write code better than GPT-4? What practices are involved in Agentic Workflow?…

3. In-depth analysis of the 2024 MAD panorama report

What is MAD? What elements are included in the 2024 MAD panorama? What changes have occurred in the capital market for MAD in recent years? What key topics does the report focus on?…
