
Source: Machine Heart
This article is about 2,000 words; suggested reading time: 5 minutes.
Recently, a joint team from Tsinghua University, the University of Hong Kong, Renmin University, and Wallface Intelligence proposed a new self-evolution strategy for intelligent agents: Investigate - Consolidate - Exploit (ICE).
“Learning from history can help us understand the rise and fall of civilizations.” The history of human progress can be seen as a continuous process of self-evolution: absorbing past experience and pushing the boundaries of our capabilities. In this process, we learn from past failures to correct mistakes and draw on past successes to improve efficiency and effectiveness. Such self-evolution is ubiquitous in our lives: from summarizing experience at work to solve problems better, to using patterns to predict the weather more accurately, we are constantly learning and evolving from the past.
Successfully extracting knowledge from past experiences and applying it to future challenges is an important milestone in the path of human evolution. So, in the era of artificial intelligence, can AI agents do the same?
In recent years, language models such as GPT and LLaMA have demonstrated remarkable abilities in solving complex tasks. However, although they can use tools to solve specific tasks, they fundamentally lack insight into their own past successes and failures. They are like a robot built for a single job: it performs excellently on the current task, but cannot draw on past experience when faced with new challenges.
To address this issue, a joint team from Tsinghua University, the University of Hong Kong, Renmin University, and Wallface Intelligence recently proposed a new self-evolution strategy for intelligent agents: Investigate – Consolidate – Exploit (ICE). It aims to enhance the adaptability and flexibility of AI agents through cross-task self-evolution: it not only improves the efficiency and effectiveness of agents on new tasks, but also significantly reduces the capability demands on the agent's foundation model.
The emergence of this strategy undoubtedly opens a new chapter for the self-evolution of agents, also indicating that we are one step closer to achieving full autonomy for agents.

- Paper Title: Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution
- Paper Link: https://arxiv.org/abs/2401.13996


- In the investigation stage, the agent records the entire tree-structured task plan and dynamically monitors the execution status of each sub-goal;
- In the consolidation stage, the agent first removes all failed goal nodes, then, for each successfully completed goal, linearizes the leaf nodes of that goal's sub-tree into a planning chain (workflow);
- In the exploitation stage, these planning chains serve as references when refining new task goals, so that past successful experience can be reused.
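The consolidation step above can be sketched in a few lines of code. This is a minimal illustration under assumed data structures; the names (`PlanNode`, `consolidate_plan`) are hypothetical and do not come from the paper:

```python
# Sketch of plan consolidation: drop failed subtrees, then linearize the
# remaining leaf goals into a workflow. Node structure is an assumption.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PlanNode:
    goal: str
    succeeded: bool
    children: List["PlanNode"] = field(default_factory=list)

def consolidate_plan(node: PlanNode) -> List[str]:
    """Turn a recorded task tree into a linear planning chain (workflow)."""
    if not node.succeeded:          # remove failed goal nodes entirely
        return []
    if not node.children:           # successful leaf: keep as a workflow step
        return [node.goal]
    workflow: List[str] = []
    for child in node.children:     # depth-first traversal keeps execution order
        workflow.extend(consolidate_plan(child))
    return workflow

# Example: a root goal with one failed and two successful sub-goals
plan = PlanNode("write report", True, [
    PlanNode("search web", True),
    PlanNode("use broken tool", False),
    PlanNode("draft summary", True),
])
print(consolidate_plan(plan))  # ['search web', 'draft summary']
```

The key point is that only successful leaf goals survive consolidation, so the stored workflow encodes a proven execution path rather than the full (partly failed) exploration tree.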

- In the investigation stage, the agent dynamically records the tool-invocation chain for each task and performs simple detection and categorization of issues that arise during tool invocation;
- In the consolidation stage, the tool-invocation chain is transformed into an automaton-like pipeline that fixes the order of tool calls and the transitions between them, while removing duplicate calls and adding branching logic to make execution more robust;
- In the exploitation stage, for similar goals, the agent directly executes the pipeline automatically, thereby improving task-completion efficiency.
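The execution-side consolidation can be sketched similarly. This is an assumed simplification, not the paper's implementation: `build_pipeline` deduplicates a recorded call chain into a fixed order, and `run_pipeline` replays it automaton-style with a retry branch on failure:

```python
# Sketch: consolidate a recorded tool-call chain into a fixed pipeline,
# then replay it with simple branching (retry on failure). All names
# are illustrative assumptions.
from typing import Callable, Dict, List

def build_pipeline(call_chain: List[str]) -> List[str]:
    """Fix the invocation order and remove duplicate calls,
    keeping each tool's first occurrence."""
    seen = set()
    pipeline: List[str] = []
    for tool in call_chain:
        if tool not in seen:
            seen.add(tool)
            pipeline.append(tool)
    return pipeline

def run_pipeline(pipeline: List[str],
                 tools: Dict[str, Callable[[], bool]],
                 max_retries: int = 1) -> bool:
    """Execute the pipeline like a small automaton: each tool is a state,
    a failed call branches into a retry, and exhausted retries abort."""
    for tool in pipeline:
        attempts = 0
        while not tools[tool]():    # branching logic: retry on failure
            attempts += 1
            if attempts > max_retries:
                return False
        # success: transition to the next tool in the fixed order
    return True

# Example: a recorded chain containing a duplicate "search" call
chain = ["search", "search", "fetch_page", "summarize"]
pipeline = build_pipeline(chain)
print(pipeline)  # ['search', 'fetch_page', 'summarize']
```

During exploitation, replaying such a pipeline replaces repeated model calls for planning each tool invocation, which is where the efficiency gain comes from.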
- The ICE strategy significantly reduces the number of model invocations, improving efficiency and lowering cost.
- Stored experiences show a high reuse rate under ICE, demonstrating its effectiveness.
- ICE improves the completion rate of sub-tasks while reducing the number of planning revisions.
- With past experience available, the model capability required for task execution drops markedly: GPT-3.5, combined with prior planning and execution experience, achieves results comparable to GPT-4.


Editor: Wenjing
