In today’s rapidly advancing technological era, the term “intelligent agent” has gradually entered the public eye, becoming a hot topic of discussion. So, what exactly is an intelligent agent?
An intelligent agent, also called an AI agent, is a system that mimics intelligent human behavior. It is like a “smart brain” with rich experience and knowledge: it perceives its environment and, based on those perceptions, autonomously plans, makes decisions, and takes actions to achieve specific goals. In simple terms, an intelligent agent makes decisions based on external inputs and continuously optimizes its behavior through interaction with its environment.
1. Intelligent Agent = Large Model + Planning + Memory + Tools
In terms of core composition, an intelligent agent is built on a large model and continuously enhances its capabilities through active learning or knowledge acquisition. In short, intelligent agent = large model + planning + memory + tools. With the rapid iteration of large language models (LLMs), the intelligent agent market is also expanding quickly. In 2023, the global intelligent agent market was valued at $3.86 billion, and it is expected to grow at a compound annual growth rate of 45.1% from 2024 to 2030.
When working, an intelligent agent needs to possess three key elements: perception, decision-making, and execution. Through perception, it can acquire data information from the external environment; through decision-making, it can formulate corresponding action strategies; and through execution, it can take concrete actions to complete tasks. More advanced intelligent agents also possess learning capabilities, allowing them to adjust and optimize their behavior based on continuous feedback.
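The perception, decision-making, and execution cycle described above can be sketched as a small loop. This is only a minimal illustration, not the API of any agent framework: the `ThermostatAgent` class, its method names, and the room-temperature scenario are all hypothetical, chosen to make the three elements concrete.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Hypothetical toy agent that keeps a room near a target temperature."""
    target: float = 21.0
    history: list = field(default_factory=list)  # memory of past readings

    def perceive(self, environment: dict) -> float:
        # Perception: acquire data from the external environment.
        reading = environment["temperature"]
        self.history.append(reading)
        return reading

    def decide(self, reading: float) -> str:
        # Decision-making: pick an action strategy from the perception.
        if reading < self.target - 0.5:
            return "heat"
        if reading > self.target + 0.5:
            return "cool"
        return "idle"

    def act(self, action: str, environment: dict) -> None:
        # Execution: apply the chosen action to the environment.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta


def run(agent: ThermostatAgent, environment: dict, steps: int) -> list:
    """Run the perceive -> decide -> act cycle and return the actions taken."""
    actions = []
    for _ in range(steps):
        reading = agent.perceive(environment)
        action = agent.decide(reading)
        agent.act(action, environment)
        actions.append(action)
    return actions


room = {"temperature": 18.0}
actions = run(ThermostatAgent(), room, steps=5)
print(actions)
```

A learning-capable agent would additionally use the stored `history` to adjust its thresholds over time; that part is left out of this sketch.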
For example, a real-time chat assistant can be seen as an intelligent agent. When we converse with it, it perceives the question we enter (perception), works through an internal analysis and decision-making process to decide how to respond (decision-making), and finally presents the answer to us (execution). Moreover, as it interacts with large numbers of users, its responses to various questions continue to improve (learning capability).
In summary, an intelligent agent is like a mysterious yet powerful “digital partner,” quietly transforming our lives and work with its unique charm and limitless potential. Next, let us delve into its fascinating world.
2. Differences Between Language Models and Intelligent Agents
In the field of artificial intelligence, language models and intelligent agents are related, but the differences between them are significant. Figuratively speaking, a language model is like a friend who is particularly good at “continuing conversations”: say half a sentence and it completes it, ask a question and it answers in earnest. An intelligent agent, by contrast, is like a good companion to the model. It has a certain degree of “self-awareness” and exists to help the model answer questions more accurately and carry out tasks.
Next, let us explore the differences between the two in depth.
(1) Limitations of LLMs
Large language models (LLMs) are a key technology in natural language processing today. They use word embeddings and the transformer architecture to perform advanced natural language processing tasks and demonstrate an excellent understanding of human language. However, they have limitations. An LLM’s knowledge is fixed once training completes, so it cannot reliably answer questions that fall outside that knowledge.
For example, if its knowledge is limited to data up to 2023, it will struggle to respond accurately about events that occurred in 2024. Even if the model is given an external knowledge base, for example by using Google search to extend its knowledge to the entire internet, a tricky problem arises: not all queries require retrieval. The large language model already knows the answers to some questions, and telling those questions apart from the ones that need retrieval is a challenge.
(2) The Emergence of Intelligent Agents
The emergence of intelligent agents effectively addresses these limitations of LLMs. Specifically, intelligent agents can perform the following key operations:
First is question classification. Intelligent agents can classify questions into predetermined categories to determine whether a query requires retrieval, thus deciding whether to use tools subsequently. Before creating an intelligent agent, it is usually necessary to plan a question category table in advance and train a dedicated “classification model.” For example, when a user inputs “Translate the following sentence into English: …”, the classification model can categorize this question as “translation”; if the input is “I am going to xx tomorrow, please give me clothing advice,” it can classify it as “advice.” This classification lays the foundation for accurately processing questions later.
Second is tool usage. Once an intelligent agent determines, based on the classification model’s results, that its own knowledge cannot answer a user’s question, it attempts to use tools to access external knowledge bases. In practice, frameworks such as LangChain are often used to implement this process, and recent versions of Ollama can even use tools directly. When an intelligent agent obtains external knowledge, it also applies RAG technology: it slices the knowledge into chunks, vectorizes them, and then uses a retrieval algorithm to stitch together the most relevant chunks as context to support an accurate answer.
Furthermore, model selection. Generally, large language models have broad knowledge but lack specialization. They can handle most common questions with ease but often struggle to provide satisfactory answers to more specialized and in-depth questions. At this point, intelligent agents can prioritize selecting specialized small language models for the first round of answers. Training a highly specialized small model in a certain field not only leads to higher quality answers but also reduces resource consumption, achieving effective resource utilization.
Finally, answer optimization. While small models may provide accurate answers to specialized questions, they may not be as fluent as large language models in terms of text organization. Intelligent agents can input the first round of answers from small models into large language models for summarization and rewriting. When necessary, they can use tools to add context, generating answers that are more coherent (suitable for scenarios requiring copywriting output) or more concise (suitable for scenarios requiring cost-saving).
In summary, intelligent agents, with their unique functions, compensate for the shortcomings of LLMs, making artificial intelligence more capable and efficient at handling a wide range of problems.
3. Specific Operations of Intelligent Agents
(1) Question Classification
When an intelligent agent receives a user’s question, it must first classify it. This is similar to organizing files: different types of files need to go into the corresponding folders. Before creating an intelligent agent, the development team carefully plans a question category table that covers the possible question types. The team also trains a dedicated “classification model,” which acts as a “question recognition expert” that understands natural language and excels at text classification.
For example, when a user asks, “How to create an efficient workplace plan?” the classification model can categorize it as “Workplace – Planning”; if the user asks, “Help me write a travel post for Xiaohongshu,” it will be classified under “Copywriting – Platform-specific Copywriting.” Through this classification method, the intelligent agent can clearly determine whether the question requires querying an external knowledge base, guiding subsequent processing accurately.
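The classification step can be sketched as follows. This is a hedged illustration only: simple keyword matching stands in for the trained classification model, and the `CATEGORY_TABLE` contents, the `classify` function, and the retrieval flags are hypothetical, not taken from any real system.

```python
import re

# Hypothetical category table: label -> (trigger keywords, needs retrieval?)
CATEGORY_TABLE = {
    "translation": (("translate",), False),
    "copywriting": (("write", "post", "copy"), False),
    "advice": (("advice", "recommend"), True),
    "news": (("latest", "today", "news"), True),
}


def classify(question: str) -> tuple:
    """Return (category, needs_retrieval) for a question.

    Keyword matching is a stand-in for the trained classification model.
    """
    words = set(re.findall(r"[a-z]+", question.lower()))
    for category, (keywords, needs_retrieval) in CATEGORY_TABLE.items():
        if words.intersection(keywords):
            return category, needs_retrieval
    # Unknown question types fall back to the model's own knowledge.
    return "general", False


print(classify("Translate the following sentence into English: hello"))
print(classify("Help me write a travel post for Xiaohongshu"))
print(classify("I am going hiking tomorrow, please give me clothing advice"))
```

The returned flag is what lets the agent decide whether to call a tool afterwards; a real classifier would be a trained text-classification model rather than a keyword list.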
(2) Tool Usage
Once the intelligent agent determines, based on the classification model’s results, that its own knowledge cannot answer the user’s question, it acts like a smart explorer and attempts to use tools to access external knowledge bases. In practice, frameworks such as LangChain are commonly used to implement this process, and recent versions of Ollama can use tools directly.
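A minimal sketch of this tool-dispatch decision is shown below. It deliberately does not use the LangChain or Ollama APIs; the `TOOLS` registry, the stub tool functions, and the `answer` function are hypothetical names, and the tools return canned strings instead of calling real services.

```python
from typing import Callable, Dict


def search_news(query: str) -> str:
    # Stub standing in for a real web search tool.
    return f"[stub search results for: {query}]"


def get_weather(city: str) -> str:
    # Stub standing in for a real weather service.
    return f"[stub forecast for: {city}]"


# Hypothetical tool registry: tool name -> callable taking one string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_news": search_news,
    "get_weather": get_weather,
}


def answer(question: str, needs_retrieval: bool, tool_name: str = "search_news") -> str:
    """Answer directly, or fetch external context with a tool first."""
    if not needs_retrieval:
        return f"(answered from the model's own knowledge) {question}"
    tool = TOOLS.get(tool_name)
    if tool is None:
        raise KeyError(f"unknown tool: {tool_name}")
    context = tool(question)
    return f"(answered using external context) {context}"


print(answer("What is a transformer?", needs_retrieval=False))
print(answer("Shanghai", needs_retrieval=True, tool_name="get_weather"))
```

The `needs_retrieval` flag is assumed to come from the classification step, so the agent only pays the cost of a tool call when the question actually needs external knowledge.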
When the intelligent agent obtains external knowledge bases, it also needs to process this knowledge to better serve the answering of questions. At this point, RAG (Retrieval-Augmented Generation) technology comes into play. RAG technology acts like a meticulous craftsman, slicing knowledge and vectorizing these slices, transforming knowledge into a format that is easier for computers to understand and process. Finally, through clever algorithms, the most relevant knowledge is stitched together as context, providing rich and powerful support for the intelligent agent to answer questions accurately. For instance, when a user inquires about the latest developments in a certain emerging technology, and the intelligent agent’s own knowledge base lacks relevant information, it can use tools to access content from external news sites, professional forums, etc., and after processing with RAG technology, present accurate and comprehensive answers to the user.
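The slice, vectorize, and stitch steps can be illustrated with a toy retriever. This is a sketch under simplifying assumptions: sentences serve as slices, a bag-of-words count stands in for a learned embedding model, and cosine similarity is the ranking algorithm. The function names and the sample document are hypothetical.

```python
import math
import re
from collections import Counter


def slice_text(text: str) -> list:
    """Slice a document into sentence-sized chunks."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def vectorize(text: str) -> Counter:
    """Bag-of-words vector; a real system would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_context(query: str, chunks: list, k: int = 2) -> str:
    """Stitch the k chunks most similar to the query into one context string."""
    q = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return "\n".join(ranked[:k])


document = (
    "An agent uses tools to reach external knowledge. "
    "Retrieval slices documents into chunks. "
    "Each chunk is turned into a vector. "
    "The most relevant chunks become context for the model."
)
chunks = slice_text(document)
query = "how are chunks turned into a vector"
print(build_context(query, chunks, k=2))
```

The resulting context string would be placed in the prompt ahead of the user’s question, so the model answers from the retrieved text rather than from its fixed training knowledge.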
(3) Model Selection
Large language models generally possess broad knowledge, like a knowledgeable generalist who can respond fluently to most common questions. Faced with more specialized and in-depth issues, however, they are like a general practitioner asked to solve a complex specialist case, and they often fail to provide satisfactory answers. This is where the intelligent agent’s “model selection” function becomes particularly important: it can prioritize a specialized small language model for the first round of answers.
Training a highly specialized small model in a specific field is akin to cultivating an expert in that domain, which makes it easier to obtain quality answers to domain-specific questions. Small models also consume fewer resources at run time, so resources are used more effectively and the agent avoids the “overkill” and waste of applying a large language model to a narrow specialized question. For example, in the medical field, when a user asks about the latest treatment options for a rare disease, a small language model trained specifically on medical data may provide more accurate and professional answers than a general large language model. Large and small models are not completely independent, either; they complement each other’s strengths. The large model excels in breadth of knowledge and in fluent language understanding and generation, while the small model excels in professional depth, and combining them lets the intelligent agent respond more adeptly to a wide range of questions.
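Model selection can be pictured as a simple router. In this hedged sketch the models are stub functions that only tag their output, and the `SPECIALISTS` table, the domain labels, and `select_model` are hypothetical; a real agent would route to actual model endpoints.

```python
from typing import Callable, Dict

Model = Callable[[str], str]


def general_model(question: str) -> str:
    # Stub for a broad, general-purpose large language model.
    return f"[general model] {question}"


def medical_model(question: str) -> str:
    # Stub for a small model specialized in the medical domain.
    return f"[medical specialist] {question}"


def legal_model(question: str) -> str:
    # Stub for a small model specialized in the legal domain.
    return f"[legal specialist] {question}"


# Hypothetical routing table: question category -> specialist model.
SPECIALISTS: Dict[str, Model] = {
    "medical": medical_model,
    "legal": legal_model,
}


def select_model(category: str) -> Model:
    """Prefer a specialist for the category; fall back to the general model."""
    return SPECIALISTS.get(category, general_model)


first_round = select_model("medical")("What are the treatment options?")
print(first_round)
print(select_model("cooking")("How do I poach an egg?"))
```

Falling back to the general model keeps the agent usable for categories that have no specialist, which reflects how the large and small models complement each other.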
(4) Answer Optimization
While small models may provide accurate answers to specialized questions, in terms of text organization, they may be like a highly capable but ineloquent expert, lacking the fluency and naturalness of large language models. At this point, the intelligent agent’s “answer optimization” step plays a crucial role.
The intelligent agent can input the first round of answers provided by the small model into the large language model for summarization and rewriting. In scenarios requiring copywriting output, the large language model can polish the answers to make them more fluent and engaging, much like refining a content-rich but linguistically bland article, making it more appealing to readers’ habits and aesthetic needs. For instance, when drafting product promotional copy, an answer optimized by the large language model can vividly showcase product features and attract consumer attention. Conversely, in scenarios requiring cost-saving, the large language model can condense the answers, eliminating unnecessary expressions while ensuring that key information remains intact, thus reducing resource consumption. For example, in situations with strict word limits, such as text message replies or brief comments on social media, concise answers can accurately convey information without exceeding limits.
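The two-stage flow above, a specialist draft followed by a large-model rewrite, can be sketched as follows. The instruction wording, the `polish` and `concise` mode names, and both stub models are assumptions made for illustration; the stub rewriter simply echoes the draft instead of calling a real large language model.

```python
from typing import Callable

# Hypothetical rewrite instructions for the two scenarios in the text.
REWRITE_INSTRUCTIONS = {
    "polish": "Rewrite the draft so it reads fluently and engagingly. Keep every fact.",
    "concise": "Condense the draft to at most {limit} words. Keep every key fact.",
}


def build_rewrite_prompt(draft: str, mode: str, limit: int = 50) -> str:
    """Wrap the specialist's draft in a rewrite instruction for the large model."""
    if mode not in REWRITE_INSTRUCTIONS:
        raise ValueError(f"unknown mode: {mode}")
    instruction = REWRITE_INSTRUCTIONS[mode].format(limit=limit)
    return f"{instruction}\n\nDraft:\n{draft}"


def optimize_answer(
    question: str,
    specialist: Callable[[str], str],
    rewriter: Callable[[str], str],
    mode: str = "polish",
    limit: int = 50,
) -> str:
    """First round from the small model, second round rewritten by the large model."""
    draft = specialist(question)
    return rewriter(build_rewrite_prompt(draft, mode, limit))


def specialist_stub(question: str) -> str:
    # Stand-in for a specialized small model.
    return f"[specialist draft] {question}"


def rewriter_stub(prompt: str) -> str:
    # Stand-in for a large model; it returns the draft line unchanged.
    return prompt.splitlines()[-1]


final = optimize_answer("What is RAG?", specialist_stub, rewriter_stub, mode="concise", limit=30)
print(final)
```

Keeping the mode as a parameter lets the same pipeline serve both the copywriting scenario and the cost-saving scenario without changing the specialist step.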
4. Application Examples of Intelligent Agents
The “All Beasts Can Become Black Myth” intelligent agent, built with the Wenxin intelligent agent platform, clearly shows how well intelligent agents can perform in practical applications.
After its global unlock on August 20, 2024, the game “Black Myth: Wukong” achieved remarkable success, with total sales exceeding 10 million copies across all platforms and a peak concurrent user count of 3 million by 9 PM Beijing time on August 23, 2024. Amidst this excitement, someone utilized the Wenxin intelligent agent to create the “All Beasts Can Become Black Myth” intelligent agent, bringing unique experiences to users.
In terms of effectiveness, this intelligent agent is simple to use; users need only input an animal name to generate a personified character in the style of Black Myth with one click. For example, inputting “tiger” yields a personified image of a tiger, featuring a beast’s head and human body, an angry expression, a fierce appearance, a tall stature, draped in a red cloak and battle robe, wearing golden armor, wielding a stick-like weapon, and set against a dark lava background, presenting a visual effect akin to a movie poster, realistic in style and highly detailed, filled with visual shock.
The implementation requires a clear line of thought. To generate the desired images automatically, the agent relies on drawing prompts. First, a prompt template is preset, with a slot left for the animal name supplied by the user. After the user enters the name, it is filled into the template to produce a complete prompt, which is then used to call the drawing plugin and output the image. The overall process is: the user enters a natural language description, the agent extracts the animal name, completes the prompt, calls the drawing plugin to generate the image, and finally returns the result to the user.
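The overall process can be sketched in a few lines. This is an illustration under stated assumptions, not the Wenxin platform’s API: the template is a shortened version of the drawing prompt, the `KNOWN_ANIMALS` lookup stands in for the model-based keyword extraction, and `draw` is a stub for the drawing plugin that returns a placeholder string.

```python
import re
from typing import Optional

# Shortened prompt template; "{{input}}" marks the slot for the animal name.
PROMPT_TEMPLATE = (
    "full body shot, {{input}} personified, beast's head and human body, "
    "dark lava background, movie poster style"
)

# Hypothetical vocabulary; a real agent would extract the name with a model.
KNOWN_ANIMALS = ("tiger", "wolf", "rabbit")


def extract_animal(user_text: str) -> Optional[str]:
    """Pull the first known animal name out of free-form user input."""
    for word in re.findall(r"[a-z]+", user_text.lower()):
        if word in KNOWN_ANIMALS:
            return word
    return None


def draw(prompt: str) -> str:
    # Stub for the drawing plugin; returns a placeholder instead of an image.
    return f"<image generated from: {prompt}>"


def run_workflow(user_text: str) -> str:
    """User input -> extract animal -> fill template -> call drawing plugin."""
    animal = extract_animal(user_text)
    if animal is None:
        return "Please enter an animal name."
    prompt = PROMPT_TEMPLATE.replace("{{input}}", animal)
    return draw(prompt)


print(run_workflow("Draw me a Tiger please"))
```

Because every request passes through the same template, only the animal name varies between prompts, which is what keeps the generated images in a consistent style.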
The specific practical tutorial is as follows:
Building the Workflow: Operate on the Baidu Wenxin intelligent agent platform. First, create a workflow and enter the workflow orchestration page. At the [Start] node, add input parameters {‘parameter name’:‘animal’, ‘parameter type’:‘String’, ‘parameter description’:‘animal’}, used to extract the “animal” keyword from user input text. At the [Large Model] node, write an appropriate drawing prompt template, such as “full body shot, {{input}} personified, {{input}} personified, beast’s head and human body, angry expression, fierce appearance, bright eyes, tall stature, red cloak, red battle robe, two meters tall, wielding a stick-like weapon, golden armor, epic, astonishing epic ancient Chinese theme, styled by Chinese artists, dynamic, martial arts style, dark lava background, movie poster style, stunning, realistic style, highly detailed, photo-realistic, vivid, striking, 3D render, 8K, Octane render, Unreal Engine 5, CryEngine, realistic lighting and shadows, strong contrast in light and shadow, cinematic lighting, high quality, high detail, ultra-high definition”, ensuring the output “animal” name from node 1 can be embedded, maintaining consistent image style. Next, at the [ImageCreateV2] node, insert the “AI Drawing Assistant” plugin and complete the configuration as required. Finally, at the [End] node, output the image and text content in the specified format, completing the workflow construction.
Building the Intelligent Agent: After completing the workflow, configure the intelligent agent to guide users to enter an animal name and to call the workflow, so that it outputs images consistently. Then further refine the intelligent agent’s packaging to enhance the user experience.
The “All Beasts Can Become Black Myth” example fully demonstrates the ability of intelligent agents to provide unique, high-quality services on demand, using clever process design and applied technology to meet diverse creative and usage needs.
5. Future Prospects of Intelligent Agents
With the rapid development of technology, the future of intelligent agents is filled with infinite possibilities, and their development prospects are extremely broad, expected to shine in many new fields.
In the healthcare sector, intelligent agents may become valuable “assistants” to doctors. They can quickly analyze various examination data of patients, such as medical records and imaging data, assisting doctors in making more precise diagnoses, and can develop personalized treatment plans based on individual patient differences. For instance, for cancer patients, intelligent agents can integrate the latest medical research findings and a wealth of past treatment cases to provide doctors with optimal treatment suggestions, even predicting potential adverse reactions during treatment to help doctors prepare in advance. Moreover, during the patient’s recovery phase, intelligent agents can serve as health management consultants, crafting personalized rehabilitation plans that include guidance on diet, exercise, and more, while tracking the patient’s recovery progress in real-time and making dynamic adjustments based on actual conditions.
The education sector will also undergo profound changes due to the deep integration of intelligent agents. Intelligent agents are expected to become personalized learning partners for every student, tailoring individualized learning paths based on students’ learning progress, knowledge mastery, and learning styles. For example, when a student encounters difficulties in learning mathematics, intelligent agents can analyze the types of mistakes and problem-solving approaches to identify weak points in their knowledge, then provide targeted explanations of relevant knowledge points, practice questions, and additional learning resources. In language learning, intelligent agents can simulate real language environments, engaging in conversation practice with students to correct pronunciation and enhance oral expression skills. Moreover, intelligent agents can monitor students’ learning states in real-time, providing encouragement and guidance when they detect fatigue or emotional fluctuations, stimulating students’ motivation to learn.
Intelligent agents also hold enormous potential in the smart home sector. In the future, intelligent agents will achieve smarter and more humanized control and management of household devices. Imagine waking up in the morning, and the intelligent agent has already prepared suitable clothing for you based on the day’s weather and your schedule, automatically adjusting the indoor temperature, humidity, and lighting to create a comfortable waking environment. After you leave home, the intelligent agent can monitor the home’s security status in real-time, sending alerts if it detects anomalies, such as open doors or windows or the presence of strangers, and coordinating relevant devices to take action. When you communicate with the intelligent agent via your phone on your way home from work, it can pre-cool the air conditioning and reheat dinner, ensuring you enjoy a comfortable homecoming.
In urban planning and management, intelligent agents will also play a crucial role. They can collect and analyze data on urban traffic, energy consumption, population movement, and more, providing scientific decision-making support for urban planners. For example, by analyzing traffic flow data in real-time, intelligent agents can optimize traffic signal settings to alleviate congestion; based on energy consumption data, they can propose energy-saving and emission-reduction suggestions to promote sustainable urban development. Additionally, in response to natural disasters and emergencies, intelligent agents can quickly integrate various resource information, formulate emergency rescue plans, and deploy rescue forces, enhancing a city’s emergency response capabilities and disaster management efficiency.
In conclusion, the future development prospects of intelligent agents are promising. They will act as catalysts, accelerating innovation and transformation across various fields, bringing more convenience, efficiency, and beauty to our lives. We have reason to believe that in the near future, intelligent agents will fully integrate into our lives, becoming an important force driving social progress.