AI Agents: Building a New Smart Life Landscape

“Can you design a one-day tour plan for Beijing?” Recently, at the 2024 World Intelligent Connected Vehicles Conference, Mr. Li, who experienced the BAIC AI Agent on the ARCFOX Alpha S5, felt he had a “travel consultant” at his service, saying, “With just a voice command, the AI agent can automatically plan the route, which is very convenient.”

In recent years, the emergence of AI (Artificial Intelligence) large model technology has sparked a new wave of AI research, and now AI agents are becoming a new industry hotspot. From voice assistants in smart cars to digital human hosts in online live broadcasts, AI agents are deeply transforming the application ecosystem with their unique autonomy and interactivity, continuously building a new smart life landscape.

Transforming Human-Machine Interaction

As the name suggests, an AI agent is an intelligent entity that possesses AI capabilities. It can be a hardware device or a software system. It can perceive the environment, make decisions, and execute actions based on its AI capabilities to ultimately achieve specific goals.

“In simple terms, an AI agent is like a ‘little assistant’ that is intelligent, emotionally aware, understanding, and helpful,” said Chen Hao, Deputy Director of the Advanced Technology Center at the Beijing General Artificial Intelligence Research Institute. This “little assistant” not only understands human language but also continuously improves its skills in specific areas through learning and data analysis.

Why has the AI agent become a focal point in the industry? What is its relationship with large model technology?

A representative in charge of ByteDance’s Doubao large model said in an interview that AI agents are built on large model technology: the large model serves as the “brain,” while the agent gives AI “hands and feet,” enabling it to work and execute tasks independently.

AI agents, however, are a more “three-dimensional” intelligent system. Beyond the language communication services that large models already widely provide, AI agents can perform intelligent reasoning and emotion analysis based on context, and mimic human behavior to carry out corresponding actions.

For example, when given the command “help me cook a dish,” a “large model chef” can only output a recipe and list the required ingredients; however, an “AI agent chef” can not only provide the recipe but also select the most suitable ingredients for automatic ordering based on the command giver’s taste preferences and nutritional needs, and even monitor the cooking process to ensure the quality and taste of the food.
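The contrast in the chef example can be sketched in a few lines of code. This is only a minimal illustration under stated assumptions: `stub_model`, `ChefAgent`, `order_tool`, and the recipe data are all hypothetical names invented here, and no real large model or ordering service is involved. The stub "model" can only return a recipe, while the agent uses that recipe as its plan, filters it against the user's preferences, and then acts by calling a tool.

```python
from dataclasses import dataclass, field

# Hypothetical data for illustration only.
RECIPES = {
    "mapo tofu": ["tofu", "ground pork", "chili bean paste", "scallion"],
}


def stub_model(command: str) -> dict:
    """Stand-in for a large model: it only produces a recipe, it takes no action."""
    dish = "mapo tofu"  # a real model would choose based on the command
    return {"dish": dish, "ingredients": list(RECIPES[dish])}


@dataclass
class ChefAgent:
    avoid: set = field(default_factory=set)     # user's dietary preferences
    orders: list = field(default_factory=list)  # record of actions taken

    def order_tool(self, item: str) -> None:
        """Hypothetical 'hands and feet': a tool the agent calls to act."""
        self.orders.append(item)

    def run(self, command: str) -> dict:
        recipe = stub_model(command)  # the "brain" proposes a plan
        # Decide: keep only ingredients that match the user's preferences.
        chosen = [i for i in recipe["ingredients"] if i not in self.avoid]
        # Act: place an order for each chosen ingredient.
        for item in chosen:
            self.order_tool(item)
        return {"dish": recipe["dish"], "ordered": chosen}


model_only = stub_model("help me cook a dish")
agent = ChefAgent(avoid={"ground pork"})
result = agent.run("help me cook a dish")
print(model_only["ingredients"])
print(result["ordered"])
```

In this sketch the model's output is identical for every user, while the agent's behavior depends on the stored preferences and leaves a record of the actions it took.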

“Traditional human-machine dialogue is often limited by fixed patterns and preset rules, making it difficult to achieve truly natural communication,” pointed out Liang Zhixiang, Senior Vice President of Baidu Group. With the four capabilities of understanding, generation, logic, and memory based on large models, AI agents can now simulate a dialogue style that is much closer to real human conversation, making “human-machine interaction” as smooth and natural as “human-to-human dialogue.”

In fact, thanks to the versatility and scalability of large models, the threshold for using AI agents has dropped sharply. Large enterprises, small and medium-sized enterprises, and even individual developers can quickly build their own AI agent applications without new hardware or large amounts of additional training data.

Recently, Baidu’s “Wen Xiaoyan” large model app launched a new feature called “create an agent with one sentence.” Everyone can create their own AI agent based on their needs, with personality, tone, and identity settings depending on the user’s personalized choices. Creators can have video conversations with their “exclusive agents,” practice English speaking, and even simulate job interviews. According to relevant statistics, the Baidu Wenxin AI agent platform has attracted 100,000 enterprises and 600,000 developers, covering hundreds of application scenarios.

“In the future, if users can more easily create and use their own AI agents, this will truly unleash the value of AI agents,” Liang Zhixiang said. “Next, we will accurately and efficiently distribute AI agents to more users, allowing everyone to become a ‘developer’ of AI agents.”

Expanding Application Scenarios

Currently, a series of AI agent technologies are flourishing, and application scenarios are continuously expanding.

“A year and a half ago, BAIC ARCFOX began researching AI agents, mainly applied in enhancing R&D efficiency, standard language compilation, and user services,” said Feng Shuo, Director of the Intelligent Connected Center at BAIC Research Institute. The AI-enabled cockpit has moved away from the old model of mechanical, fixed command “human-machine Q&A” to achieve flexible and customized “intelligent interaction.” For example, the AI agent will arrange a schedule based on the work habits of the driver and passengers, capturing their preferences and emotions to recommend music, movies, etc.

When detecting that the driver is overly fatigued, the AI agent will quickly generate a service plan that includes reserving parking spaces, adjusting the in-car environment, and setting rest durations, providing users with a safer intelligent driving experience. “In the future, AI agents are also expected to include functions like ‘one-sentence food ordering’ to make it more convenient for drivers and passengers,” Feng Shuo said.

At the same time, AI agent technology is rapidly developing and gradually being implemented in various small terminal devices.

“Doubao Doubao, who is this Arhat in the temple?” “This is Mahakasyapa, one of Shakyamuni’s ten disciples…” recalled Xiao Fan, a self-media operator, who often had such Q&A exchanges with the Ola Friend earphones during his visit to Guoqing Temple in Taizhou, Zhejiang during the National Day holiday.

Ola Friend is reportedly the first AI agent earphone released by ByteDance’s Doubao large model. In addition to regular audio playback, it can provide instant assistance to users in information inquiries and travel scenarios.

The person in charge of the Doubao large model stated that Ola Friend can turn into a “personal tour guide” for users at any time, and users can also ask follow-up questions based on their interests. For example, while visiting an art exhibition, users can ask Ola Friend to introduce a specific exhibit and further inquire about the creator’s artistic style and other representative works, gaining more knowledge through this Q&A interaction.

This year, more and more smartphone manufacturers have moved into the AI agent space. Vivo recently released a smartphone AI agent named PhoneGPT, which can accurately operate smartphone applications based on user intent to complete tasks such as making calls, sending texts, and reserving restaurants, greatly enhancing the user experience. Huawei upgraded its smart assistant Xiao Yi to a system-level AI agent, improving not only its Q&A capabilities but also its perception and reasoning abilities. OPPO launched its “1+N” AI agent ecosystem strategy, consisting of an AI super agent and an AI Pro development platform, aimed at providing personalized services that better match user preferences.

In commercial service scenarios, AI agents are deeply interacting with consumers.

Baidu’s e-commerce digital human live streaming platform “Huibo Star” can generate a sales AI agent in just five minutes, which can not only be online 24 hours a day but also achieve complete automation in the live streaming room. Digital human hosts and digital human assistants perform their respective roles, promptly answering consumer questions and demonstrating and explaining products in a smooth and natural manner. For questions that cannot be answered verbally in time, there is also an AI assistant for text replies.

“Thanks to digital human live streaming AI agent technology, the e-commerce live streaming industry has effectively alleviated issues such as high costs, time constraints, and unstable quality,” Liang Zhixiang said. So far, he added, “Huibo Star” has helped tens of thousands of businesses grow revenue, with an average 62% increase in gross merchandise value.

Currently, AI agents are also being applied in various other scenarios, such as programming, content creation, and industrial manufacturing, demonstrating strong application potential and market value.

Bringing More Possibilities to Future Life

Many industry insiders believe that AI agents will be the future trend.

Tencent’s report on “2024 Digital Technology Frontier Application Trends” suggests that large models will move towards multimodality, and AI agents are expected to become the next-generation platform. The international management consulting firm Accenture stated in its “Technology Outlook 2024” report that 96% of corporate executives believe AI agents will bring significant development opportunities to their companies in the next three years.

Industry insiders indicate that in the foreseeable future, AI agents will help multiple industries build a new normal of intelligent operations centered on “human + AI digital employees.” For instance, in the medical field, AI agents can assist doctors in diagnosis, treatment, and health management; in the transportation field, AI agents can provide a scientific basis for traffic management and planning by analyzing data and real-time traffic conditions; in the education field, AI agents can provide intelligent tutoring and adaptive learning systems to help students better grasp knowledge.

Experts point out that as machine learning and deep learning technologies continue to advance, the characteristics and learning capabilities of AI agents will become increasingly powerful, better adapting to the complex and changing real world, bringing more possibilities for social development.

Although AI agent technology has brought more possibilities for future life, it is still in its infancy: existing AI agents can only perform relatively simple and fixed tasks, and their functions are highly homogeneous.

Some argue that one bottleneck in AI agent development is that current large models lack sufficient reasoning capabilities, making it impossible to truly solve complex problems without human intervention. Large model technology also has inherent, unpredictable defects stemming from algorithms and other factors, which can pose a series of security risks for AI agents.

In addition to technical risks, AI agents also face ethical and privacy issues. Industry insiders state that AI agents collect a large amount of data while providing services, which may lead to the leakage of personal privacy information, such as AI agents potentially inferring certain private preferences based on users’ shopping habits. This kind of “spying” behavior is undoubtedly an invasion of user privacy.

Experts believe that AI agents should be promptly classified and managed based on their functional purposes and usage time. In particular, the development, production, and application deployment of high-risk agents should be continuously supervised, and relevant laws and regulations should be formulated in a timely manner to improve existing internet standards, thereby better preventing the various risks posed by AI agents.

Source: Internet Information Tianjin
