In 2025, the era of AI Agents is accelerating. While OpenAI’s Operator has yet to officially debut, China’s Zhipu AI has quietly launched version 1.1 of GLM-PC, billed as the world’s first publicly available computer-use agent that works out of the box. The release showcases China’s strength in AI and shows AI Agents moving from science fiction into reality, where they are starting to change how we live and work.
Source of the video: https://agent.aminer.cn
Breakthrough of GLM-PC: From “Tool” to “Colleague”
The biggest highlight of GLM-PC is its capacity for “deep thinking.” Unlike traditional AI tools, GLM-PC can decompose a complex task and reason through it step by step, much as a person would. For example, when you ask it to send New Year greetings to your WeChat friends, it first generates a detailed chain of thought, breaks the task into steps, and then carries out the sending. The same capability extends from simple tasks to complex multi-step operations such as cross-platform search, data organization, and file saving.
Even more striking, GLM-PC introduces a coding mechanism: it executes tasks by generating functions. This code-like chain of thought improves the accuracy and efficiency of tasks and makes the agent more rigorous and reliable on complex ones. This combination of “left brain” and “right brain” lets GLM-PC both perceive graphical interfaces and carry out logical reasoning and task planning, truly achieving “hands-on and brain-powered” operation.
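To make the idea of a code-like chain of thought concrete, here is a minimal sketch of how a greeting task could be decomposed into explicit function-style steps and then executed. It is an illustration only, not GLM-PC’s actual implementation: the `Step` class, `plan_greetings`, `run_plan`, the handler names, and the contact names are all hypothetical, and the handlers are stubs that describe an action instead of controlling WeChat.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One step of a plan: an action name plus keyword arguments."""
    action: str
    args: Dict[str, str] = field(default_factory=dict)


def plan_greetings(friends: List[str]) -> List[Step]:
    """Decompose 'send New Year greetings' into explicit, ordered steps."""
    steps = [Step("open_app", {"name": "WeChat"})]
    for friend in friends:
        steps.append(Step("open_chat", {"contact": friend}))
        steps.append(Step("send_message", {"text": f"Happy New Year, {friend}!"}))
    return steps


def run_plan(steps: List[Step], handlers: Dict[str, Callable[..., str]]) -> List[str]:
    """Execute each step by dispatching to the handler named by its action."""
    return [handlers[step.action](**step.args) for step in steps]


# Stub handlers (hypothetical): they only describe what a real agent would do.
HANDLERS = {
    "open_app": lambda name: f"opened {name}",
    "open_chat": lambda contact: f"opened chat with {contact}",
    "send_message": lambda text: f"sent: {text}",
}

if __name__ == "__main__":
    for line in run_plan(plan_greetings(["Li", "Wang"]), HANDLERS):
        print(line)
```

Writing the plan as data before running it is what makes each step inspectable: a step that names an unknown action fails immediately with a `KeyError` instead of silently doing the wrong thing.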
Application Scenarios of AI Agents: From Life to Work
GLM-PC’s application scenarios are extensive, covering almost all aspects of daily life. For example, it can automatically send WeChat greetings, create New Year images and videos, and even search for Spring Festival customs on Xiaohongshu and organize them into an article. These tasks used to require many manual steps; now a single sentence is enough for the AI to complete them.
In work scenarios, GLM-PC performs equally well. It can automatically handle documents, search the web, summarize information, and even schedule meetings and send meeting summaries. This automation raises work efficiency and reduces human error, delivering a “driverless” office experience.
Innovations Behind the Technology: Multimodal Perception and Code Thinking
The success of GLM-PC is inseparable from Zhipu AI’s accumulated work in multimodal perception and code thinking. Through a vision-language model (VLM), GLM-PC perceives and understands the elements and layout of a graphical interface and simulates human clicks and keyboard input. This multimodal perception lets the agent operate in complex GUI environments, greatly expanding the range of tasks it can handle.
At the same time, GLM-PC incorporates the code generation model CodeGeeX and executes tasks by generating code. This code thinking improves the accuracy and efficiency of tasks and makes the agent more rigorous and reliable on complex ones. Together, perception and code thinking form the “left brain” and “right brain” that let GLM-PC understand graphical interfaces while also reasoning and planning, truly achieving “hands-on and brain-powered” operation.
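The division of labor described above, perception first and code generation second, can be sketched in a few lines. This is a toy under stated assumptions, not how GLM-PC is built: `perceive` stands in for the VLM by matching labels in a hand-written list of elements, `plan_click` stands in for the code model by emitting a code-like action as text, and the `Element` schema, the coordinates, and the `click` and `report_missing` action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """A UI element as a perception model might report it (hypothetical schema)."""
    label: str
    x: int
    y: int


def perceive(screen: List[Element], label: str) -> Optional[Element]:
    """Stand-in for the VLM: locate an element by its visible label."""
    wanted = label.strip().lower()
    for element in screen:
        if element.label.lower() == wanted:
            return element
    return None


def plan_click(screen: List[Element], label: str) -> str:
    """Stand-in for the code model: emit one code-like action as text."""
    element = perceive(screen, label)
    if element is None:
        # Surface the failure explicitly instead of guessing a coordinate.
        return f"report_missing({label!r})"
    return f"click({element.x}, {element.y})"


if __name__ == "__main__":
    screen = [Element("Send", 640, 480), Element("Cancel", 720, 480)]
    print(plan_click(screen, "send"))  # click(640, 480)
    print(plan_click(screen, "OK"))    # report_missing('OK')
```

Keeping perception and planning as separate functions mirrors the two-model split in the text: the planner never touches pixels, and the perceiver never decides what to do.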
GLM-PC vs. Operator: Strengths and Weaknesses
Combination of Multimodal Perception and Code Thinking:
GLM-PC perceives and understands graphical interfaces through a vision-language model (VLM) and performs logical reasoning and task planning through the code generation model CodeGeeX. This combination of “left brain” and “right brain” makes GLM-PC more rigorous and efficient on complex tasks. In contrast, Operator relies mainly on visual perception and simple task execution and lacks GLM-PC’s deeper logical reasoning.
Cross-Platform Operation Capability:
GLM-PC can operate not only in browsers but also directly control local applications on the computer, such as WeChat, Word, and Excel. This ability to work across the browser and the desktop makes GLM-PC’s application scenarios even broader. Operator, meanwhile, is currently limited to web-based operations and cannot directly control local applications.
Free and Open:
GLM-PC is now open to the public, allowing users to download and use it for free. In contrast, Operator is expected to be tied to ChatGPT, possibly requiring payment for use. This open strategy enables GLM-PC to accumulate user feedback more quickly and accelerate technological iterations.
Operating Speed:
Although GLM-PC’s operating speed is close to that of humans, it still appears slightly slower in certain scenarios. For instance, when handling complex tasks, GLM-PC may take a few seconds to generate thought chains and execute tasks. In demonstrations, Operator has shown a faster response speed, capable of completing complex tasks in a short time.
Task Success Rate:
GLM-PC may encounter errors or get stuck in loops when handling certain complex tasks. For example, in cross-platform price comparison tasks, GLM-PC may fail due to homophones or interface changes. In demonstrations, Operator has shown a higher task success rate, able to complete complex tasks more stably.
User Experience:
GLM-PC’s user interface and interaction design are relatively simple, requiring users to input very precise commands to achieve satisfactory results. In demonstrations, Operator has shown a friendlier user interface and interaction design, allowing users to complete tasks easily through natural language commands.
Challenges and Future of AI Agents
Despite GLM-PC’s impressive capabilities, it still faces some challenges. For example, when dealing with homophones and complex tasks, AI may make errors or get stuck in loops. Additionally, while AI’s operating speed is close to that of humans, it still appears slightly slower in certain scenarios. These issues need to be resolved through continuous technological iterations and optimizations.
However, the emergence of GLM-PC points the way for the future of AI Agents. As the technology advances, AI Agents will no longer be limited to simple task execution but will gain stronger autonomous learning abilities and creativity. In the future, they may become our “super assistants,” helping us complete more complex tasks and even surpassing human capabilities in certain areas.
Case Support: Practical Applications of GLM-PC
To better understand the capabilities of GLM-PC, here are some specific application cases:
- Automatic Sending of WeChat Greetings: Users can tell GLM-PC, “Send New Year greetings to all members of the ‘Night Owl Test’ group on WeChat.” GLM-PC will break the task down step by step, generate a detailed chain of thought, and then complete the sending automatically. Each group member receives a customized greeting with different content.
- Cross-Platform Search and Organization: Users can ask GLM-PC, “Search for Spring Festival customs on Xiaohongshu, get the top three images and text introductions, expand them into an article, and save it to a Word file on the desktop.” GLM-PC will automatically open Xiaohongshu, search for relevant content, organize the information, generate the article, and save it to the specified location.
- Automatic Meeting Scheduling: Users can tell GLM-PC, “Schedule a meeting for 3 PM tomorrow and send meeting invitations to relevant personnel.” GLM-PC will automatically open the calendar, schedule the meeting, and send invitation emails to the specified people.
- Cross-App Price Comparison: Users can ask GLM-PC, “Compare prices for the iPhone 15 on Taobao and JD, and save the results to an Excel file.” GLM-PC will automatically open both e-commerce platforms, search for the specified product, compare prices, and organize the results before saving them to an Excel file.
These cases demonstrate the powerful capabilities of GLM-PC in practical applications, enhancing work efficiency and providing great convenience to users.
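The last step of the price comparison case, organizing results and writing them to a spreadsheet, is ordinary data handling once the prices have been read off the screen. The sketch below assumes the agent has already collected one price per platform; the function names are hypothetical, the prices are placeholder values and not real listings, and CSV text (which Excel can open) stands in for a native Excel file.

```python
import csv
import io
from typing import Dict, List, Tuple


def rank_offers(offers: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (platform, price) pairs from cheapest to most expensive."""
    return sorted(offers.items(), key=lambda item: item[1])


def to_csv(product: str, ranked: List[Tuple[str, float]]) -> str:
    """Render the comparison as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product", "platform", "price"])
    for platform, price in ranked:
        writer.writerow([product, platform, price])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder prices for illustration only.
    ranked = rank_offers({"Taobao": 100.0, "JD": 95.0})
    print(to_csv("iPhone 15", ranked), end="")
```

To save the result, write the returned string to a file such as `comparison.csv`; the filename is likewise only an example.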
Official Product Website
If you are interested in GLM-PC, you can visit Zhipu AI’s official websites to learn more and download it to try:
- GLM-PC Official Website: https://cogagent.aminer.cn/home
- AutoGLM Official Website: https://agent.aminer.cn/
The Power of China in the Era of AI Agents
The launch of GLM-PC showcases China’s strength in AI and shows AI Agents moving from science fiction into reality. By combining multimodal perception with code thinking, Zhipu AI has created an AI Agent that can think and operate much like a human. This breakthrough changes how we interact with machines and opens up new possibilities for future AI applications.
2025 is destined to be the “year of AI Agents.” In this global AI competition, China has already taken the lead. The emergence of GLM-PC reveals the potential of AI Agents and fills us with anticipation for the future. As Zhipu AI has shown, the era of AI Agents has arrived, and China is leading this transformation.