In-Depth Discussion on Pangu Model 5.0 and Huawei AI Ecosystem

Huawei released the latest Pangu Model 5.0 during the HDC 2024 keynote. So what is actually different in this new version of the model?
The highlights of this year’s Huawei Developer Conference (HDC) go well beyond the HarmonyOS Next operating system. During the keynote, Yu Chengdong (Executive Director of Huawei, Chairman of the Terminal BG, and Chairman of the Intelligent Vehicle Solutions BU) presented a diagram of Huawei’s capability stack in which each element could be considered a highlight on its own; the diagram also helps clarify the layers of Huawei’s different capabilities and how they relate to one another.
This release covers not only the Pangu model itself but also Huawei’s overall capabilities in AI HPC, including Ascend servers and Huawei Cloud infrastructure. Although Huawei never mentioned Nvidia during the launch, it is evident that Huawei is positioning its AI HPC capabilities against Nvidia’s on many fronts.
For instance, the Pangu model supports autonomous driving and can generate synthetic data for training autonomous-driving AI, a capability also offered by Nvidia’s DRIVE Sim; it assists industrial and architectural design and can even output 3D-format files directly, reminiscent of Omniverse; it provides embodied-intelligence models for robotics, similar to Nvidia’s Isaac; it delivers kilometer-scale weather forecasting, resembling Earth-2; and it extends to drug discovery, railway maintenance, and more.
In AI, then, Huawei does appear determined to become an alternative to Nvidia; and compared with other existing AI ecosystems, it is also one of the fastest-moving market participants in China.
Pangu 5.0 Overview: Stronger Multimodality Is the Highlight
Judging from its development so far, the Pangu model is a foundation model oriented primarily toward industry applications. Huawei Cloud CEO Zhang Pingan stated that the Pangu model has already been adopted in more than 30 industries and 400 application scenarios in China, including the automotive, healthcare, transportation, climate, and heavy-industry cases discussed later in this article.
The newly released Pangu 5.0 emphasizes enhancements in three areas: full-series coverage, multimodality, and “stronger thinking.”
Full-series coverage refers to spanning everything from large data-center clusters down to edge devices. Pangu 5.0 therefore comes in an E series (Embedded), targeting phones and PCs, with parameters on the order of billions; a P series (Professional), with parameters in the tens of billions; a U series (Ultra), with parameters in the hundreds of billions, serving as a “general foundation for enterprise models that can handle complex tasks”; and an S series (Super), a trillion-parameter model “capable of handling complex tasks across domains.”
Multimodality refers to the ability to understand different types of data: not just text, images, and video, but also sensor data such as visible-light, infrared, radar, and point-cloud data. There is also multimodal generation capability: “Pangu 5.0 can generate multimodal content that conforms to physical laws,” such as the synthetic-data generation for autonomous driving discussed later.
The third aspect, the thinking upgrade, refers to complex reasoning capability. Zhang Pingan stated, “We have deeply integrated chain-of-thought and strategy search, greatly enhancing Pangu’s ability to plan complex tasks.”
One of the most striking examples of multimodal data understanding: a scan of “Along the River During the Qingming Festival” at roughly 10,000-pixel resolution is fed into the Pangu model with the question, “How many people are in the Zhao Taichang household?” In the image, the Zhao Taichang household occupies less than 1/200 of the entire painting, yet Pangu gives the correct answer by reading the text on the plaque and counting the figures present.
Besides images, Pangu 5.0 supports the multimodal inputs of visible light, infrared, radar, and point clouds mentioned earlier. For example, from satellite remote-sensing visible-light imagery it can analyze crop growth and harvest conditions in a given area; by interpreting infrared imagery for urban traffic management, it can identify vehicle and pedestrian trajectories; and by combining radar imagery with visible light, it can determine vegetation coverage, which is useful for ecological monitoring.
On the technology side, Noah’s Ark Lab director Yao Jun noted that for visual multimodality, different visual domains each have their own independent encoding schemes, “which can have redundancies and conflicts.” “We distilled these encoders into a single visual encoder to enhance the model’s representation capability and accuracy.”
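Huawei has not published implementation details of this distillation, but the general idea can be illustrated with a minimal sketch: several frozen, domain-specific encoders act as teachers, and a single student encoder learns to reproduce each teacher’s features through per-domain projection heads. All module names, dimensions, and losses below are assumptions for illustration, not Huawei’s actual design.

```python
# Illustrative sketch only: distilling several domain-specific visual encoders
# (teachers) into one unified encoder (student). Everything here is assumed.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a pretrained domain encoder (visible light, infrared, radar...)."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, out_dim),
        )
    def forward(self, x):
        return self.net(x)

teachers = {name: TinyEncoder().eval() for name in ["visible", "infrared", "radar"]}
student = TinyEncoder(out_dim=256)                       # the single unified encoder
# One projection head per teacher maps student features into that teacher's space.
heads = nn.ModuleDict({name: nn.Linear(256, 256) for name in teachers})
opt = torch.optim.AdamW(list(student.parameters()) + list(heads.parameters()), lr=1e-4)

def distill_step(batches):
    """batches: dict mapping domain name -> a tensor of images from that domain."""
    loss = 0.0
    for name, images in batches.items():
        with torch.no_grad():
            target = teachers[name](images)              # frozen teacher features
        pred = heads[name](student(images))              # student features, projected
        loss = loss + nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)

# Toy usage with random tensors standing in for real multi-domain images.
print(distill_step({n: torch.randn(4, 3, 64, 64) for n in teachers}))
```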
He also described the “dynamic resolution” of visual inputs and proposed a scale-generalization training paradigm: low-resolution images and simple tasks first train basic perception; medium- and high-resolution images with fine-grained perception tasks, including image understanding, then build fine-grained perception; and the model is subsequently extended to higher resolutions and more task types, ultimately “focusing on breaking through the model’s high-order reasoning capabilities.” This amounts to a step-by-step data curriculum, “learning multimodal information from easy to difficult.” The stated result is “exceeding the capabilities of equivalent models in the industry,” that is, stronger multimodal data processing.
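Huawei did not detail the training schedule, but the shape of such a curriculum is easy to picture. The sketch below is a hypothetical training driver in which the stage names, resolutions, task sets, and step budgets are all invented for illustration.

```python
# Hypothetical sketch of a "from easy to hard" curriculum over image resolution
# and task difficulty; the stages and numbers are illustrative only.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    resolution: int       # input image side length in pixels
    tasks: tuple          # task types enabled at this stage
    steps: int            # training steps budgeted for the stage

CURRICULUM = [
    Stage("basic perception", 224, ("classification",), 10_000),
    Stage("fine-grained perception", 448, ("classification", "captioning", "grounding"), 20_000),
    Stage("high-order reasoning", 1024, ("captioning", "grounding", "visual_reasoning"), 30_000),
]

def train(model, data_source, train_step):
    """Run the stages in order, so the model sees easy data before hard data.
    `data_source` and `train_step` are caller-supplied stand-ins."""
    for stage in CURRICULUM:
        loader = data_source(resolution=stage.resolution, tasks=stage.tasks)
        for _, batch in zip(range(stage.steps), loader):
            train_step(model, batch)
        print(f"finished stage: {stage.name}")
```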
As for the “thinking upgrade,” the focus is on multi-step reasoning and the handling of complex tasks. Developers who have tried writing algorithms with the help of large models will be well aware that large models are relatively poor at complex logical problems. “Models should also think slowly about problems, just like humans do.”
That is, decompose and deduce step by step. “We propose the MindStar method based on multi-step generation and strategy search,” which “first decomposes a complex reasoning task into multiple sub-problems, generates multiple candidate solutions for them, and then selects the optimal multi-step answer path through search and a process-feedback reward model.”
Yao Jun stated that with MindStar, a hundred-billion-parameter model can reach the reasoning capability of the industry’s trillion-parameter models. “It’s like using slow thinking to bring a more-than-tenfold parameter gain.” “Applying the MindStar strong-thinking method to larger models can gradually approach and surpass human reasoning on complex tasks.”
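Huawei has not released MindStar’s implementation; the sketch below is only a minimal interpretation of the idea described above (decompose, generate candidate steps, score them with a process reward model, and search for the best path). `propose_steps` and `process_reward` are hypothetical stand-ins for the language model and the reward model.

```python
# Minimal reward-guided multi-step reasoning search, in the spirit of what is
# described above. Steps are assumed to be plain strings, and a finished path
# is assumed to end with a step beginning "ANSWER:"; both are illustrative.
import heapq

def solve(question, propose_steps, process_reward, beam_width=4, max_depth=8):
    """Beam search over partial reasoning paths, keeping the best-scored ones."""
    beam = [(0.0, [])]                            # (cumulative reward, list of steps)
    for _ in range(max_depth):
        candidates = []
        for score, path in beam:
            for step in propose_steps(question, path):          # candidate sub-solutions
                reward = process_reward(question, path + [step]) # process feedback
                candidates.append((score + reward, path + [step]))
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        # Stop early if the best path already reaches a final answer.
        if beam[0][1] and beam[0][1][-1].startswith("ANSWER:"):
            break
    return max(beam, key=lambda c: c[0])[1]       # best multi-step answer path
```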
Several Industry Scenarios: High-Speed Rail, Automotive, Robotics
As mentioned at the beginning of the article, Huawei listed several industry applications of Pangu 5.0, mainly related to passenger transport, automotive, industrial and architectural design, robotics, and heavy industry. Here are a few examples that left a deep impression on us.
First is the collaboration with the Beijing Railway Institute, where the “Pangu Eye” (a robot fitted with a large array of sensors) works together with the Pangu model on high-speed rail inspection. A single train has more than 32,000 fault-detection points, and daily inspections used to require hundreds of workers. The new version of the Pangu model adds more dimensions of data understanding, solving many problems that traditional machine vision cannot address, such as using 3D information to judge whether a nut is loose, or using spectral analysis to identify oil or water leaks.
According to the Beijing Railway Institute, the Pangu high-speed rail model is trained and fine-tuned on massive amounts of rail-industry data; it fuses 2D images, 3D point clouds, and laser spectra in a multimodal diagnostic technology; and it generates rare fault samples for high-speed rail scenarios through data feedback, pushing fault-recognition accuracy beyond 99% by “learning while using.”
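The fusion step can be pictured roughly as follows: features from the three modalities are extracted separately and concatenated before a shared classification head. This is only an illustrative late-fusion sketch; the encoders, dimensions, and number of fault classes are assumptions, not the actual diagnostic system.

```python
# Illustrative late-fusion fault classifier combining image, point-cloud, and
# spectrum inputs; the architecture and sizes are assumed for demonstration.
import torch
import torch.nn as nn

class FusionFaultClassifier(nn.Module):
    def __init__(self, num_fault_types=32):
        super().__init__()
        self.image_enc = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                       nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                       nn.Linear(16, 128))
        self.cloud_enc = nn.Sequential(nn.Linear(3, 64), nn.ReLU())      # per-point MLP
        self.spec_enc = nn.Sequential(nn.Linear(512, 128), nn.ReLU())    # 512 spectral bins
        self.head = nn.Sequential(nn.Linear(128 + 64 + 128, 256), nn.ReLU(),
                                  nn.Linear(256, num_fault_types))

    def forward(self, image, points, spectrum):
        img_f = self.image_enc(image)                        # (B, 128)
        cld_f = self.cloud_enc(points).max(dim=1).values     # max-pool over points -> (B, 64)
        spc_f = self.spec_enc(spectrum)                      # (B, 128)
        return self.head(torch.cat([img_f, cld_f, spc_f], dim=-1))

# Toy usage: one batch of 2 samples with random stand-in data.
model = FusionFaultClassifier()
logits = model(torch.randn(2, 3, 128, 128), torch.randn(2, 1024, 3), torch.randn(2, 512))
print(logits.shape)   # torch.Size([2, 32])
```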
It is said that such intelligent technology will later be extended from trains and high-speed rail to more rail-transit scenarios such as urban rail and subways.
Another example concerns autonomous driving and mainly reflects the multimodal generation capabilities of Pangu 5.0. When AI is applied to autonomous driving, cars can learn and be tested in a virtual digital world: the autonomous driving system is shown all kinds of complex scenarios so that it can cope with different situations in real operation. Typically, doing autonomous driving well requires hundreds of thousands to millions of kilometers of high-quality driving video data.
However, high-quality natural data is often insufficient, which necessitates the generation of so-called “synthetic data.” Zhang Pingan highlighted the capabilities of Pangu 5.0 in this area, “By understanding physical laws through controllable spatiotemporal generation technology, we can generate large-scale driving video data consistent with actual scenarios.” For example, six cameras positioned at different locations on the vehicle generate synthetic data from various perspectives within the same scene, adhering to spatial and temporal correspondences.
The generation process can alter various control conditions, such as adding vehicles approaching from different directions, or having the model generate driving videos under different weather conditions and times of day. In the rainy-day driving video Huawei showed, the vehicles’ rear lights are also switched on: “This indicates that the model has learned from massive video data that the lights should be turned on when driving in the rain.” These are exactly the capabilities that autonomous driving plus AI requires.
Furthermore, applying the Pangu model to industrial design can not only generate 3D car models adapted to different environmental lighting but also output 3D files; in a collaboration with South China University of Technology on architectural design, it can generate architectural overview videos from design sketches. And the AI-native dubbing capability can translate original film footage into different language versions while preserving the original character’s tone, emotion, and intonation; in video conferences, combining digital humans with AI dubbing even lets participants’ digital avatars speak different languages.
Readers interested in AI technology will be familiar with these cases, and we will not elaborate further.
Finally, robotics: the “Pangu embodied intelligent model.” We have noted before that robotics is key to AI advancing to a higher stage, moving from the digital world into the real world. The Pangu embodied intelligent model “can enable robots to complete complex task planning of more than 10 steps and achieve generalization across multiple scenarios and multi-task processing”; at the same time, like Nvidia’s Isaac Sim, it lets robots train in virtual worlds, “generating the multimodal data robots need to train and learn all kinds of complex scenarios.”
At the release event, a robot named “Kuafu” was demonstrated; it performs logical reasoning and task planning based on multimodal perception of the world. For instance, it can understand what is on a table, pick up the corresponding item, and interact with people through language and body language. “In addition to humanoid robots, the Pangu embodied intelligent model can also empower industrial and service robots of various forms.”
As AI moves into the physical, real world, AI empowering industries including retail, logistics, healthcare, home services, and industrial production is becoming a business worth trillions.
Models Alone Are Not Enough: Strengthening the Computing Infrastructure
Judging from Huawei’s presentation, the Pangu model’s industry adoption looks promising. Besides the collaboration with the Beijing Railway Institute mentioned above, the model is already running on a hot-rolling line at Baosteel, used for process optimization and said to have increased annual revenue, with discussions under way on cost reduction and efficiency gains in blast-furnace scenarios;
Working with the Shenzhen Meteorological Bureau, Huawei built the AI regional forecasting model “Zhijie 1.0” on top of the Pangu model, providing kilometer-scale weather forecasts; and it partnered with Tianshili so that the Pangu drug-molecule model could learn from 3.5 million natural-product molecules to discover effective components of traditional Chinese medicine and optimize prescriptions.
However, building models alone is far from enough to form a robust ecosystem. Beneath the models, Huawei emphasizes the Ascend computing foundation and Ascend AI cloud services, the base for “accelerating model development and enabling diverse models.”
Huawei is currently building three large data centers, in Gui’an, Ulanqab, and Wuhu, “building cloud services that meet national AI computing needs.” Users “can obtain hundreds, thousands, or even more AI compute resources,” with “significant advantages in the long-term stability and fault recovery of large-model training.” “It has currently been adapted to over 100 mainstream models in the industry,” and Zhang Pingan said that more than 600 enterprises are already cooperating.
In addition, Huawei Cloud CTO Zhang Yuxin said that Huawei Cloud is also deploying AI compute resource pools in hotspot regions such as North China, East China, and South China; “by utilizing the Huawei Cloud CloudLake and CloudPond edge cloud platforms, AI computing is pushed closer to customers,” creating “a cloud-network-edge-device collaborative AI-native computing platform.” AI-native here means the cloud architecture genuinely serves AI, making it more efficient.
To make the cloud AI-native, Huawei Cloud developed CloudMatrix: through high-speed interconnect protocols, it pools all resources, including CPUs, NPUs, and memory, into what it calls “matrix computing,” a “distributed, fully peer-to-peer architecture” rather than the traditional master-slave architecture centered on CPUs. Zhang Yuxin emphasized that Huawei Cloud is currently the only cloud vendor adopting “peer-architecture supernode technology.”
Details of the solution were not given, but Zhang Yuxin stated that “Huawei Cloud supernodes improve on the industry’s single-node computing by a factor of 50; end-to-end recovery time for large models is under 10 minutes; linearity of ten-thousand-card clusters exceeds 95%,” ultimately achieving “a 68% improvement in model training efficiency under the same compute compared with traditional cluster architectures.” These figures look quite impressive.
One of the key technologies behind these metrics is EMS (Elastic Memory Service), which alleviates the memory-wall problem.
Specifically, EMS adds a tier of elastic memory between NPU memory (in this article, “memory” refers to the storage resources of the AI accelerator cards) and persistent storage. “Based on patented memory-pooling technology, it breaks the memory wall through memory expansion, computing offloading, and using storage instead of computing.”
“Memory expansion” refers to storing model parameters hierarchically across NPU memory and EMS, “using less than half the NPU cards to hold a trillion-parameter large model, saving over 50% in computing resources”;
“Computing offloading” refers to offloading KV computation (that is, the key-value cache calculations used in attention) to EMS and the CPU, while model computation still runs in NPU memory and on the NPU, raising single-card concurrency: “AI inference performance improves by 100%.” “Using storage instead of computing” refers to saving historical KV results in EMS for later inference calls, “reducing first-token latency to under 0.2 seconds.”
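EMS itself is a Huawei Cloud service whose interfaces are not public, so the sketch below only illustrates the generic idea behind “computing offloading” and “using storage instead of computing”: move attention KV caches from accelerator memory to a larger host-side pool, and reuse saved results on later calls instead of recomputing them. The class and method names are invented for illustration.

```python
# Conceptual sketch: a host-memory pool standing in for an elastic memory
# service that holds KV caches offloaded from the accelerator.
import torch

class HostKVPool:
    """Host-side (CPU) store for attention KV caches; purely illustrative."""
    def __init__(self):
        self._store = {}

    def save(self, request_id, kv_cache):
        # Move each (key, value) tensor pair off the accelerator to CPU memory.
        self._store[request_id] = [(k.to("cpu"), v.to("cpu")) for k, v in kv_cache]

    def load(self, request_id, device):
        # Reuse previously computed KV instead of recomputing it ("storage
        # instead of computing"), shortening time-to-first-token on reuse.
        cached = self._store.get(request_id)
        if cached is None:
            return None
        return [(k.to(device), v.to(device)) for k, v in cached]

pool = HostKVPool()
device = "cuda" if torch.cuda.is_available() else "cpu"
# Pretend this is the KV cache produced while prefilling a long prompt:
# 4 layers, each with a (batch, heads, seq, head_dim) key and value tensor.
kv = [(torch.randn(1, 8, 128, 64, device=device),
       torch.randn(1, 8, 128, 64, device=device)) for _ in range(4)]
pool.save("session-42", kv)
restored = pool.load("session-42", device)   # a later turn of the same session
print(len(restored), restored[0][0].shape)
```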
Three Technical Points of Pangu 5.0
Pangu Model 5.0 is developed on this Huawei Cloud AI platform and therefore benefits from several distinctive key technologies of its own. Beyond the capabilities already described, namely more modalities and stronger thinking, Yao Jun introduced three technical points of Pangu 5.0 concerning data, parameters, and computing power.
The first is “evolving toward the scientific use of data.” “Synthetic data will occupy a place in larger-scale model training, filling the gap left by insufficient growth in high-quality natural data”: training data has grown from 3T tokens in Pangu 3.0 to 10T tokens in Pangu 5.0, “of which the proportion of synthetic data exceeds 30%.”
“We are exploring data-synthesis methods aimed at higher-order capabilities,” specifically “the weak2strong method, which uses weak models to assist strong models and iteratively synthesizes high-quality data.” The result is synthetic data whose quality is “slightly better than real data” across various dimensions. Moreover, the weak2strong technique can augment specific kinds of data, “for instance long sequences and complex knowledge reasoning, which are relatively scarce in natural data,” thereby strengthening specific model capabilities.
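Huawei has not described the weak2strong procedure in detail; the following is a hedged sketch of what such an iterative synthesis loop might look like, with `weak_generate`, `strong_refine`, and `judge` as hypothetical stand-ins for a weaker drafting model, a stronger refining model, and a quality scorer.

```python
# Hedged sketch of a weak2strong-style synthesis loop: a weaker model drafts
# candidate samples, a stronger model refines them, a judge scores them, and
# only high-quality results are kept. All callables are hypothetical.
def synthesize(seed_prompts, weak_generate, strong_refine, judge,
               threshold=0.8, rounds=3):
    dataset = []
    prompts = list(seed_prompts)
    for _ in range(rounds):
        next_prompts = []
        for prompt in prompts:
            draft = weak_generate(prompt)             # cheap first pass
            refined = strong_refine(prompt, draft)    # stronger model improves it
            score = judge(prompt, refined)            # quality score in [0, 1]
            if score >= threshold:
                dataset.append({"prompt": prompt, "response": refined, "score": score})
                # Accepted samples can seed harder variants in the next round,
                # e.g. longer sequences or multi-hop reasoning chains.
                next_prompts.append(prompt + " (make it harder)")
        prompts = next_prompts or prompts
    return dataset
```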
Furthermore, the “data curriculum learning” process adopts a “ladder-style data curriculum”: the large model first learns basic material, and the complexity of the data is then gradually increased, “learning knowledge from easy to difficult” and “achieving more controllable and predictable capability improvements.”
The second point is the “Ascend-affinity” Transformer: the π architecture, an optimization presumably tied closely to the Ascend chips. Yao Jun said that the original Transformer has “certain feature-collapse issues”: “through theoretical analysis, it was found that the self-attention module in the Transformer exacerbates the disappearance of features in the data.” The common industry remedy is to add a residual connection to mitigate feature collapse.
The π architecture “introduces nonlinear augmented residual connections,” “further increasing the features of different tokens, allowing the diversity of data features to be maintained in deep Transformers.”
At the same time, it improves the processing efficiency of the FFN (feed-forward network) module in the Transformer. “In the π architecture, we changed the activation function in the FFN to a series activation function, increasing the model’s nonlinearity. Although this increases the FFN’s computational load, it lets us shrink the self-attention module while maintaining accuracy, achieving Ascend affinity. On Ascend chips, inference speed is thus improved by 20-25%.”
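The sketch below is a simplified, hedged interpretation of the two ideas just described: a nonlinear augmented branch added alongside the attention residual, and a “series” activation in the FFN. It is not the published PanGu-π implementation, and the dimensions and module shapes are illustrative.

```python
# Simplified interpretation only; all shapes and module choices are assumed.
import torch
import torch.nn as nn

class SeriesActivation(nn.Module):
    """Sum of several small activation branches, adding nonlinearity cheaply."""
    def __init__(self, dim, branches=3):
        super().__init__()
        self.scales = nn.Parameter(torch.ones(branches, dim))
    def forward(self, x):
        return sum(torch.relu(x * s) for s in self.scales)

class PiBlock(nn.Module):
    def __init__(self, dim=512, heads=8, ffn_mult=3):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Augmented residual: a learned nonlinear branch added on top of the
        # plain skip connection, to keep token features from collapsing.
        self.aug = nn.Sequential(nn.Linear(dim, dim), nn.GELU())
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, dim * ffn_mult),
                                 SeriesActivation(dim * ffn_mult),
                                 nn.Linear(dim * ffn_mult, dim))
    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out + self.aug(h)      # skip + attention + nonlinear augment
        return x + self.ffn(self.norm2(x))

# Toy usage: batch of 2 sequences, 16 tokens, 512-dim features.
block = PiBlock()
print(block(torch.randn(2, 16, 512)).shape)   # torch.Size([2, 16, 512])
```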
Research results and content related to the π architecture have already been published at NeurIPS 2023; interested readers may want to check it out.
The third point is system-level optimization for ultra-large-scale cluster training. Yao Jun mentioned that the bubble (waiting time) in thousand-card cluster training is generally around 10%, while in larger clusters it can reach 30%. Multi-card communication can also run into “a large number of routing conflicts,” leaving the cluster’s overall linearity at only about 80%.
Huawei Cloud’s solution is “to break large computations and communications into small replicas; the system then automatically arranges the execution order of computation and communication across the replicas,” making the small communications easier to hide inside computation. There are also “key technologies such as NP-hard scheduling optimization and the interleaving of forward and backward pipelines.”
“We also optimized the scheduling and communication of large clusters. By arranging large flows under the same node or within the same cabinet-level routing, we reduce cross-router communication”; and “dynamically rearranging the original ports achieves completely conflict-free cluster communication.”
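Huawei’s scheduler is not public, but the basic pattern of hiding communication behind computation is well known. The sketch below shows one textbook form of it: launching an asynchronous all-reduce for each gradient as soon as the backward pass produces it, so communication for later layers overlaps with computation still running for earlier layers. It assumes a recent PyTorch (for the post-accumulate-grad hook) and an already-initialized process group; it is an illustration of the general idea, not Huawei’s implementation.

```python
# Generic communication/computation overlap pattern; not Huawei's scheduler.
# Requires torch.distributed to be initialized (e.g. via init_process_group).
import torch
import torch.distributed as dist

_handles = []

def attach_overlapped_allreduce(model):
    """Register a hook per parameter that launches an async all-reduce as soon
    as that parameter's gradient has been accumulated during backward."""
    for p in model.parameters():
        p.register_post_accumulate_grad_hook(
            lambda param: _handles.append(
                (param, dist.all_reduce(param.grad, async_op=True))
            )
        )

def finish_gradient_sync(world_size):
    """Wait only on the communications backward could not hide, then average."""
    for param, work in _handles:
        work.wait()
        param.grad.div_(world_size)
    _handles.clear()
```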
These technologies are likely a concrete expression of Huawei’s long accumulation in communications, and they also reflect a basic principle of contemporary HPC and AI computing: performance and efficiency must be optimized from the system level. Reportedly, through these methods “over 70% of communications can be effectively hidden, reducing the bubble from 30% to 10%,” and MFU (Model FLOPs Utilization) also looks fairly good.
The Pangu Model Also Solves Huawei’s Own Problems
Part of the discussion of the AI-native Huawei Cloud also covered using AI to improve the efficiency of the cloud platform itself, including assistance in CodeArts, DataArts, MetaStudio, and even GaussDB.
For instance, CodeArts, Huawei’s software development pipeline, lets AI read, write, debug, and test code; with the support of Pangu 5.0, CodeArts’ capabilities have “upgraded from function-level to project-level.”
For GaussDB, “we combined product documentation, expert knowledge, operational experience, and other professional data with the large model to build the Pangu database model, achieving intelligent management of the full GaussDB lifecycle of development, testing, migration, and operations and maintenance.”
Developers can generate SQL scripts directly from natural language, including complex SQL such as multi-table join queries and stored procedures; it also supports converting SQL statements from other databases into GaussDB syntax. This part is not the focus of this article, so we will not elaborate further.
These, however, are essentially manifestations of AI for Cloud, and even AI for Huawei’s full technology stack; they largely represent Huawei’s own practice as a user of Pangu 5.0, and they also serve as an application example of Pangu 5.0 in the field of computer science.
Among domestic AI infrastructure providers, Huawei is probably one of the few market participants with a genuine chance to compete with international giants at the ecosystem level. This shows in two key points: the Pangu model’s deep integration into industries and its continuously improving capabilities, and the Ascend servers plus Huawei Cloud acting as the infrastructure driving Huawei’s AI ecosystem, optimizing AI training and inference performance and efficiency from the system level. Huawei’s AI framework is said to hold over 20% of the market, to have migrated and incubated more than 50 mainstream large models, and to have incubated over 200 AI scenario solutions.
At what we often call the inflection point of the fourth industrial revolution, the progress of the Pangu model and the surrounding layers of technology and infrastructure on the AI track is, in our view, every bit as important as the ecosystem development of HarmonyOS. In the next article we will discuss the other protagonist of this HDC, HarmonyOS Next, and especially the AI technology built into it: Harmony Intelligence.
