What Kind of Large Models Does the Steel Industry Need?

In the wake of ChatGPT, domestic large models such as ChatGLM from Zhipu AI, Wenxin Yiyan from Baidu, and Xinghuo from iFlytek have sprung up like mushrooms after rain, a phenomenon dubbed the “Hundred Model War.” Behind it lies not only a competition in technical strength among companies but also a contest over who can bring the technology into real application scenarios. What magic do large models hold that has so excited the market? What will happen when the steel industry meets large models? Recently, a reporter from China Metallurgical News interviewed Zhang Peng, CEO of Zhipu AI, about the current “large model craze.”

Zhipu AI was established in 2019 to commercialize research results from the Computer Science Department of Tsinghua University, and began developing the GLM pre-training architecture the following year, making it one of the earliest institutions in China to work on large models. “Faced with the ‘large model craze,’ if I had to sum up my thinking in one word, it would be ‘faith,’” Zhang Peng said. “‘Faith’ first means believing in this undertaking: large models are undoubtedly a necessary path toward AGI (Artificial General Intelligence) and can create greater value. ‘Faith’ is also a steady, prudent attitude, grounded in understanding and research rather than blind enthusiasm.”

From Theory to Practice, Why Are Large Models So “Hot”?

The origin of large models can be traced back to 2017, when the Transformer architecture appeared and set their evolution in motion. In the years that followed, BERT, GPT-1, and GPT-2 appeared one after another, and BERT significantly surpassed traditional algorithms on roughly a dozen natural language understanding tasks, yet none of this sparked much excitement in the industry until 2020. “That year was the first year of large models,” Zhang Peng stated.

The emergence of GPT-3 greatly improved content generation and logical reasoning, and the model showed astonishing abilities in in-context learning and in understanding knowledge, including common sense. This triggered a global surge in foundational model research. Institutions such as Meta, Microsoft, and Google, along with Tsinghua University, the Beijing Academy of Artificial Intelligence, Baidu, Huawei, Alibaba, and Zhipu AI in China, raced to release models on the order of a hundred billion parameters, including Gopher, Chinchilla, PaLM, and GLM-130B.

However, the complexity of the R&D and the high cost of training deterred many. At that time, not everyone could clearly see where the technology was heading, and large models required substantial funding, which made hasty investment risky. “At that time, we invited some professors from academia to discuss the future direction of the technology. Everyone believed this was a signal that large models had reached a critical point and that AI was truly entering a usable phase. However, we ran into many difficulties in securing computing power and in model engineering, and it took us a long time to finally decide to go all in on large models and start developing our own algorithm framework,” Zhang Peng said.

It wasn’t until the end of 2022, with the release of ChatGPT, that the “Hundred Model War” truly began. Unlike earlier machine learning technologies, ChatGPT was not dry technical theory; it could be tried out again and again in applications across many fields, letting people experience the “emergent intelligence” of large models firsthand. ChatGPT passed 100 million users worldwide in just two months. Reaching the same milestone took the telephone 75 years, mobile phones 16 years, and the World Wide Web 7 years. Even TikTok, previously the fastest-growing, took 9 months.

With a Hundred Flowers Blooming, What Should Be Watched in the “Large Model Craze”?

The release of ChatGPT has given institutions and companies more enthusiasm and determination for R&D, and numerous investors have poured into the blue ocean of large model development. Relevant government departments have also taken note of this important technological innovation and provided substantial policy support, further promoting the R&D and optimization of large models and creating a landscape in which “a hundred schools of thought contend and a hundred flowers bloom.” This has brought Zhipu AI, which had spent two years building up its technology, into the spotlight.

However, the “large model craze” should not be followed blindly. The development and application of large models may boost industry and the economy, but without reasonable controls they may also pose risks to industrial security. On one hand, there is the chip “chokepoint.” Computing power is one of the foundations of large models, and ensuring a continuous, stable supply of it is critical to industrial security. On the other hand, whether the foundational models used in industry are safe and controllable is also a significant concern. Is the training data safe and compliant? Are the models independently controllable? Will they run into restrictions of the kind placed on chip imports? These are all crucial issues for the long-term development of the industry.

In the face of these risks and challenges, Zhang Peng pointed out that developing large language models as a startup requires great determination. Beyond the research challenges, the engineering of model training also demands resources, teams, training data, and more. On the chip issue, Zhipu AI set up a domestic hardware adaptation plan at the start of its R&D and has now collaborated with more than ten domestic chip manufacturers, aiming for comprehensive adaptation while improving training and inference efficiency on domestic hardware. Zhipu AI has also chosen to develop its own technology starting from the underlying algorithms, so that its foundational models are safe and controllable.

“There is still a gap between domestic large models and foreign large models, but we are confident in bridging this gap, and we are continuously innovating,” Zhang Peng stated.

How Can Traditional Industries and Large Models Together Amplify Application Value?

With the rapid development of artificial intelligence, the application of large models has gradually expanded from research into industrial practice, giving rise to industrial large models. Moving from “general” to “applied,” large models are knocking on the door of industrial manufacturing.

In terms of R&D difficulty, both general and industrial large models depend on R&D investment, core talent, and application scenarios, which together form the core barriers to entering the market. Industrial large models also place extremely high demands on algorithm effectiveness, data quality, and computing power, and optimizing and iterating the models requires continued investment of funds and talent. As a result, the ability to deploy large models in practice and apply them in industry has become a key standard by which the market judges them.

“The commercial application of industrial large models still needs to be explored.” Zhang Peng sees three issues. First, industrial large models need to be further integrated with other digital products to meet industrial enterprises’ combined requirements for networks, computing power, and data management, so that they work out of the box. Second, the barrier to use remains high for industrial enterprises, because applications must be developed on top of prompts that incorporate long chains of logic and relevant cases, so that the model generates answers according to preset steps, lines of reasoning, and response formats. Third, numerous industrial software products and industrial internet platforms already exist in various fields, and how large models are used to form a collaborative ecosystem with them will profoundly affect user perception and product vitality. Allowing and encouraging third-party developers to build plugins on industrial large models is an important pathway, similar to how OpenAI is accelerating the construction of its ecosystem around ChatGPT plus plugins.
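To make the second point more concrete, the following is a minimal sketch of what prompt-based application development might look like: a template that fixes the steps, reference cases, and response format before a question is sent to a model. Everything in it is an assumption for illustration. The `PromptTemplate` class, its field names, and the steel plant example text are hypothetical and do not come from Zhipu AI or any real product, and the sketch only builds the prompt text without calling a model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PromptTemplate:
    """Hypothetical template; not an API of any real large model product."""
    role: str                       # who the model should act as
    steps: List[str]                # preset steps the answer must follow
    response_format: str            # required shape of the answer
    cases: List[Tuple[str, str]] = field(default_factory=list)  # (question, answer) pairs

    def build(self, question: str) -> str:
        """Assemble role, steps, cases, and format into one prompt string."""
        lines = [f"Role: {self.role}", "", "Work through these steps in order:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.cases:
            lines += ["", "Reference cases:"]
            for case_question, case_answer in self.cases:
                lines += [f"Q: {case_question}", f"A: {case_answer}"]
        lines += ["", f"Response format: {self.response_format}", "", f"Question: {question}"]
        return "\n".join(lines)


# Illustrative content only; the steps and the case are placeholders.
template = PromptTemplate(
    role="assistant for steel plant maintenance staff",
    steps=[
        "Restate the equipment and symptom described in the question.",
        "List the most likely causes, most probable first.",
        "Recommend one safe next action and say who should perform it.",
    ],
    response_format="three short numbered paragraphs, one per step",
    cases=[("Example question (placeholder)", "Example answer (placeholder)")],
)

prompt = template.build("Cooling water pressure on the caster is dropping. What should we check?")
print(prompt)
```

Even in this toy form, the enterprise has to write and maintain the steps, cases, and format itself, which is one way to read the “high usage threshold” described above.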

Large language models will reshape the business landscape across industries, but more resources need to be invested in industry-specific applications. Any technology that develops far enough will inevitably generate more practical value. A common view on realizing that value holds that a general-purpose foundational model is not always necessary, and that what is needed are small or medium-sized models suited to a given industry. However, the fundamental reason for the breakthrough capabilities of large language models is that they learn and model knowledge about the world, which gives them near-human understanding, reasoning, and higher-level cognitive abilities. Zhang Peng stated that, ideally, industry models are not completely independent of foundational and general models but grow on top of foundational models through further training and fine-tuning.

China’s traditional industries are currently undergoing an intelligent transformation. Incorporating industry-specific data and knowledge and matching them accurately to real application scenarios can significantly improve the efficiency and quality of business processes and drive industrial transformation and upgrading. The steel industry is a typical process manufacturing industry, characterized by continuous production, complex process systems, diverse intermediate products, concentrations of large high-temperature and high-pressure equipment, and high personnel safety requirements, and it faces severe challenges in resources, markets, environmental protection, and competition. “The steel industry urgently needs to improve its environmental performance, safety assurance, and production efficiency through advanced technologies such as large models and scenario-based innovative applications,” Zhang Peng said.

As for how large models can be applied in the steel industry, Zhang Peng stated that an artificial intelligence solution can be built for the sector by using a large AI model with general foundational capabilities as the intelligent base and then training and fine-tuning it with industry knowledge and scenario data. Such a solution can address fragmented and diverse needs while significantly reducing the manpower, time, and cost required for R&D, customization, deployment, and tuning. It will also help resolve data security issues and promote the large-scale application of artificial intelligence in the steel industry, supporting the industry’s intelligent upgrading.

Author | Reporter Fan Sancai

Editor | Lü Lin
