Future Directions of Large Models by Academician Zhang Bo

Recently, Zhang Bo, Academician of the Chinese Academy of Sciences and Honorary Dean of the Institute for Artificial Intelligence at Tsinghua University, said in his speech at the 12th Internet Security Conference (ISC.AI 2024) that current artificial intelligence lacks a theory: what has been developed so far are only models and algorithms targeted at specific fields. Because both the software and the hardware are specialized, the market for them is very small. As a result, a large-scale artificial intelligence industry has not yet emerged, and this is where the problem lies.

Now 89 years old, Academician Zhang Bo has trained generations of artificial intelligence talent at Tsinghua University over the past few decades and is one of the founders of the artificial intelligence discipline in China. Many popular “Tsinghua-based” large model companies, such as Shengshu Technology, Zhipu AI, Mianbi Intelligence, and Kimi, have benefited from the technological foundation laid at Tsinghua, and their core technical talent was mentored, directly or indirectly, by Zhang Bo.

In this speech, Academician Zhang not only pointed out the defects and problems existing in current artificial intelligence technology but also provided directions for future improvements.

When considering foundational models, we must consider 3 major capabilities and 1 major flaw

According to Academician Zhang, because of theoretical limitations, the artificial intelligence industry in its previous stage had to develop in conjunction with specific application fields. The artificial intelligence developed during that stage is therefore specialized and is referred to as “weak” artificial intelligence. However, he also pointed out that current foundational models have achieved generality on language problems. “When considering foundational models, we need to focus on 3 major capabilities and 1 major flaw; this is very important and serves as the starting point for our consideration of future industry development.”

He explained that the strength of large language models lies in three things: powerful language generation, natural human-computer interaction, and a strong ability to generalize from one example to others. “The language generation of large language models is open-domain. It can generate diverse results, and all of its outputs can be understood by humans. Even when they are ‘talking nonsense,’ we can still understand what nonsense they are talking about, which is very important. Open-domain natural language dialogue between humans and machines was previously thought to require generations of effort to achieve, but unexpectedly, this goal was reached in 2020.”

Academician Zhang stated that the flaw of large models is “hallucination.” “Because we require diverse outputs, they inevitably produce errors. These errors are very different from the errors machines typically produce, which are often controllable. This kind of error is inherent and will definitely occur, and we cannot control it. Therefore, this is also an issue we need to consider when thinking about their applications in the future.”

Combining the 3 major capabilities and 1 major flaw, Academician Zhang summarized the application scenarios currently suited to large models: those with a high tolerance for errors. He indicated that, from an industrial perspective, the application of large models follows a “U” shape: planning and design at the front end require diverse content, and services and recommendations at the back end also require diversity, and both have a high tolerance for errors. Whether to use large models in the middle part, however, must be decided case by case.

Despite the existing problems, Academician Zhang still insisted that “models must be used,” because once a model base is established, the efficiency and quality of applications will certainly improve. “In the past, we developed software to provide services on an empty computer, which was akin to an illiterate person. Now, with large models, the platform is at least a high school student, and development efficiency will certainly increase; this is the direction for the future.”

Academician Zhang also analyzed the root cause of hallucinations. In his view, the fundamental limitation of models is that all work done by machines is externally driven: humans teach them what to do, and they do not act autonomously. In addition, the results they generate are heavily influenced by prompts, which differs significantly from the way humans complete tasks under the control of their own internal intent.

Four Future Directions for Large Models: Alignment, Multimodal, Agents, Embodied Intelligence

Academician Zhang introduced four future directions that he considers very important for improving large models.

The first is alignment with humans: “Large models do not have the ability to judge right from wrong and cannot update themselves; every update is driven by humans. Without a breakthrough on this point, machines cannot self-evolve. Large models require external prompts; therefore, correcting the errors of large models under human guidance is our first task.”

The second is multimodal generation: “Multimodal generation will be very important for industrial development in the future. Although large models mainly generate text today, if we use the same method to generate images, sounds, videos, and code, the level of generation will be close to that of humans. The reason we can generate images so well now is primarily that images are linked with text. Therefore, the most essential breakthrough is in text processing.”

The third is AI agents: “We need to combine large models with the surrounding virtual environment and let the environment prompt them about their errors, because we only know right from wrong after performing a task. Therefore, the concept of agents is very important: it lets the environment prompt the agents and gives them opportunities to reflect and correct errors.”

The fourth is embodied intelligence: “By adding robots, large models can work in the physical world. How should we develop general-purpose robots in the future? I believe it should be ‘software general, hardware diverse.’ Although Musk promotes humanoid robots, I believe the future is not limited to humanoid robots.”

According to Academician Zhang, to develop the third generation of artificial intelligence, a theory must first be established. The existence of large models cannot currently be explained by theory, which leads to various confusions and misunderstandings. As machines continue to scale up, the inability of theory to explain them will cause panic. Until safe, controllable, trustworthy, reliable, and scalable artificial intelligence technology has been developed, artificial intelligence will always have safety issues.

Editor: Gao Jie
Responsible Editor: Duan Shaomin

Review: Li Guoqing

Source: Xinjingbao Beike Finance.
