Sora Emerges: Insights into the Three Pillars of Artificial Intelligence

Artificial Intelligence (AI) is an important driving force behind the new wave of technological revolution and industrial transformation, with data, algorithms, and computing power recognized as the three core elements of artificial intelligence.

With the arrival of the “Hundred Models War” featuring ChatGPT, Sora, and others, artificial intelligence is increasingly becoming a tangible part of people’s lives. It is evident that artificial intelligence has become an important battleground in international technological competition.

China is accelerating its deployment in the field of artificial intelligence. Whether it can effectively harness these three core elements is crucial to the country’s ability to develop its artificial intelligence industry and seize the opportunities presented by the new wave of technological revolution and industrial transformation. In this regard, the Theory Edition of the People’s Daily has launched a special feature to explore the three pillars of artificial intelligence and envision the prospects and trends of China’s AI development.

Data: Co-composing a Smart Future

□ Lin Zhijie, Luo Qinfang

As a key new production factor and innovation element in the digital economy era, data has permeated various aspects such as production, consumption, circulation, distribution, and social service management, serving as an important guarantee and driving force for the development of artificial intelligence. Recently, the National Data Bureau and 17 other departments issued the “Three-Year Action Plan for Data Elements (2024-2026)”, aimed at fully leveraging the multiplier effect of data elements to empower economic and social development.

Data is the Cornerstone of AI Development

The data element is the foundation of digitalization, networking, and intelligence, and is the cornerstone of artificial intelligence development. In recent years, the rapid development of big data-related technologies, products, applications, and standards has provided new ways of thinking, methods of exploration, and decision-making paradigms for human understanding of complex systems, further promoting innovation in artificial intelligence.

First, the development of artificial intelligence relies on the supply of high-quality data and the labeling and learning from massive amounts of data. Big data provides a vast sample space for AI algorithms such as deep learning, and the large-scale application of artificial intelligence in different scenarios requires labeling, learning, and training based on massive data to discover patterns, acquire information, and make decisions.

Second, data provides a wider and more diverse source of information for the development of artificial intelligence. On one hand, the development of digital technology has made data diversity a norm, presenting characteristics of multi-source heterogeneity and rich media; on the other hand, in a big data environment, the information supporting artificial intelligence for management decisions expands from internal domains to cross-domain environments. Structured, semi-structured, and unstructured cross-domain data allow artificial intelligence technologies and applications to acquire multi-modal information from multiple perspectives, enhancing their cognitive ability towards a complex world.

Finally, data is not only the raw material for artificial intelligence models but also provides impetus for the continuous innovation of models, playing the role of an innovation element. On one hand, based on multi-source heterogeneous big data, model developers and users can continuously iterate and optimize existing models and innovate algorithm models; on the other hand, rich data sets help improve the generalization ability of models, enabling effective learning and recognition when facing new information.

The Current State of Big Data Development in China

Since the 18th National Congress of the Communist Party, under the strong leadership of the Party Central Committee with Comrade Xi Jinping at its core, China has placed great emphasis on the development of big data and other digital technologies. Through the collective efforts of society, significant progress has been made in big data-related technologies and practical applications.

The scale of digital infrastructure has significantly increased. By the end of 2023, China had built a total of 3.377 million 5G base stations. By the end of 2022, the number of mobile IoT terminal users in China had reached 1.845 billion, making it the first major economy in the world to achieve “more devices than people”; the total scale of data center racks in use exceeded 6.5 million standard racks, with an average annual growth rate of over 30% in the past five years; and the total computing power of data centers in use exceeded 180 EFLOPS, ranking second in the world.

The scale of data resources continues to grow. In 2022, the national data output reached 8.1 ZB, accounting for 10.5% of the global total and ranking second in the world; cumulative data storage reached 724.5 EB, a year-on-year increase of 21.2%; and the national integrated government data sharing hub has published 15,000 categories of data resources, supporting over 500 billion shared calls cumulatively. It is estimated that by 2025 the global data scale will reach 175 ZB, with China’s data scale reaching 48.6 ZB, making it the world’s largest datasphere.
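As a rough cross-check of the figures above, the quoted volume and share imply a global total for 2022, and the 2025 projections imply a share for China. The short Python sketch below only rearranges numbers stated in the text; the function and variable names are illustrative, and the results are back-of-envelope values, not official statistics.

```python
# Figures quoted in the paragraph above (ZB = zettabytes).
CHINA_2022_ZB = 8.1        # national data output in 2022
CHINA_2022_SHARE = 0.105   # stated share of the global total
GLOBAL_2025_ZB = 175.0     # projected global data scale in 2025
CHINA_2025_ZB = 48.6       # projected data scale for China in 2025


def implied_total(part: float, share: float) -> float:
    """Total implied by a part and the share that part represents."""
    return part / share


def share_of(part: float, total: float) -> float:
    """Fraction of the total represented by the part."""
    return part / total


# Global output in 2022 implied by "8.1 ZB is 10.5% of the total".
global_2022_zb = implied_total(CHINA_2022_ZB, CHINA_2022_SHARE)

# Share of the projected 2025 global data scale held by China.
china_share_2025 = share_of(CHINA_2025_ZB, GLOBAL_2025_ZB)

print(f"Implied global output in 2022: {global_2022_zb:.1f} ZB")
print(f"Implied share for China in 2025: {china_share_2025:.1%}")
```

Read this way, the 2022 figures imply a global output of roughly 77 ZB, and the 2025 projection puts China at a little over a quarter of the global total. The 2022 and 2025 figures may measure slightly different things (annual output versus overall data scale), so the comparison is indicative only.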

The data trading market is growing rapidly. In 2022, the market scale of China’s data trading industry reached 87.68 billion yuan, a year-on-year increase of 42%, accounting for 13.4% of the global data trading market and 66.5% of the Asian data trading market. The market is expected to reach 204.6 billion yuan by 2025 and 515.59 billion yuan by 2030.
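The growth path implied by these market figures can be made explicit with a compound annual growth rate (CAGR). The sketch below is a minimal Python illustration that uses only the numbers quoted above; the helper name `cagr` and the variable names are assumptions introduced for this example.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Market scale in billions of yuan, as quoted in the paragraph above.
MARKET_2022 = 87.68
MARKET_2025 = 204.6
MARKET_2030 = 515.59
GROWTH_2022 = 0.42  # stated year-on-year increase for 2022

# Market scale in 2021 implied by the 42% increase.
implied_2021 = MARKET_2022 / (1 + GROWTH_2022)

# Average annual growth implied by the two forecasts.
rate_2022_2025 = cagr(MARKET_2022, MARKET_2025, 3)
rate_2025_2030 = cagr(MARKET_2025, MARKET_2030, 5)

print(f"Implied 2021 market: {implied_2021:.2f} billion yuan")
print(f"Implied CAGR 2022-2025: {rate_2022_2025:.1%}")
print(f"Implied CAGR 2025-2030: {rate_2025_2030:.1%}")
```

On these numbers, the forecasts assume growth of roughly a third per year through 2025, slowing to about a fifth per year through 2030, both below the 42% recorded for 2022.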

Data Empowerment Issues and Countermeasures for AI

During the 14th Five-Year Plan period, China’s digital economy is transitioning to a new stage of deepened application, standardized development, and inclusive sharing. In recent years, artificial intelligence and big data technologies and industrial systems have matured, yet challenges and problems remain in the process of data empowering artificial intelligence, requiring urgent breakthroughs.

First, the quality of data supply is not high. On one hand, although China has abundant data resources, the actual amount of data that is truly open, shared, and effectively utilized remains relatively low. The current situation of data “only producing but not flowing out” has led to the emergence of many “data islands”, restricting the high-quality supply of data; on the other hand, while data resources continue to accumulate, the density of valuable data is also decreasing, affecting the high-quality supply of data.

In response, it is necessary to accelerate the aggregation and sharing of data in the public service sector and promote its integration with the social data accumulated on enterprise platforms, optimizing the data supply structure; to adhere to the principle that public data is taken from the people and used for the people, accelerating the classified and graded authorization of public data usage, breaking down “data islands”, strengthening the high-quality supply of data elements, and letting public data play a foundational, leading, and demonstrative role in the development and utilization of data elements; and to accelerate the exploration and establishment of a standardized system for data quality, promoting the adjustment and optimization of data element supply and improving both its quantity and quality.

Second, there are difficulties in defining data property rights. As a new type of production factor, data has characteristics of intangibility, non-consumability, and ease of replication, posing new challenges to traditional property rights systems. In the processes of data production, circulation, and usage, different entities have different interests in data, which exhibit complex symbiosis, interdependence, and dynamic changes, making it difficult for traditional rights systems to break through the dilemmas of data property rights.

In response, it is necessary to address the practical problems encountered by market entities by establishing legal regulations for data property rights management; refining the “three rights separation” framework of data resource ownership, data processing and usage rights, and data product operation rights; innovating the concept of data property rights by downplaying data ownership and emphasizing data usage rights; and accelerating the establishment of a complete data property rights system to fully release the value of data elements.

Third, the mechanisms for data circulation and trading are not smooth. First, data trading currently lacks a unified pricing and evaluation mechanism and relies on point-to-point individual transactions, resulting in low transparency in data circulation due to information asymmetry; second, across different industries, organizations, and devices, it is difficult to unify data standards and interfaces, and the operability of data circulation integration is weak; finally, although China has gradually formed a data protection system with Chinese characteristics, clearer and more targeted policies and regulations are still lacking.

In response, it is necessary to accelerate the coordinated construction of data trading venues, combining centralized on-exchange trading with decentralized off-exchange trading to form a multi-level and diversified market trading system; to promote the standardization of data collection and interfaces and strengthen technologies for interconnecting heterogeneous data, providing more reliable technical support for data circulation between different entities; and to focus on business needs, encouraging industries and localities to explore innovative models based on specific scenario requirements and to formulate more detailed data circulation rules and standards, so as to promote data integration, interconnection, and interoperability and drive the application of artificial intelligence across industries and fields.

Fourth, the data governance system needs further improvement. First, the fragmented industry and traditional territorial governance models are insufficient to meet the governance needs of data elements circulating and trading across regions, industries, and levels; second, the generation and use of data often involve multiple entities (e.g., buyers, sellers, platforms), complicating the confirmation of data governance responsibilities and processes; third, the risks of data security and privacy protection become more prominent as the scale of data increases and artificial intelligence technology develops; finally, the massive data volume and diverse data types pose higher requirements on the technologies supporting data governance.

In response, it is necessary to strengthen the forward-looking layout of data governance, approaching it from the strategic height of building a strong data nation, gradually improving top-level policy design, and breaking down regional, industry, and hierarchical barriers; to guide grassroots governments, markets, social organizations, and the public toward joint governance of data elements through interaction, consultation, and cooperation, constructing a governance model that coordinates government, enterprises, and society; to implement the overall national security concept, accelerating the improvement of systems for data classification and grading, important data protection, risk assessment, and emergency management, and developing a robust data security industry to provide strong support for national data security; and to encourage innovation among industry, academia, and research institutions to accelerate the advancement of core technologies for trusted data circulation and security assurance, promoting innovation in data governance technologies such as privacy computing, quantum computing, and blockchain.

(The authors are, respectively, a tenured associate professor and a postdoctoral researcher at Tsinghua University School of Economics and Management)

Algorithms: Challenges and Governance

□ Cao Yixuan

In today’s rapidly developing artificial intelligence landscape, algorithms have become an indispensable part of our lives and work. Whether it is personalized recommendations on social media, product suggestions on e-commerce platforms, or smart home and autonomous driving “smart scenarios”, algorithms are ubiquitous, providing invisible assistance to our thinking and decision-making, greatly enhancing efficiency while also bringing challenges.

The Connotation and Evolution of Algorithms

An algorithm is a series of explicit instructions used to solve problems, enabling complex issues to be resolved through computational methods. Currently, artificial intelligence algorithms mainly refer to a class of algorithms that can generalize from data, learn patterns, and then make predictions or decisions.

Data, algorithms, and computing power are the three pillars of artificial intelligence, akin to the three staves of a barrel that together determine the level of intelligence artificial intelligence can reach. To illustrate, if the development of artificial intelligence is viewed as the intellectual growth of a person, data, algorithms, and computing power correspond to books, learning methods, and learning duration, respectively. Only with high-quality and sufficient books, appropriate learning methods, and ample learning time can one achieve a higher intellectual level. Algorithms are a major part of artificial intelligence research, designed to teach machines how to learn.

Whether you realize it or not, algorithms have already permeated all aspects of life. For instance, short video platforms use algorithms to recommend content that users may be interested in by analyzing their historical behavior; financial institutions use algorithms for risk assessment and credit scoring; and autonomous vehicles rely on algorithms to process and analyze road conditions. Facial recognition for clocking in and for payments, object recognition, automatic translation, and map navigation all depend on algorithms as well. In these commonplace scenarios, the tasks that involve algorithms are too numerous to count. It can be said that algorithms are reshaping our understanding of the world.

Since the concept of artificial intelligence was proposed in 1956, it has undergone several significant evolutions. Since 2015, artificial intelligence research has rapidly advanced towards deep learning centered around neural networks, supporting most current intelligent application scenarios. Since 2022, marked by the release of ChatGPT by OpenAI, artificial intelligence has entered the era of large models. The key here is the efficient utilization of massive amounts of unlabeled data through algorithm design, achieving a certain degree of general intelligence through large models and large computing power. Currently, algorithm technology is in a phase of rapid development. In international comparisons, China’s algorithm research and application are rapidly developing and keeping pace with or leading in certain fields. However, in the era of large models, domestic chip technology is limited, and computing power may become a bottleneck for current domestic algorithm development.

The value of algorithms goes far beyond mere technological advancement; they provide us with a new perspective for understanding and processing information, thereby improving decision-making processes. In the medical field, algorithms can predict disease risks by analyzing patients’ historical data, providing doctors with treatment recommendations; in manufacturing, algorithms optimize production processes, enhancing efficiency and quality, thereby reducing costs and increasing output; in finance, algorithms are used not only for credit scoring but also for pattern recognition in stock market trading, helping investors make more informed decisions; algorithms exhibit enormous application potential in social governance, scientific research, security, and other fields. In the future, artificial intelligence will be an important area of competition among major powers, and as technology develops, we will see more countries investing in artificial intelligence technology to gain a competitive advantage globally. This is not just a technological race; it is a strategic layout concerning the future direction of the economy, politics, and society. Therefore, the development, application, and governance of algorithms are becoming increasingly important in today’s intensifying technological competition.

Challenges and Governance of Algorithms

Although algorithms bring us many conveniences, they also create certain social problems. The information cocoon effect is a typical example. As mentioned earlier, short video and news platforms recommend content that users may be interested in based on their viewing history. While this makes it easier for users to obtain information, it also reduces their desire to actively seek information and find new perspectives. As users keep watching the recommended content, the algorithms are prompted to keep recommending similar content. This cycle ultimately leaves users viewing homogeneous content and similar viewpoints, invisibly depriving them of the opportunity to understand a broader world and amplifying personal biases, trapping individuals in their own information cocoons and making it difficult for them to see the external world comprehensively. On a macro level, the information cocoon can reduce empathy and the sense of shared identity between people, exacerbating social divisions. Another issue is “big data killing the familiar,” in which algorithms infer from users’ shopping history that they can bear higher prices and then charge them discriminatory prices. In tourism, ride-hailing, and other fields, there have been reports of “big data killing the familiar” in recent years. Additionally, research abroad has found that algorithms trained on real data may exhibit racial bias and unfairness, including different predictions of criminal tendencies for different races and varying accuracy in facial recognition across races. Such issues have not yet been sufficiently studied in China but have already attracted the attention of relevant authorities.
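The self-reinforcing cycle behind the information cocoon can be illustrated with a deliberately simple toy model. The Python sketch below is an assumption-laden illustration, not a description of any real platform’s recommender: it assumes that each round the feed over-weights the topics a user already watches (controlled by a hypothetical `sharpen` exponent) and that the user’s viewing then mirrors the feed. Diversity of exposure is measured with Shannon entropy.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(shares):
    """Shannon entropy in bits; higher means more diverse exposure."""
    return -sum(s * math.log2(s) for s in shares if s > 0)


def simulate_cocoon(initial_interest, rounds, sharpen=1.5):
    """Toy feedback loop (an assumption, not a real recommender).

    Each round the feed over-weights topics the user already watches
    (share ** sharpen), and the user's viewing then mirrors the feed.
    Returns the final topic shares and the entropy after every round.
    """
    shares = normalize(initial_interest)
    history = [entropy_bits(shares)]
    for _ in range(rounds):
        feed = normalize([s ** sharpen for s in shares])
        shares = feed  # viewing follows what is recommended
        history.append(entropy_bits(shares))
    return shares, history


# Five topics with only a mild initial preference for the first one.
final_shares, diversity = simulate_cocoon([0.24, 0.22, 0.20, 0.18, 0.16], rounds=10)
print([round(s, 3) for s in final_shares])
print([round(h, 2) for h in diversity])
```

Even with a nearly uniform starting point, the toy loop drives almost all exposure toward the single topic that began slightly ahead, and the entropy falls in every round. Real systems are far more complex, but the sketch shows why a small initial preference combined with preference-amplifying recommendations can narrow what a user sees.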

In facing the risks posed by algorithms, we must neither avoid them nor turn a blind eye. The public, algorithm providers, and the government should work together to address the challenges posed by algorithms. For the public, it is essential to understand basic algorithm concepts, be clear about their own needs and their relationship with algorithms, and use algorithms sensibly to serve themselves. For algorithm providers, it is necessary to adopt a long-term perspective, enhance social responsibility, and treat socially responsible algorithms as a higher-level corporate value. For the government, it is essential to establish corresponding ethical guidelines and legal regulations, require enterprises to make their algorithms more transparent, draw clear regulatory red lines, and develop safety testing methods to ensure orderly market development. Of course, the current regulatory system is insufficient to fully keep up with the rapid iteration of algorithm technology, so how to formulate algorithm regulations that neither lag behind technological developments nor fail to protect public interests presents a new challenge for regulatory authorities. Currently, China has introduced regulations such as the “Regulations on Algorithm Recommendation Management for Internet Information Services” and the “Interim Measures for the Management of Generative Artificial Intelligence Services” to strengthen algorithm regulation, and in the future it will be necessary to continuously adjust and improve regulatory methods in line with new technological developments.

In summary, looking ahead, we have reason to believe that algorithms will bring more surprises to the world. At the same time, only by establishing ethical and value guidance for algorithms can they develop healthily on a reasonable and fair track, truly benefiting human society. This requires the joint efforts of the government, enterprises, and the public.

(The author is an associate researcher at the Institute of Computing Technology, Chinese Academy of Sciences)

Computing Power: Overcoming AI “Computing Power Anxiety”

□ Zhang Peipei

The competition in the era of artificial intelligence is not only a competition of algorithms and applications but also a competition of computing power infrastructure. In the context of the Internet of Everything, the explosive growth of data volume has led to an unprecedented demand for computing power. In February 2023, the Central Committee of the Communist Party of China and the State Council issued the “Overall Layout Plan for Digital China Construction,” emphasizing the need to “smooth the main arteries of digital infrastructure and systematically optimize the layout of computing power infrastructure”; in October 2023, the Ministry of Industry and Information Technology and six other departments issued the “Action Plan for High-Quality Development of Computing Power Infrastructure.” These series of policies indicate that the digital economy has entered a new stage supported by computing power, and it is imperative to accelerate the establishment of new competitive advantages in computing power.

Computing Power as a New Productive Force

In simple terms, computing power refers to the ability to transmit, store, and process information data. Each technological revolution is accompanied by leapfrog developments in production factors and productivity. As we enter the digital economy era, data has become a new means of production, and computing power has also become a new productive force. In the past, electricity generation was regarded as a hard indicator of modern economies, a standard for measuring a country’s economic development and civilization level. Today, computing power, representing digital information processing capabilities, has become a new driving force supporting the deep development of the digital economy and an important indicator for measuring a country’s overall strength and level of economic and social development.

Today, the operation and development of digital society rely on strong computing power support. From cities to households and from governments to enterprises, computing power has become an important driving force for social development. Important innovation tracks such as cloud computing, blockchain, artificial intelligence, and quantum computing all depend on the impetus of computing power. The increasingly mature new energy electric vehicles are no longer just a type of power device; they have become information products driven by computing power. In the future, as we enter the intelligent era, more intelligent products will emerge, and a vast number of terminal devices will generate massive amounts of data. Every production and living scenario will rely on computing power to process information, making computing power ubiquitous, and computing power services will become basic societal infrastructure akin to water and electricity.

“Computing Power Anxiety” in the Era of AI

In the past few decades, propelled by Moore’s Law, chip computing power has advanced rapidly, doubling roughly every 18 months. In recent years, however, continuous breakthroughs in artificial intelligence technology and its wider adoption have caused the demand for computing power to surge explosively, while supply has struggled to keep pace, resulting in “computing power anxiety.”

The fundamental reason for the consumption of computing power in artificial intelligence development is that it has changed the basic paradigm of problem-solving. Computers cannot autonomously generate knowledge, but they can obtain statistical patterns behind data through extensive calculations. Machines learn from existing data and knowledge, continuously correcting models through a large number of training samples and utilizing models to make decisions or predictions in similar scenarios. Tools like ChatGPT are based on big data and large computing volume. In other words, as long as computing power is sufficiently strong, many problems can be transformed into computational issues. Many problems previously thought to be difficult for machines to solve, such as autonomous driving, voice and image recognition, and content creation, can now be solved by artificial intelligence relying on strong computing power.

According to a report released by OpenAI, since 2012 the computing power used for AI training tasks has doubled every 3-4 months; from 2012 to 2018, the demand for AI computing power grew 300,000-fold. Further research predicts that from 2018 to 2030 the demand for computing power will increase 390-fold in intelligent transportation and 110-fold in smart factories. Today, every technological breakthrough in artificial intelligence requires sufficient computing power as its energy supply, and the rapid iteration of large models has driven a surge in computing power demand, making computing power a significant challenge for the development of artificial intelligence.
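The gap between supply and demand becomes clearer when the two doubling rates are put side by side. The Python sketch below is a back-of-envelope calculation under the assumption of steady exponential growth; it uses only the figures quoted in this section (18-month doubling for chips, 3-4 month doubling and 300,000-fold growth for AI training), and the function names are illustrative.

```python
import math


def doublings_for(growth_factor: float) -> float:
    """Number of doublings needed to reach a given growth factor."""
    return math.log2(growth_factor)


def months_to_grow(growth_factor: float, doubling_months: float) -> float:
    """Months needed to reach the growth factor at a fixed doubling time."""
    return doublings_for(growth_factor) * doubling_months


def growth_after(months: float, doubling_months: float) -> float:
    """Growth factor reached after `months` at a fixed doubling time."""
    return 2 ** (months / doubling_months)


AI_GROWTH = 300_000  # growth in AI training compute, 2012 to 2018

# Time needed for 300,000-fold growth at each end of the quoted range.
months_fast = months_to_grow(AI_GROWTH, 3)
months_slow = months_to_grow(AI_GROWTH, 4)

# Growth delivered by 18-month doubling over the same six years.
moore_six_years = growth_after(6 * 12, 18)

print(f"Doublings needed: {doublings_for(AI_GROWTH):.1f}")
print(f"Months at 3-month doubling: {months_fast:.0f}")
print(f"Months at 4-month doubling: {months_slow:.0f}")
print(f"Growth from 18-month doubling over six years: {moore_six_years:.0f}x")
```

Under these assumptions, a 300,000-fold increase takes about 18 doublings, or roughly four and a half to six years at one doubling every 3-4 months, which is consistent with the 2012 to 2018 window. Doubling every 18 months over the same six years yields only a 16-fold increase, which is the arithmetic behind “computing power anxiety.”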

Accelerating the Construction of New Competitive Advantages in National Computing Power

In the future, a country’s computing power security will be as important as food security and energy security. Since the construction of computing power requires continuous investment of enormous human, material, and financial resources, the competition in computing power largely unfolds between countries and among capital technology giants, making it a focal point in the great power game. According to data from the Ministry of Industry and Information Technology, as of the end of 2022, China boasts a total computing power of 180 EFLOPS, with the core computing power industry scale reaching 1.8 trillion yuan, maintaining a stable second position globally. However, China’s computing power industry also faces many deficiencies, such as hardware bottlenecks. In light of the new situation, we must accelerate the construction of new competitive advantages in national computing power.

First, we must accelerate the construction of high-level computing power infrastructure. Computing power infrastructure is key to steadily enhancing computing power, truly serving all sectors of the economy and society, and bringing computing power into every household. This requires innovation in computing power supply models. In 2020, the state clearly proposed that the scope of new infrastructure construction includes computing power infrastructure represented by data centers and intelligent computing centers, and supplying computing power as infrastructure has become a trend. Intelligent computing centers will become the main production and supply centers for computing power in the future, so the construction of high-quality, high-efficiency intelligent computing centers should be accelerated. The core of computing power is servers, and the core of servers is chips. Chips are undoubtedly the biggest shortcoming of China’s computing power industry. In recent years, the domestic chip industry has seen continuous investment in funds and talent, along with policy support. Domestic high-tech companies have also explored core chip technologies in depth and have gradually broken through external constraints, with greater breakthroughs expected for Chinese chips in the future.

Second, we must implement the “East Data West Computing” project to build a national computing power network system. Just as power development cannot be separated from the power grid, computing power development cannot be separated from a “computing power network.” To allow users to enjoy computing power services anytime and anywhere, forming a new type of national infrastructure akin to water and electricity, building a larger-scale computing power network has become a feasible path under exploration. In May 2021, the National Development and Reform Commission and four other departments jointly released the “Implementation Plan for the National Integrated Big Data Center Collaborative Innovation System’s Computing Power Hub,” proposing the construction of national computing power network hub nodes and launching the “East Data West Computing” project to build a national computing power network system. The eastern region of China has high computing power demand, while the western region has cheap energy and low computing power costs. Sending eastern data to the west for storage and computation not only allows resources to be allocated in a unified way nationwide and improves the cost-performance ratio of computing power, but also guides the intensive development of data centers in the east and the leapfrog development of data centers in the west, easing the imbalance in digital infrastructure and promoting the collaborative interaction of computing power, networks, data, and energy. Building a nationally integrated computing power network system can also feed back into the domestic computing power industrial chain. Drawing on the experience of high-speed rail and communication networks, which caught up and overtook from a position of disadvantage, unified scheduling and concentrated R&D can help turn that disadvantage into breakthroughs and advantages.

Third, we must build a commercially sustainable computing power ecosystem that benefits market entities. To construct new competitive advantages in national computing power, the key is to put computing power to real use and form a strong user base, entering a positive cycle of technology and commerce. To that end, it is necessary to equip computing power infrastructure with sufficient development capabilities and, driven by scenarios and guided by applications, open up more typical scenarios to promote the industrial application of computing power, creating new businesses, new models, and new formats; to help computing power vendors form mature business models and stable profit returns, so that commercial success drives technology and increased R&D investment yields greater technological advantages; and, in key application areas, to create benchmark applications that raise the penetration rate of computing power in industries such as manufacturing and finance, enable large-scale replication and promotion in fields such as healthcare and transportation, and further expand the scope of application in energy, education, and other areas. The goal is a computing power ecosystem with trustworthy technology, shareable resources, commercial sustainability, and broad benefits, promoting the overall rise of the computing power industry.

(The author is affiliated with the Institute of Philosophy, Shandong Academy of Social Sciences)

This column focuses on practical issues, follows current hot topics, and offers theoretical interpretation and analysis. It is published on Tuesdays.

Submission email: [email protected], contact number: 0531-85193121

(People’s Daily reporters Zhang Hao and Cui Kaiming compiled the report, planned by Ren Yubo and Shao Fangchao)
