Unexpected Results of Technological Evolution: How Games and Cryptocurrency Became AI’s “Computing Power Base”?

This past spring, we witnessed one of the biggest tech frenzies of the new century. Describing the development of artificial intelligence (AI) over the past few months as "springing up like bamboo shoots after rain" would be an understatement; "big bang" is closer to the truth. Even Lu Qi, the former president of industry giant Baidu and widely regarded as one of the most hardworking people in the field, admitted he could not keep up with the flood of new papers and code; there was simply too much.

Looking back, the door to a new era swung open on November 30, 2022. OpenAI released ChatGPT, and people realized that AI had once again reproduced the brilliance of AlphaGo, this time more comprehensively. Generative AI, represented by the GPT models, seemed to possess all-around language abilities, while Midjourney and Stable Diffusion made painting no longer an exclusively human craft. In the months that followed, large language models (LLMs) became household keywords, and internet giants such as Microsoft, Google, and Meta (formerly Facebook) returned to the spotlight.

Chinese companies also rushed to make their mark. Baidu's "Wenxin Yiyan", SenseTime's "Riri Xin", Alibaba's "Tongyi", Tencent's "Hunyuan", and Huawei's "Pangu" all made their debut. By May, more than 30 large models had been released by companies, universities, and research institutes, all proclaiming the grand ambition of "building a new IT foundation for the era" and living up to the quip "an industrial revolution by day, a cultural renaissance by night".


Of course, the future of AI is not without concerns. A Bloomberg article in early March 2023 reported that 10% to 15% of Google's annual electricity consumption goes to AI projects, roughly equivalent to the annual electricity consumption of the 500,000 residents of Atlanta. According to projections by the International Data Corporation (IDC), AI currently accounts for about 3% of global energy consumption, and by 2025 that figure is expected to soar to 15%, with an enormous impact on the environment.

In this sense, energy is the first foundation of AI. Perhaps before AI can benefit all of humanity, it will first run into the wall of energy.

01

How does AI consume all this energy?

Why does AI consume so much electricity? The answer lies in its other foundation: computing power. AI is a computation-intensive technology, especially in applications like ChatGPT, and that huge demand for computing power naturally translates into a huge demand for energy.

The current AI wave is driven by deep learning, a technique that builds artificial neural networks organized into many layers (hence "deep" neural networks), in which every neuron carries its own adjustable parameters. Large language models typically have billions, tens of billions, or even more parameters, and that scale is what makes good results possible; on top of that, a massive dataset is needed to teach the model how to respond correctly. What supports both is powerful computing capability.
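
As a rough illustration (not the architecture of any real product), the toy network below shows how quickly adjustable parameters pile up as layers get wider and deeper; the layer widths are invented purely for the example.

```python
import torch.nn as nn

# A toy fully connected network: every weight and bias is an adjustable parameter.
# The layer widths here are invented purely for illustration.
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),   # 1024*4096 weights + 4096 biases
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1024),
)

total = sum(p.numel() for p in model.parameters())
print(f"adjustable parameters: {total:,}")   # about 25 million for this small stack
```

Scaling the layer widths up and stacking dozens of such layers is, very roughly, how models reach the billions of parameters discussed below.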

Computing power, data, and algorithms are the three essential elements of AI; none can be missing. At launch, ChatGPT was built on the GPT-3.5 series, fine-tuned from the GPT-3 model. GPT-3 contains 175 billion parameters and was trained on roughly 45 TB of data, with a training compute requirement of about 3,640 PF-days; that is, a device performing one quadrillion (10^15) floating-point operations per second would need 3,640 days to complete one training run.
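
Here is a back-of-the-envelope sketch of where a figure of roughly 3,640 PF-days comes from, using the commonly cited approximation of about 6 floating-point operations per parameter per training token; the 300-billion-token count is the figure reported in the GPT-3 paper and is taken here as an assumption.

```python
# Rough training-compute estimate: FLOPs ≈ 6 * parameters * training tokens
# (about 6 operations per parameter per token for the forward and backward passes).
params = 175e9     # GPT-3 parameter count
tokens = 300e9     # training tokens reported for GPT-3 (assumption for this sketch)

total_flops = 6 * params * tokens                 # ≈ 3.15e23 operations

pflop_per_s = 1e15                                # 1 PF = 10^15 operations per second
seconds_per_day = 86_400
pf_days = total_flops / (pflop_per_s * seconds_per_day)

print(f"{pf_days:,.0f} PF-days")                  # lands very close to the quoted 3,640 PF-days
```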


And that is only the training stage. Deploying an AI model in a real environment to answer questions or take actions, a stage called "inference", consumes even more energy than training. According to estimates by chip giant Nvidia, models like GPT-3 spend 80% to 90% of their costs on inference rather than training.

The heavy computing demands of AI training and inference mainly come down to three factors: expanding datasets, growing parameter counts, and diminishing returns from larger models. Generally, the more data a model sees, the more it learns, much like human learning; unlike human learning, however, iterating repeatedly over ever-larger datasets drives energy consumption up rapidly.

When the number of parameters grows, the connections between artificial neurons multiply dramatically, and the required computation and energy surge with them. In one previously reported test case, a fourfold increase in model parameters was accompanied by an 18,000-fold increase in energy consumption.

Worse yet, bigger models are not necessarily better models; they also face cost-effectiveness problems. In 2019, researchers at the Allen Institute for Artificial Intelligence (AI2) published a paper documenting the diminishing marginal returns of large models: compared with the ResNet model released in 2015, the ResNeXt model released in 2017 required about 35% more computation but improved accuracy by only 0.5%.

Still, until that optimal balance is found, people will keep piling on computing power. An article published by OpenAI estimated that since 2012, the computing power used in the largest AI training runs has grown by a factor of 300,000, which works out to a doubling roughly every 100 days.
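
A quick sanity check on that doubling figure; the time window used below is an assumption, based on the roughly 2012 to late-2017 span covered by OpenAI's estimate.

```python
import math

growth = 300_000                 # total increase in training compute reported by OpenAI
years = 5.5                      # roughly 2012 (AlexNet) to late 2017 (assumption)

doublings = math.log2(growth)    # ≈ 18.2 doublings
days_per_doubling = years * 365 / doublings

print(f"{doublings:.1f} doublings, one every ~{days_per_doubling:.0f} days")
```

This rough endpoint calculation gives about 110 days per doubling, in the same ballpark as the figure of roughly 100 days (OpenAI's own fit puts the doubling time at about 3.4 months).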

This is probably the new Moore’s Law of the AI era.

02

Computing Power: Moore’s Law in the AI Era

In 1965, Gordon Moore, who went on to co-found Intel, proposed an empirical rule: the number of transistors that can fit on an integrated circuit doubles roughly every two years. That means that after 20 years, a chip of the same size holds about 1,000 times as many transistors (2^10 ≈ 1,024), and after 40 years, about a million times as many.

Today, the information age we are in is built on the foundation of Moore’s Law. It has always been an important driving force for the development of computer technology.

In a sense, though, the impetus provided by Moore's Law is only an "external cause". The development of computer technology also needed an "internal cause", one that comes from human nature: play.

The desire to play and to possess was etched into our genes long before the species we call "human" even existed. Not long after computers were invented, games became one of their important uses. As early as 1952, the American computer scientist Arthur Samuel wrote the first checkers program on an IBM computer; he later coined the term "machine learning", which today appears constantly alongside "artificial intelligence". In 1969, the American computer scientist and Turing Award winner Ken Thompson, wanting to keep playing "Space Travel", a game he had developed, wrote an operating system and, along the way, designed a programming language. That operating system became Unix; today's Linux and macOS on computers, and Android and iOS on phones, can all be counted among its close relatives. And that programming language was B, which his colleague Dennis Ritchie soon developed into the famous C language.


In 1981, IBM launched its personal computer (PC), and PC games soon followed as a natural outcome. Faster hardware spawned more demanding software, and more demanding software pushed hardware to upgrade; the two grew intertwined like vines. In 1992, the hit 3D game "Wolfenstein 3D" was born. The rendering calculations in 3D games are not individually difficult, but they must be performed extremely fast. Environments and characters are built from many polygons, whose shapes and positions are determined by the 3D coordinates of their vertices. The graphics card must run matrix multiplications (and perspective divisions) on enormous numbers of vertices to work out how these models should be projected onto the flat screen, and then compute a color for every single pixel. All of this has to happen very quickly, because the scene in a 3D game changes many times per second.
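
Below is a minimal sketch of the per-vertex work described above. The 4x4 matrix here is invented for the example; a real engine builds it from the camera position and field of view, and repeats this for millions of vertices (and then for every pixel) dozens of times per second.

```python
import numpy as np

# One vertex of a 3D model, in homogeneous coordinates (x, y, z, 1).
vertex = np.array([1.0, 2.0, 5.0, 1.0])

# A made-up 4x4 transform standing in for the combined model/view/projection matrix.
mvp = np.array([
    [1.0, 0.0, 0.0,  0.0],
    [0.0, 1.0, 0.0,  0.0],
    [0.0, 0.0, 1.0, -1.0],   # maps z to a depth value
    [0.0, 0.0, 1.0,  0.0],   # copies z into w for the perspective divide below
])

clip = mvp @ vertex              # one matrix multiplication per vertex
screen = clip[:3] / clip[3]      # perspective division: distant points shrink toward the center
print(screen)                    # screen-space x, y and a normalized depth value
```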

Fortunately, these calculations are not individually complex, and most of them are independent of one another. A chip designed specifically for graphics can therefore excel at performing them in parallel while moving data quickly. This demand sent the graphics processing unit (GPU) at the heart of the graphics card down a different path from the CPU: the GPU could be optimized specifically for image processing.

As the new century began, signs that Moore's Law was faltering grew increasingly evident. Manufacturing processes approached physical limits, ever-smaller transistors became harder to fabricate and integrate, and heat dissipation and power delivery became more and more problematic. Multi-core designs gradually became the mainstream answer, and both CPUs and GPUs raced toward ever more cores.

Then, Bitcoin emerged.

Cryptocurrencies such as Bitcoin are produced by computation, a process known as "mining". Mining demands a large amount of parallel computing power, performing vast numbers of hash calculations per second. When cryptocurrency prices rose, mining became a lucrative business; chasing greater wealth, frenzied "miners" bought up graphics cards until they were scarce, and that demand further stimulated the push for more computing power.
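
Here is a toy illustration of the brute-force search that mining performs: hash the same block data with different nonces until the result meets a difficulty target. The details are simplified (Bitcoin actually uses double SHA-256 and a vastly higher difficulty), and the block contents here are placeholders.

```python
import hashlib

block_data = "previous_hash|transactions|timestamp"   # placeholder block contents
difficulty = 4                                         # require 4 leading hex zeros (toy difficulty)

nonce = 0
while True:
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    if digest.startswith("0" * difficulty):
        break
    nonce += 1

print(f"found nonce {nonce}: {digest}")
```

Because every nonce can be tried independently of every other, the search parallelizes almost perfectly, which is exactly the kind of workload a GPU was built for.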

When chip manufacturers initially developed GPUs, who would have thought that many years later, these “gaming devices” would actually be used for “mining”?

03

Technology Has Its Own Arrangements

What other surprises lie beyond this?

In 2010, the U.S. Air Force bought about 2000 Sony PlayStation 3 game consoles. Was this to train pilots through gaming, or simply because the officers wanted to play games?

Neither.

Under the direction of physicist Gaurav Khanna, these consoles were linked together into a supercomputer dedicated to processing high-resolution satellite imagery. Its floating-point performance was reportedly at least 30 times that of the most powerful graphics card on the market at the time; even more than a decade later, the strongest consumer-grade graphics cards reach only about one-fifth of its performance.

This is clearly something neither Sony nor gamers anticipated, but it is not hard to understand. Game consoles are optimized for gaming: the PlayStation 3 pairs an independent CPU and GPU working in concert, its Cell processor can put eight cores to work on parallel tasks, and data can be shared among all of those cores.

Today, AI needs exactly these capabilities. The dominant AI technique today is deep learning, and its underlying idea is "connectionism": an individual artificial neuron has no intelligence of its own, but connect a large number of them together and intelligence often "emerges". The key is scale: the number of neurons must be large and the network must be big, and growing the network is one of the main levers for improving a model's capability.

Clearly, the larger the network, the higher the demand for computing power. Today, large neural networks are typically trained on GPUs, because their algorithms involve huge numbers of parameters that must be updated in every training iteration. The more there is to update, the more memory bandwidth matters, and bandwidth is one of the GPU's strengths. Moreover, at the level of individual neurons the training computations are relatively simple and largely independent of one another, so they can also exploit the GPU's parallelism for acceleration.
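
The snippet below is a minimal sketch of that pattern in PyTorch; the network and the batch of data are placeholders, and the point is only that moving tensors to the GPU lets one training step update every parameter in parallel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"   # fall back to CPU if no GPU is present

# A placeholder network and a random batch, just to show the pattern.
model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 10)).to(device)
x = torch.randn(256, 512, device=device)
y = torch.randint(0, 10, (256,), device=device)

opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss = F.cross_entropy(model(x), y)
loss.backward()   # gradients for every parameter are computed largely in parallel on the GPU
opt.step()        # and every parameter is then updated independently in one pass
```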


This was certainly not what graphics cards were designed for. Yet, almost inadvertently, they have become the infrastructure of the AI era. It was games and cryptocurrency that, to a certain extent, laid down this "computing power base" for the AI that came later. In a sense, this is the arrangement of technology itself.

04

Technology Is Always Unexpected

Today, AI has begun to drive social and industrial change. Without graphics cards, we might not have seen AI enter our lives so quickly. And graphics cards were born of human passion and inventiveness, above all the pursuit of games and, later, cryptocurrency: a somewhat unexpected beginning.

The well-known science writer Matt Ridley argues in his book "How Innovation Works" that technological innovation, like biological evolution, has no fixed direction; only through a process of survival of the fittest do the most suitable technologies take hold and grow. Once a technology becomes mainstream, it keeps improving itself. Technology begins to resemble a peculiar organism with a development direction of its own. As it progresses, the successful technologies accumulate, and the pace of development grows ever faster.

Kevin Kelly holds a similar view. In his book "What Technology Wants", he argues that the development of technology is not linear but full of twists and turns; its evolution is complex and uncertain, and where it goes next is often surprising.

The energy-consumption problem of AI may therefore have unexpected solutions as well. People are already exploring ways to make AI less power-hungry, such as reduced precision, model compression, and model pruning, and are actively applying renewable-energy technologies to supply cleaner power. That is certainly a good start.
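
Here is a hedged sketch of two of the techniques just mentioned, using standard PyTorch utilities: magnitude-based pruning and half-precision weights. The model is a stand-in, and real deployments would pair these steps with careful accuracy checks (and often full quantization).

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))  # stand-in model

# Model pruning: zero out the 30% of weights with the smallest magnitude in each linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")   # make the pruning permanent

# Reduced precision: store weights as 16-bit floats, roughly halving memory traffic.
model = model.half()

zeroed = sum((m.weight == 0).sum().item() for m in model.modules() if isinstance(m, nn.Linear))
print(f"weights zeroed by pruning: {zeroed:,}")
```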

Leaving this problem for AI itself to explore may yield surprising answers!

Author | Mammoth, Harbin University of Science and Technology

Reviewed by | Yu Yang, Head of Tencent Security Xuanwu Laboratory

The cover image and images in the text are from a copyright library; the image content is not authorized for reproduction. For the original text and images, please reply "Reprint" in the background.
