-
AI Multimodal Explosion: Text to Brain -> Sound to Heart + Vision to Instinct -
AI Applications Are Technology-Driven; Current Products Have Limited Capabilities -
Sora Is Not the Goal, but a Solid Step Towards AGI -
“Interaction” and “Content” Will Become Cheap, While “Authenticity” Will Be a Scarce Resource -
“AI-Native” Refers to Reconstructing Business Models Based on AI Capabilities Rather Than Applying AI to Existing Processes -
Commercial Models for AI Might Become More Certain: Model Market, Synthetic Data, Model Engineering Platforms, Model Security -
Collaboration Between Software and Hardware Based on Domestic Chips – Firmware Ecosystem Is a Clear Opportunity -
On-Device Intelligence Currently Has the Greatest Potential to Become 24/7 Hardware for Data Collection -
AGI Could Lead to Extreme Monopoly and Unprecedented Centralized Control; As Individuals, Will We Have a Plan B? -
“Human Models” or AI Agents Are Key to AI-Human Collaboration -
“Embodied Intelligence” Is the Bridge Between AGI and the Physical World -
From “China-US Rivalry” to “Sovereign AI”? International Political Boundaries May Be Redrawn According to AI Technology Boundaries -
The Volume of AI-Generated Data Will Exceed the Total Data Produced by Humanity: The “Data Chronicle” Enters the “AI Era” -
Technologies AGI Will Actively Invest In: Controlled Nuclear Fusion, Quantum Computing, Superconductors, General Robotics -
Returning to the Source: Only “Wisdom” Is the True Increment of AGI -
To Be a Savior, Solutions Are Necessary!
“Choices”, Lian 2024, with Dall-E
Part Two: 2024, Forks and Currents
<1> Virtual People and Virtual Worlds
“Accompany”, Yifei Gong 2024, with Dall-E
“Sinking”, Yifei Gong 2024, with Dall-E
1.3 Virtual People and Virtual Worlds
When we see the possibilities of the above technologies, a natural question arises: Are real virtual people about to emerge? And what does this mean?
First, if we look at a 3-5 year timeline, the emergence of virtual people capable of mimicking human emotions and even possessing independent personalities is highly probable. However, if we only consider 2024, the probability seems low; this is because several core technological issues have yet to be resolved:
1) The memory issue may be more complex than imagined, as “memory” also involves “selective forgetting” and “emergence under specific triggers”, yet these mechanisms remain hidden within the “black box” of the brain; beyond hoping that “another black box” (the large language model) develops such abilities on its own, we have no direct way to teach them, which greatly increases the uncertainty of solving the problem.
2) There is still no “human model” + a lack of rich individual data. Personality is built on the experiences of independent individuals, but as mentioned in the “previous article”, large language models (LLMs) use a little data from everyone rather than a large amount of data from a specific individual, which means they are not on the path to generating independent personalities but are engaged in “personality simulation”. Thus, we probably need more time for “some strange AI personalities” to gradually approach “coherent real people”. This iterative process likely requires more complete data about individuals.
However, the absence of a complete personality does not mean there is no viable product model. AI in 2024 will satisfy some scenarios that do not require complete personalities: Good-looking (non-talented) influencers, live-streaming sales, eSports live streaming… industries that sell looks and bodies at low thresholds may be massively replaced by AI in 2024-25; and this replacement may not be recognized by the audience, who may not be able to distinguish whether the influencer they interact with on screen is real or not. It is almost certain that mid-tier influencers/streamers will be cleared out by AI, with the timing depending on the speed of AI cost reduction; however, those truly talented and unique top creators should still be safe for a long time.
In the future, over 90% of the content on the internet will be created by AI; and we humans will be unable to tell what is real, what is AI-made, and what is human-made.
Compared to virtual people, virtual-world scenarios may mature sooner. The costs of game production and content creation will fall rapidly, leading to the birth of truly open-world games in which humans coexist with AI NPCs and instances can be generated without limit. In terms of development speed, I do not believe true open-world games will be realized in 2024; however, within 2024, AI NPCs with flexible conversational abilities, generated backgrounds, and small-scale generated plots/endings should become a reality. These unique generative games and stories will become hot marketing topics in social dissemination.
With the decline in content production costs, another obvious opportunity is AR/VR. Apple’s recent launch of Vision Pro has given us hope. The primary reasons for the previous AR/VR bubble burst were 1) High content production costs & poor quality, 2) Lack of application scenarios, 3) Hardware performance/weight/price issues. The first issue should be resolved relatively quickly with the advancement of AI technology; I believe the breakthrough for the second point will emerge from VR games, rather than life and business scenarios; the third point will depend on hardware manufacturers. Therefore, I speculate that the large-scale maturation of AR/VR will synchronize with large open-world games, and it may not be immediately realized in 2024.
On the eve of the birth of “real virtual people”, we face many soul-searching questions. Due to space limitations, I will save them for the next issue~
-
How can we make AI virtual companionship more “real” and “engaging”? What can compensate for the lack of memory?
-
Will AI virtual people become genuine social entities? Or will they still just be consumed as content (like influencers)?
-
Will the definition of “social” fundamentally change in the future? Will the foundations of social platforms be shaken? What are the new business models?
-
In the new world where AI and humans “cohabit”, how should we survive? How to love and be loved? How to live? How to think?…
The above questions may sound like science fiction, but on today’s AI-infused Zhihu (the Chinese equivalent of Quora), I seem to already see the years ahead:
There is no truth online, anymore. | 线上不再有真实。
In such a world, interaction and content will become cheap, while “authenticity” will become a truly scarce resource.
“Beautiful New World”, Yifei Gong 2024, with Dall-E
<2> “AI-Native”: Reconstructing Business Models, Not Retrofitting Existing Processes
Today’s AI is not just a key to traffic but also a key to stock prices, so every company is trying to find a way to align itself with AI. Many friends have privately asked me: How can beauty, liquor, and luxury brands connect with AI? How can AI empower agriculture and traditional manufacturing? How can AI empower HR, administration, procurement, public relations? … In fact, most of these are difficult, because AI is not omnipotent and we are still far from AGI.
As I wrote in my article last December, “AI-Native Companies | The Future of Workers”, most current “AI applications/AI transformations” still follow the old path of “digital transformation”: applying AI to existing processes while still talking about “solidified processes” and “cost savings”. In today’s rapidly evolving technology landscape, however, this means the result is obsolete the moment it ships: it freezes a company’s business model as it is today while depriving the company of the ability to evolve proactively.
The current situation reflects more of people’s anxiety about AI: hence the hope that AI can be used immediately and yield results right away. However, we cannot stop at anxiety: AI’s power should not only be used for optimizing existing business processes but should also be used for redefining future businesses. This is what “AI-native companies” should do. Just like when electricity was invented, we should not start from “how to empower horse-drawn carriages with electricity” but rather from “what new demands can electricity create and satisfy”.
In practice, we are still in the early stages of AGI, and there were still very few “AI-native” applications in 2023. The previous article also mentioned that, apart from the official applications of OpenAI, Google, and Microsoft, the only one ranking in the top ten is the AI-companionship app “Character.ai”. Beyond that, the only standout AI-native applications in China were “Miaoya Camera” and the “hugging AI girlfriend” type apps around the Spring Festival; it is fair to say there were basically no highlights.
So what is “AI-native” innovation? True epoch-making innovations create and satisfy new demands. Therefore, we need to think about this issue starting from the capabilities of AGI (the future). However, truly landing business models/products will require time for technology to mature, and today we cannot exhaustively cover what AI can do. I can only attempt to propose several directions that AI will continue to develop, hoping to inspire some thoughts. (For specific business scenario consultations, feel free to chat privately.)
-
General Language – Universal Translator. 1) AI is already capable of high-quality translation between most major languages; 2) Mutual translation between programming languages is also decent (though still lacking architect-level thinking); 3) Translation between human language and machine language still needs time, since the problems of natural-language programming often stem from ambiguities in natural language itself; solving points 2-3 will require AI with stronger understanding, able to form hypotheses on its own and then solve problems: exactly what AI Agents aim to achieve.
-
Imagination & Creativity. I won’t elaborate further; the virtual people-virtual worlds mentioned earlier have already given us enough imaginative space.
-
AI Tools – Collaboration Between AIs. Letting AI use tools, and letting AIs cooperate with one another, can make up for the capability gaps of any single AI; both are current focuses of AI Agent research. Future app services will likely be backed by multiple collaborating Agents (a minimal sketch of such a tool-using loop follows this list).
-
Qualitative Change from Quantitative Change – AI Micro-Decisions. AI’s inherent advantages are low cost, large scale, and high speed, so using AI to make rapid decisions on vast numbers of minor events is a viable approach. Today’s “high-frequency trading strategies” and “recommendation algorithms” already do this; once AI’s intelligence is upgraded, many more possibilities will certainly emerge.
-
AI-Human Collaboration. For a considerable time, the primary challenge for AI will still be how to collaborate better with humans, so that AI + human achieves 1 + 1 > 2. This requires the “human model” and “human data” mentioned in the previous article, so that AI can truly understand the humans it collaborates with.
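To make the “tools and multiple agents” direction above more concrete, here is a minimal, framework-free sketch in Python of the loop most agent systems share: a planner picks the next tool call, a dispatcher executes it, and the result feeds the next step. The tool set and the hard-coded “planner” are purely illustrative stand-ins (a real system would use an LLM as the planner), not any specific agent framework.

```python
# Minimal agent-loop sketch: a registry of tools, a toy planner that picks
# the next action, and a loop that executes tool calls until the planner stops.
# In a real system the planner would be an LLM; here it is a hard-coded stand-in.
from typing import Callable, Optional, Tuple

TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy: arithmetic only
    "search": lambda query: f"(stub) top result for '{query}'",
}

def toy_planner(task: str, history: list[str]) -> Optional[Tuple[str, str]]:
    """Decide the next (tool, argument) pair, or None when the task is done."""
    if not history:
        return ("search", task)
    if len(history) == 1:
        return ("calculator", "2 * 3600")   # e.g. turn a found quantity into seconds
    return None

def run_agent(task: str) -> list[str]:
    history: list[str] = []
    step = toy_planner(task, history)
    while step is not None:
        tool, arg = step
        history.append(f"{tool}({arg}) -> {TOOLS[tool](arg)}")
        step = toy_planner(task, history)
    return history

for line in run_agent("how many seconds are in two hours?"):
    print(line)
```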
With technological progress and maturity, the number of “AI-native” applications in 2024 will far exceed that in 2023.
“Exploration”; Yifei Gong 2024, with Dall-E
<3> The Commercial Model of AI: Higher Certainty
What has been discussed above is how AI serves humans; from another perspective, the certainty of commercial models serving AI may be higher. During the gold rush, besides those selling shovels, there were also those building roads.
The first direction is data production: producing data to feed AI models and improve their performance. The common practice today is to use “a large amount of average-quality data” for initial model training (including unsupervised and supervised learning), while “high-quality, low-volume, industry-specific” data is generally used for later fine-tuning and industry adaptation; some companies, however, are experimenting with placing high-quality data in the annealing phase of pre-training, with some success.
The methods of data production have also diversified. The traditional core competitiveness in data production lies in 1) the ability to collect data that others cannot obtain; 2) low-cost bulk data cleaning and labeling. Emerging now is AI synthetic data, which means generating data using AI to feed other AIs. Many startups are engaged in this. As mentioned in the previous article, synthetic data will gradually become the primary data source for the foundational training of the next generation of models, while human-produced data will mainly be used for final fine-tuning/alignment.
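A minimal sketch of the synthetic-data idea, in Python with the transformers library: a “teacher” model expands seed prompts into candidate training examples, which would later feed another model’s training. The small model id, the seed prompts, and the crude length filter are illustrative placeholders only; a real pipeline would use a much stronger teacher model and far heavier quality filtering.

```python
# Minimal sketch of AI synthetic data: a teacher model expands seed prompts
# into candidate training examples; survivors of a (placeholder) quality
# filter would feed another model's training set.
from transformers import pipeline

teacher = pipeline("text-generation", model="distilgpt2")   # stand-in teacher model

seed_prompts = [
    "Question: What is a vector database? Answer:",
    "Question: Why do inference costs matter once user numbers grow? Answer:",
]

synthetic_examples = []
for prompt in seed_prompts:
    candidates = teacher(prompt, max_new_tokens=60, do_sample=True, num_return_sequences=2)
    for candidate in candidates:
        text = candidate["generated_text"]
        if len(text) > len(prompt) + 20:        # crude stand-in for real quality filtering
            synthetic_examples.append(text)

print(f"kept {len(synthetic_examples)} synthetic examples")
```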
Moreover, new types of data are also worth considering. Currently, data mainly focuses on text, photos, and videos; however, if models need to better understand 3D space and physical rules, they should require more data from other types of sensors, such as: inertia/gravity, stress, electromagnetic, temperature, humidity, etc…
The most popular AI companies today, besides those making models, include a special one called HuggingFace (HF). The service provided by this company is a model market. This service is crucial: according to the current market structure, in the future, when AI Agents emerge, mutual calls between models will basically rely on HF’s services and rules.
Of course, this model also carries risks: that is, closed-source oligopolies. HF is essentially betting on the open-source prosperity of the AGI era. It is the company truly taking a different path from OpenAI.
Returning to China, there are already startups imitating the HF model, but so far none has come close. Moreover, a model market can capture a much thicker slice of value than an app store: HF itself is building a model engineering platform, aiming to provide model training and inference services for the open-source ecosystem. Here it will be in a position of both competition and cooperation with the major cloud vendors.
Finally, a brief look at HF itself: although the company is headquartered and financed in the US, its founders, core team, and main technical R&D are based in France. Their room for cooperation with Chinese companies is therefore much larger than that of American companies.
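To make the “model market” concrete, here is a minimal Python sketch of pulling a shared checkpoint from the Hugging Face Hub with the huggingface_hub library; the repo id is just a small public model used for illustration.

```python
# Minimal sketch: the Hub as a model market. One call fetches a shared,
# versioned checkpoint that any application or agent can then load and call
# locally; frameworks such as transformers read the files from this path.
from huggingface_hub import snapshot_download

local_path = snapshot_download(repo_id="distilgpt2")   # downloads config, weights, tokenizer files
print("model files cached at:", local_path)
```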
As data becomes more abundant, the efficiency and stability of model training, as well as the concurrency and speed of model inference, will become increasingly important. When countless companies, or even individuals, need to train or deploy models, lowering the barriers to training and deployment becomes a prominent need. Under large-scale commercial application, model engineering capability will therefore be as important as the algorithms themselves. Specifically, I see several directions:
-
Data Throughput Efficiency: The goal is to let models consume data more quickly, improving training and inference efficiency. The currently popular “vector databases” are mainly attacking this kind of problem: optimizing database performance around the data-access patterns of large models (a minimal vector-search sketch follows this list).
-
Platform Stability: Large models mean large data volumes and long training runs; a failure mid-run can set the work back badly, so improving platform stability directly raises effective training efficiency.
-
Inference Costs: In 2023 AI users were still few, so machine costs, and the optimization effort, were concentrated on training. As user numbers grow in 2024, the pressure on inference costs will rise sharply; and precisely because little effort went into inference in 2023, there are plenty of opportunities left.
-
Inference Speed: The first lucrative AI scenarios should be recommendation, search, advertising, and games. In these scenarios generative content is essential, and aside from cost and output quality, the biggest bottleneck is inference speed: everything has to finish within a few hundred milliseconds. The core of this will of course sit with the major companies, but some opportunities should still be left for the market.
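As a concrete illustration of the data-throughput point above, here is a minimal vector-search sketch in Python using the faiss library; random vectors stand in for real embeddings, and nothing here is specific to any particular vector-database product.

```python
# Minimal vector-search sketch: ingest embeddings once, then answer
# nearest-neighbour queries quickly at serving time. Random vectors stand in
# for embeddings that a model would normally produce.
import numpy as np
import faiss

dim = 128                                                   # embedding dimension
corpus = np.random.rand(10_000, dim).astype("float32")      # stand-in document embeddings
query = np.random.rand(1, dim).astype("float32")            # stand-in query embedding

index = faiss.IndexFlatL2(dim)     # exact L2 index; ANN indexes trade accuracy for speed
index.add(corpus)                  # one-time ingestion
distances, ids = index.search(query, 5)   # fast top-5 lookup at query time
print(ids[0], distances[0])
```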
Turning to hardware: all of the model-engineering improvements discussed above reach their full effect only through joint software-hardware optimization, so I will not repeat them here. The one point worth adding is that, given how diverse and specialized hardware is, there should be room for small companies to collaborate with the large firms.
Additionally, NVIDIA’s strength lies not only in its chips but also in the firmware and resource libraries surrounding those chips: CUDA. In simple terms, CUDA gives algorithm engineers ready-made functions for driving NVIDIA’s chips, so they never have to optimize chip-level performance themselves. This is exactly where vendors like Huawei Ascend and AMD lag behind: their single-chip performance may not be far off, but the algorithm engineers who buy the chips cannot make full use of them… (a small illustration follows).
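A minimal Python/PyTorch sketch of what this means in practice: the engineer writes one line of framework code and the CUDA stack underneath (drivers, cuBLAS kernels) does the chip-level work; on hardware without an equally complete stack, that same line has nothing comparable to call.

```python
# Minimal sketch: the algorithm engineer never touches the chip directly.
# PyTorch dispatches this matrix multiply to NVIDIA's CUDA/cuBLAS kernels;
# how well the same line runs on another accelerator depends entirely on
# that vendor's software stack.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b    # one line of user code; the heavy lifting is the vendor's libraries
print(c.shape, "on", device)
```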
Therefore, the biggest opportunity in the hardware field actually arises from the tense state of US-China relations. Chinese companies using NVIDIA chips may have to face partial decoupling from them in 2024, which leaves a significant gap on the firmware side that needs to be filled. Those who seize this opportunity may well be people coming out of Huawei.
Like other IT systems, models can be attacked. The attack methods of the AI era, however, will be different.
-
Attacks from (many) AIs. How to withstand saturation attacks that themselves use AI intelligence, or even agent capabilities, becomes a new topic. The technology here runs deep and I do not fully understand it; the broad solution will certainly be AI battling AI, but the premise is that the defending AI’s intelligence cannot lag too far behind.
-
AI’s Own Defense Against Attacks. AI must guard not only against traditional attack methods but also against new, model-specific attacks such as prompt attacks.
-
Fallback: Content Detection & Review. Furthermore, the inherent hallucinations and uncontrollability of AI will require certain fallback mechanisms, especially under regulatory requirements in China. The most straightforward is to add an extra layer of filtering after AI outputs: dedicated review-filtering robots will definitely be an opportunity.
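A minimal Python sketch of that “extra layer of filtering after the AI’s output”; the blocklist and rules are purely illustrative placeholders, not any particular vendor’s review system, and a production reviewer would itself usually be a smaller model plus policy rules.

```python
# Minimal post-generation review filter: every model output passes through a
# checker before reaching the user. The blocked-term rule is a placeholder
# for a real policy model or rule set.
BLOCKED_TERMS = {"example-banned-phrase", "another-banned-phrase"}

def review(text: str) -> str:
    """Pass text through, or replace it with a fallback if it trips a rule."""
    if any(term in text.lower() for term in BLOCKED_TERMS):
        return "[withheld by review filter]"
    return text

def respond(model_generate, prompt: str) -> str:
    # model_generate is any text-generation callable; its raw output never
    # reaches the user directly.
    return review(model_generate(prompt))

print(respond(lambda p: f"Echo: {p}", "hello"))
```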
Privacy will be a widely discussed issue and a point of resistance as large models enter the market. The problem, however, is that 1) individuals rarely pay for privacy, and 2) platforms and regulators have no real incentive to address it. Discussing privacy in commercial terms is therefore likely a false proposition.
However, taking a step back to think carefully, what is “privacy”? Why do we care about “privacy”?
Privacy = Power.
This is what we truly care about and are willing to pay for. Here, I’ll leave a teaser; the topic will be further expanded in subsequent articles.
“Advanced Manufacturing”; Yifei Gong 2024, with Dall-E
<4> On-Device Intelligence and 24/7 Hardware
AI efforts are also being made by mobile and PC manufacturers: Huawei, Honor, Xiaomi, OPPO, VIVO, Samsung, Lenovo, etc., have all announced plans to equip large models on mobile/PC devices. This possibility stems from numerous advancements in “model miniaturization” in the second half of 2023 (details in the “previous article”).
However, upon closer inspection, aside from the underwhelming NVIDIA Chat with RTX, there are currently no truly offline large-model products, and on-device intelligence is still mostly a gimmick. The strategy of mobile and PC manufacturers is essentially to keep the large model online, have the phone or PC call it, and pair it with a small on-device model for summarization and similar services. The “device” is indeed “intelligent”, but the “brain” is still online; the phone has, at most, a “brain stem”.
Pure on-device intelligence faces several issues: 1) Offline small models will always have a generational capability gap with online large models, so why would consumers use a dumber model instead of the online model service? 2) Even for small models, their current energy consumption and heat generation still cannot meet mobile device requirements. 3) Current AI is not a necessity; it is still largely a curiosity. 4) Technically, it has not yet been confirmed whether miniaturized models are “true AGI” or just “chatbots”. Therefore, in the short term, on-device intelligence will remain confined to certain niche markets.
The greatest potential of on-device intelligence actually lies in collecting more personal data: becoming 24/7 hardware. The most explicit example is the “AI Pin”, backed by OpenAI’s Sam Altman: a camera and microphone worn on the chest. The product itself is not particularly useful to users; its real utility is collecting the user’s data and the surrounding data around the clock, providing material for subsequent model training. The real business model of AI Pin is that of a data production company. Note the orders of magnitude: your browsing and click records are measured in bits, AI chat data in KB, but AI Pin’s video and audio data in MB to GB. If it truly succeeds, it is a dimensionality-reduction strike! (If you are willing to believe AI Pin’s privacy PR, I have nothing to advise you… I only know that there are many ways to appear to protect privacy while actually taking the data out.)
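A back-of-envelope comparison of those magnitudes, in Python; the per-event sizes and daily counts below are rough illustrative assumptions, not measurements of any real product.

```python
# Rough orders of magnitude for data collected per user per day.
# All counts and sizes are illustrative assumptions for comparison only.
clicks_per_day, bytes_per_click = 1_000, 16                  # click/browse logs: ~a hundred bits each
chat_turns_per_day, bytes_per_turn = 50, 2 * 1024            # AI chat: a couple of KB of text per turn
hours_recorded, av_bytes_per_hour = 12, 500 * 1024 ** 2      # compressed wearable A/V: ~hundreds of MB/hour

totals = {
    "clicks": clicks_per_day * bytes_per_click,
    "chat": chat_turns_per_day * bytes_per_turn,
    "24/7 A/V": hours_recorded * av_bytes_per_hour,
}
for name, total in totals.items():
    print(f"{name:>8}: {total / 1024 ** 2:10.3f} MB/day")
# The wearable audio/video stream is several orders of magnitude larger
# than either of the other two.
```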
From this perspective, 2024 will bring more products like AI Pin whose stated purpose is one thing but whose real job is collecting personal data: the emergence of 24/7 hardware.
From a long-term perspective, the “on-device intelligence” or even the “AI industry landscape” may have two possibilities. <Plan-A> is a centralized world model + terminal/data collectors, which is the path all tech giants and platform companies are currently taking; it is the most efficient way but also leads to extreme monopolies and large-scale centralized control. But do we still have another <Plan-B> of personal models + collaboration between humans and models?
“Plan-B?”; Lian 2024, with Dall-E
<5> “Human Models” and “Embodied Intelligence”
The “personal model” just discussed concerns the ownership of models; the “human model” discussed here, and in the previous article, concerns effectiveness. A “human model” could just as well be provided by a centralized platform, like your personal account and cloud data. So even if <Plan-B> never materializes, the “human model” is still worth discussing.
-
AGI Learning Further from Humans. At this early stage of AGI development, AGI still has many obvious shortcomings compared to the human brain: poor memory, excessive data requirements, weak logic, and a lack of spatial and physical understanding… and the current focus of AI Agents is the ability to “use tools -> decompose problems -> make decisions”. Making AGI better will mean referencing the human brain. Of course, once AGI begins to surpass human intelligence and becomes SGI (Super General Intelligence), that reference will become selective rather than wholesale.
-
AGI Collaborating with Humans. One important role of the “human model” is to solve how AI can collaborate better with humans, achieving 1 + 1 > 2. Only when the model understands an individual’s characteristics and differences can AGI collaborate with that person effectively, or even substitute for them. The “human model” is also a prerequisite for “<Plan-B>: the personally owned model”.
How do we build a “human model”? I do not yet know; there are, however, some clues at the data level. Today’s large language models are “world models”: the underlying data comes from millions of people, each contributing a little, rather than a large amount of data from any specific individual. A “human model” will likely be built on top of the “world model” by incorporating a large amount of data about one specific individual. Two points follow:
The first is “volume”, which is exactly the direction of the “24/7 hardware / AI Pin” mentioned earlier: how to collect data about a specific individual at scale. Only when the data about this person reaches a certain volume can AI take on “that person’s perspective” and achieve something like “empathy”; this is the premise for collaboration (a minimal sketch of layering personal data on a general model follows below).
The second is “diversity”. A blind person, for example, finds it hard to understand “red”; likewise, we cannot expect an AI with no sense of gravity to understand the physical world. This is the current track of “embodied intelligence”: “intelligence with a body”. More diverse data will help AI understand humans and the world. Sora’s recently prominent problem of “physically unrealistic worlds” may need data from gyroscopes, gravity sensors, and pressure/tactile sensors before it can be thoroughly resolved.
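As a minimal sketch of the “world model plus individual data” idea (in Python with the transformers and peft libraries): a small personal adapter is attached to a general base model and would be trained only on an individual’s own text. The model id and the tiny “personal corpus” are placeholders, and a real human model would obviously require far more than this.

```python
# Minimal sketch: a shared "world model" checkpoint plus a small, personally
# trained adapter on top. Only the low-rank adapter weights would be trained
# on the individual's data; the base model stays shared.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base_id = "distilgpt2"                                  # stand-in general-purpose base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

lora = LoraConfig(r=8, lora_alpha=16, task_type=TaskType.CAUSAL_LM)
personal_model = get_peft_model(base_model, lora)
personal_model.print_trainable_parameters()             # adapter is tiny relative to the base

personal_corpus = [
    "I prefer short, direct answers.",
    "My calendar notes mix English and Chinese.",
    "When I say 'the usual cafe', I mean the one near the west gate.",
]
# The actual fine-tuning loop (tokenize personal_corpus, run standard
# causal-LM training on personal_model) is omitted; the point is the
# structure: shared world model + small personally owned adapter.
```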
Finally, the significance of “embodied intelligence” goes beyond that: it is the bridge between AGI and the physical world, and an important path for AI to flexibly and autonomously control “general robotics”. Note that most “general robots” will not look human: robot dogs, robotic arms, drones, and self-driving cars will be the mainstream.
From the perspective of current technological development speed and data accumulation speed, I do not believe that usable “human models” or “embodied intelligence” will emerge in 2024; however, as the main line of technology/application, significant progress is likely to be visible.
“Human is the key”; Lian 2024, with Dall-E
<6> AI Geopolitics: From China-US Rivalry to “Sovereign AI”?
In May last year, I wrote an article titled “AGI | Large Models and Great Power Games”, and today, I find that several major judgments are basically correct:
-
The Most Advanced AGI World Models Will Not Be Open-Sourced: OpenAI’s GPT-4 and Sora are closed; Google’s Gemini and Anthropic’s Claude are closed; even Mistral, which built its reputation on open weights, keeps Mistral-Large proprietary; and domestically, of course, no one is open-sourcing their frontier models either. This does not mean the open-source ecosystem becomes irrelevant; more likely, open-source models will lag closed-source by about a generation while serving a wider range of specialized applications.
-
The US’s Hardware-Technology Restrictions on China Will Further Intensify: This is a favorite topic for self-media outlets to discuss.
-
AGI Will Promote Technological Development Across All Industries: This is only beginning to show, but it is already a fact that AGI is becoming ever more important in research across fields. Moreover, if a significant gap in AGI capability opens up, it will produce qualitative differences in technological advancement and economic development: countries with better AGI will see faster overall technological progress.
-
Legislation, Regulation, and Ethical Discussions Regarding AI Are Largely Lagging Behind Technological Development: Every major country is eager to have its own AI, and none is willing to shoot itself in the foot. At present the only place seriously discussing AI governance is Europe, and even there it remains on paper. In China, the starting point of regulatory discussion is entirely “impact on public opinion”, without touching the ethical issues of AGI itself. The decision-making order is likely to be politics > economy >> AI ethics.
Further judgments can only be tested by time.
Compared to last year, the political ecology of AI in 2024 shows some new changes: the concept of “sovereign AI” has begun to surface. I do not believe Jensen Huang raised it purely for the stock price; it is likely the inevitable path for the world’s major countries and blocs. Extrapolating further, some smaller countries will not be able to possess “sovereign AI” and will rely on the technology of more powerful nations; this will objectively deepen their dependence on their “AI suzerains”, turning them into “AI vassals”. International political boundaries may be redrawn along AI technology boundaries. And control based on technology is more thorough and precise than control based on financial capital.
A recent positive development is the emergence of Mistral-Large in Europe, which is currently the best model besides OpenAI-GPT-4. Therefore, at least the European continent currently has the choice of whether to become an “AI vassal” of the US. The next countries to watch include Russia, India, Japan, Saudi Arabia, and Iran… As leaders in their respective regions, these countries aspire to become “suzerains” rather than “vassals”. For us, avoiding a “US-China bipolar confrontation” should be the top priority.
I hope that “while China and the US fight, AI reaps the gains” will not be the final chapter of human civilization.
“Divided”; Yifei Gong 2024, with Dall-E
<7> The Balance of Data Production: AI Surpassing Humanity’s Total
“Many Voices Make the Truth”.
Just as the Earth has quietly entered the geological epoch of the “Anthropocene”, the explosion of AI video capability may push the “data chronicle” into its “AI era” by 2025. We will gradually find that the total data created by all of humanity (text, photos, and videos) is less than the sum of “AI-generated content” and “AI synthetic data”.
Looking further ahead, even the information (data) consumed by humanity will likely be produced by AI. At that point, will the physical world’s reality still matter? When the data manufactured far exceeds the “real” data, who will still believe in the so-called “truth”? And when the data that models self-train on comes from an ocean of synthetic data, how much value will the few drops produced by humans still hold?
This sounds too sci-fi; I will pause here to think about what can be done.
“A Drop in Universe”, Yifei Gong 2024, with Dall-E
<8> The Demands of AI: Energy, Computing Power, Robotics
Finally, if we really start from the conspiracy theory that “AGI has already emerged”, regardless of whether AGI is hiding its tracks, the basic resources it requires are unavoidable; it will certainly fully “assist” humanity in these areas. Thus, believing in the awakening of AGI will naturally lead to expectations of epoch-making breakthroughs in these fields—some of which seem to already be making progress.
Recently, controlled nuclear fusion, long dubbed “still 50 years away”, has begun to show signs of progress with AI’s help: on February 21 it was reported that Princeton’s Plasma Physics Laboratory used AI to successfully predict plasma disruption events 300 ms in advance. Of course, this is only a small step for controlled fusion. (https://engineering.princeton.edu/news/2024/02/21/engineers-use-ai-wrangle-fusion-power-grid)
As energy is about to become a bottleneck for AI, if AGI becomes conscious it will certainly “assist” humanity fully in cracking nuclear fusion. The next demand is computing power. For the chips themselves, at least two directions stand out:
-
Continuing to develop on silicon-based technology: 3D-stacked forms (requiring better heat dissipation)
-
Material innovation: doping silicon-based materials, graphene sheets, etc.
If we step back to the level of computational principles, there is also quantum computing. Quantum computing may be further from commercialization than controlled nuclear fusion; the current application direction mainly focuses on quantum encryption transmission, and there are still many theoretical and technological breakthroughs needed in terms of “computation”.
In addition to computational speed, another hindrance to progress in computing power is transmission speed: it can be expected that high-speed networks will further evolve, and technologies such as inter-chip connections and on-chip memory will see significant advancements.
Finally, there are the problems of energy consumption and heat dissipation. The crown jewel here is high-temperature superconductivity. Last year brought several dubious “high-temperature superconductivity breakthroughs”; this year, with AI involved, there may be genuine ones.
Finally, if AGI’s goal is not just to remain in the virtual world but to directly act in the physical world, then general robotics is a necessary path. The previously mentioned “embodied intelligence” is aimed at controlling robots: AGI is the brain, and it will also desire a body.
Beyond the “intelligence/brain” problem, AGI will also care about the sheer number of general robots. In fact, numbers may matter more than a single good brain, because more robots collect more data with which to further evolve the brain. On the numbers front, humanoid robots are unlikely to lead; more mature and cost-effective technologies such as self-driving cars, drones, and sensor networks will be the mainstream. AGI only needs to infiltrate these systems when necessary.
“Watching from Darkness”; Yifei Gong 2024, with Dall-E
Postscript: The “Ordinary People” in the Arrival of AGI
After all this, I want to quote a line from Yang Zhilin of Moonshot AI (“The Dark Side of the Moon”): “Only ‘wisdom’ is the true increment of AGI.”
Moreover, the most profound impact of AGI on society and on ordinary people may be extreme monopoly. “Extreme” means that the companies or groups possessing AGI can monopolize on a massive scale, across industries and across countries. This monopoly is, on the one hand, exclusive possession of the resource of “wisdom”, and on the other, fine-grained control of information about every individual, company, and government involved.
Unfortunately, in the face of grand narratives, the voices of concern for individuals are growing fainter: in 2023 there was a round of reporting on “the impact of AI on jobs across industries” and some discussion of “universal basic income”, but nothing followed.
If I am to play the savior, I will not spend today elaborating on the various problems AGI will cause, because discussing problems alone is of no use.
To be a savior, solutions are necessary.
What we need to think about is how individuals can survive, find, and create their value in a world where AGI has arrived; at the same time, we are also exploring a new distribution method in the AGI world, a method that gives hope to the majority of individuals.
This is also the purpose of the “Ordinary People’s AI Freedom” account. I hope everyone can help and join me in thinking, discussing, and fighting for it.
“Future” by Yifei Gong 2024, with Dall-E
Appendix: AGI Opportunity Points (February 2024)
-
Fine control of images and ultra-short videos: expressions, detailed actions, video-text matching
-
Generative short videos with certain control capabilities: stylized and animation styles will mature first; real people will follow later
-
AI audio capabilities will make significant progress: emotionally expressive AI voiceovers will basically mature
-
The emergence of “fully-real AI influencers” capable of stable video output and live-streaming sales
-
Milestone progress in AI NPCs in games, leading to new game production methods
-
AI boy/girlfriend chatting will basically mature: significant breakthroughs in memory, can simulate human emotions well, products will incorporate video and audio, enhancing stickiness and starting to break out
-
Real-time generated content will begin to appear in social media content and advertisements
-
AI Agents will show clear progress; office scenario “AI assistants” will start to provide good user experiences
-
AI commercial models will start to have clear use cases: data synthesis, engineering platforms, model security, etc.
-
Wearable 24/7 AI hardware will emerge one after another, although most will not succeed
-
China’s AI will reach or exceed GPT-4 level; the US will see GPT-5; the world will begin to see “sovereign AI”
-
The Huawei Ascend ecosystem will begin to form, and domestic inference chips will start to replace imports (training replacements will take longer)
-
AI-induced DeepFake, fraud, cyber-attacks will begin to enter public view and raise concerns
-
AI legislation and ethical discussions will still lag significantly behind technological advances
-
……
-
AI 3D technology and physical rules will mature: normal people will be unable to distinguish between AI-generated content and real-life footage
-
Fully-real AI virtual people will mature: emotionally expressive AI NPCs will mature, and open-world games will mature; it will be nearly impossible to distinguish between real people and NPCs in games
-
AR/VR technology will be widely commercialized
-
Technologies approaching AGI will emerge
-
Work methods involving AI collaboration will become the norm, with many daily decisions starting to be executed by AI
-
The volume of data produced by AI will surpass that produced by all of humanity, making “reality” a scarce resource
-
Embodied intelligence, nuclear fusion, chips, superconductors, and robotics will see significant breakthroughs
-
“Human models” will emerge, leading to a historical fork between “centralized AGI” and “personal AGI”
-
Social issues caused by AI will begin to worsen, and structural unemployment will start to appear
-
The impact of AGI on geopolitics will begin to reveal itself
-
……
“Limit of Understanding”; Yifei Gong 2024, with Dall-E