In early December 2024, a promotional video appeared online, featuring Dr. Zhang Wenhong, the director of the National Center for Infectious Disease Medicine and head of the Infectious Diseases Department at Huashan Hospital affiliated with Fudan University, passionately promoting a certain food product.
Many netizens, trusting Dr. Zhang, not only made purchases themselves but also actively shared the video in family group chats. However, Dr. Zhang quickly clarified that he had not recorded the video; it had likely been fabricated using AI.
Screenshot from the internet
In recent years, with the rapid development of artificial intelligence technology, such incidents of AI-fabricated videos have frequently come to light. Criminals have exploited AI technology to impersonate celebrities for fraud, publish false advertisements, and even create fabricated videos to swindle large sums of money, severely threatening the social trust system and information security.
Today, we will analyze such incidents from a technical perspective.
Technology Behind AI Fabricated Videos:
From Generative Adversarial Networks to Deepfake
The technology used for these AI-fabricated videos is not new; research on face swapping and image generation began in academia in the 1990s. In 2014, Ian Goodfellow proposed Generative Adversarial Networks (GANs), which enabled computers to generate more realistic and high-quality images.
Generative Adversarial Networks consist of two parts: a generator and a discriminator. The generator creates content, while the discriminator assesses whether the created content is real and provides feedback to the generator. During training, the two engage in a cat-and-mouse game, enhancing each other’s capabilities, ultimately allowing the generator to produce highly realistic images.
In 2017, someone created an account named “deepfakes” on the online community Reddit and posted several celebrity face-swapping videos made with GAN technology. From then on, the term “Deepfake” began to be widely used, typically referring to the technology behind such face-swapping videos.
In January 2018, a desktop application called FakeApp was released. Soon after, open-source tools like Faceswap and DeepFaceLab emerged with similar functionalities. These tools significantly lowered the barrier to entry for creating Deepfake videos, allowing ordinary users without specialized knowledge to easily generate such content.
With continuous advancements in related technologies, today's Deepfakes not only produce higher-resolution videos with more natural facial expression synchronization but also require less data and shorter training times. Early Deepfake training often required hundreds to thousands of images of the target person, or videos ranging from minutes to hours, to capture facial features from different angles and expressions. However, alongside advances in GAN technology itself, techniques like Transfer Learning and Few-Shot Learning have emerged, allowing Deepfake videos to be generated from just dozens of photos, or even a single one.
Copyrighted images, reproduction may lead to copyright disputes
Additionally, early Deepfakes could only generate images; now, combined with voice cloning and voice style transfer technologies, they can also generate target voices that are indistinguishable from the real ones and synchronize lip movements with the audio.
In simple terms, now, all it takes is a single photo and a few seconds of audio to create a Deepfake video. Of course, with more photos and longer audio, the generated video will be even more realistic.
Positive Applications and Negative Impacts of Deepfake Technology
Although Deepfake is a form of “fabrication,” it can have many positive applications when the consent of the “fabricated” individual is obtained. For example, in 2019, British football star David Beckham released a video calling for the eradication of malaria. In the video, he used nine languages, including Swahili and Yoruba, while all but English were generated using Deepfake technology.
Moreover, the various technologies used in Deepfake have broad applications in areas like digital humans, video streaming, film production, education and training, and psychological rehabilitation.
However, any technology can be misused. Before the previously mentioned fake Dr. Zhang Wenhong incident, there had already been cases of fake Jin Dong, fake Jack Ma, fake Lei Jun, and fake Yu Donglai. Impersonating celebrities to sell products is not the worst; Deepfake technology has also been used in many more malicious ways, such as fraud.
On December 20, 2024, the BBC reported a story titled “Love Scammer Uses Deepfake to Scam Me Out of £17,000.” The victim was 77-year-old Nikki MacLeod. The scammer told her he worked on an oil drilling platform and asked Nikki to purchase Steam gift cards and transfer money via PayPal to obtain internet access on the platform and cover travel costs to meet her in Scotland. Nikki was initially skeptical but believed him after seeing the video he sent from the drilling platform.
In January 2024, an employee from a Hong Kong company transferred $25 million from the company’s account to a scammer. The scammer conducted a video call with him, posing as the Chief Financial Officer, and during the call, the employee not only saw the “CFO” but also other “colleagues.”
According to a report published by the renowned accounting firm Deloitte in May 2024, Deepfake fraud in the U.S. increased by 700% in 2023, resulting in losses of $12.3 billion, a figure that could reach $40 billion by 2027.
Besides fraud, Deepfake technology can also be used to bypass facial recognition systems. Many mobile applications use facial recognition to verify user identity. To prevent impersonation using photos, modern facial recognition typically performs liveness detection, such as 3D structured-light liveness recognition, which requires special hardware that only some phones support.
Currently, many applications still use 2D facial recognition based on the phone's front camera. Its liveness detection primarily requires users to blink, nod, or perform other actions, or flashes lights of specific colors on the screen. As a result, some individuals have used Deepfake technology, combined with other methods, to deceive facial recognition systems and steal online accounts.
How to Identify and Detect Deepfake Videos?
Poorly made Deepfake videos can often be identified with the naked eye: unnatural facial expressions or eye movements, too few blinks, blurry facial edges or unnatural transitions with the background, and lighting on the face inconsistent with the surrounding environment. However, as Deepfake technology advances, these telltale flaws are becoming less common.
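One of the cues above, blink frequency, lends itself to a simple heuristic check. Below is a hedged sketch that assumes an eye-aspect-ratio (EAR) signal has already been extracted from facial landmarks (in practice via a landmark detector); the synthetic trace, the 0.20 closed-eye threshold, and the 8-blinks-per-minute cutoff are all illustrative assumptions, not established standards.

```python
import numpy as np

# Hypothetical eye-aspect-ratio (EAR) trace sampled at 25 fps for 60 s.
# We synthesize a trace with only 3 blinks in a minute - far below the
# typical human rate of roughly 15-20 blinks per minute.
fps, seconds = 25, 60
ear = np.full(fps * seconds, 0.30)      # eyes open
for start in (200, 700, 1200):          # three brief blinks
    ear[start:start + 4] = 0.10         # eyes closed for ~4 frames

closed = ear < 0.20                     # frames with eyes closed
# Count blink events as transitions from open to closed
blinks = np.count_nonzero(closed[1:] & ~closed[:-1])
blink_rate = blinks / (seconds / 60)    # blinks per minute

print(blinks, blink_rate)
if blink_rate < 8:
    print("Abnormally low blink rate - possible Deepfake")
```

A real detector would of course combine many such cues rather than rely on any single threshold.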
Currently known Deepfake technology can imitate facial expressions but cannot replicate the deformations that occur when a face is subjected to external pressure. Therefore, during a video call, if you suspect Deepfake fraud, you can ask the other party to press one side of their nostrils with their index finger or press one side of their cheek.
In addition to visual identification, using AI to recognize AI-generated content is also a popular research direction. For instance, synthetic videos may have discontinuities between frames, and performing temporal consistency analysis may reveal anomalies. Additionally, a person’s heartbeat can cause subtle color changes in the skin that are rhythmically consistent with their pulse, and Deepfake videos may lack this feature.
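As a rough illustration of temporal consistency analysis, the toy example below treats a "video" as an array of grayscale frames with slowly drifting brightness, injects one inconsistent frame of the kind frame-by-frame synthesis can leave behind, and flags transitions whose inter-frame change is a statistical outlier. The median + 6×MAD threshold is an illustrative choice, not a published detector.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "video": 60 frames of 32x32 grayscale with slowly drifting brightness
frames = np.stack([
    np.full((32, 32), 100.0 + t * 0.5) + rng.normal(0, 1.0, (32, 32))
    for t in range(60)
])
# Inject a temporal glitch: frame 30 is noticeably brighter than its neighbors
frames[30] += 25.0

# Mean absolute difference between consecutive frames
diffs = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

# Robust outlier threshold: median + 6 * median absolute deviation (MAD)
median = np.median(diffs)
mad = np.median(np.abs(diffs - median))
suspect = np.where(diffs > median + 6 * mad)[0]
print(suspect)  # indices of transitions with abnormal change
```

In this toy case the transitions into and out of the glitched frame stand out sharply; real detectors work on learned features rather than raw brightness, but the underlying idea of comparing adjacent frames is the same.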
However, we must also recognize that the GAN technology used in Deepfake inherently consists of a generator and a discriminator, meaning that any detection technology can also be incorporated into the Deepfake’s discriminator, thereby generating content that is difficult to detect.
In recent years, China has implemented regulations such as the “Internet Information Service Deep Synthesis Management Regulations” and the “Interim Measures for the Management of Generative Artificial Intelligence Services,” but these laws mainly constrain the platforms providing related services. With the advancement of software and hardware technology, Deepfake can now be completed using models running on personal computers.
Therefore, to address the various issues posed by Deepfake, a comprehensive governance mechanism is needed, requiring multilateral collaboration among technology, platforms, and law in the future. Impersonating celebrities using Deepfake to deceive others into purchasing products not only potentially violates Articles 1019 and 1024 of the Civil Code, infringing on others’ portrait and reputation rights, but may also involve fraud under Article 266 of the Criminal Law and false advertising under Article 222 of the Criminal Law. Actions involving the use of Deepfake that are illegal or criminal must be prosecuted in accordance with the law.
Planning and Production
Author丨Yu Yang, Head of Tencent Xuanwu Laboratory
Reviewed by丨Yu Naigong, Head of the Robotics Engineering Program at Beijing University of Technology, Director of the Robotics Research Center at the Beijing Academy of Artificial Intelligence, Doctoral Supervisor
Zhao Jingwu, Associate Professor at the School of Law, Beihang University, Deputy Director of the Key Laboratory of Law and Strategy for Industrial and Information Technology of the Ministry of Industry and Information Technology, Deputy Director of the International Governance Research Base for Cyberspace
Editor丨Lin Lin
Proofreader丨Xu Lai
The cover image and images in the article are from copyrighted stock images