Understanding Deepfake Technology and Its Implications

In 2014, Generative Adversarial Networks (GANs) were invented, marking the dawn of AI-created virtual worlds. Since then, "AI deep synthesis" has gradually moved from the laboratory into everyday life, becoming part of ordinary people's digital experience. The face-swapped military-uniform photos shared on social media, the lifelike virtual anchors, and the celebrity voices in navigation apps are all applications of this technology. While deep synthesis brings novel experiences, it also raises serious concerns, with the shadow of "deepfakes" lingering. With deepfakes, the saying "seeing is believing" becomes outdated, and people worry that this technology will lead us into a world where truth and falsehood are indistinguishable, putting individual security and social trust at stake.
How should we understand the present and future of this technology? How can the concerns surrounding deepfakes be resolved? How can deep synthesis technology develop in a positive direction? "Tencent Dialogue" invited Si Xiao, head of Tencent Research Institute, along with four experts to discuss these questions.


Dialogue Guests

Yu Yang, Head of Tencent Xuanwu Lab
Liu Wei, Head of Tencent Computer Vision Center / Outstanding Scientist
Zhang Wensheng, Deputy Chief Engineer of the Institute of Automation, Chinese Academy of Sciences, Chief AI Expert of USTC
Lu Wei, Producer of "The King's Return"

Host

Si Xiao, Head of Tencent Research Institute

Synthesis = Forgery or Creativity?

Si Xiao: What applications of deep synthesis technology exist around us?
Liu Wei: Many people associate synthesis with forgery, but that is not the case. We have many positive applications, such as image restoration, which is also a form of synthesis. Tencent AI Lab is discussing with Dunhuang and the Palace Museum how to use AI technology, specifically GAN technology, to restore ancient paintings, using AI imagination to repair damage.
Lu Wei: The application of deep synthesis in films is increasing. For example, in “The King’s Return,” there are over a hundred characters, and many storylines that need to be visualized in virtual scenes. We also used AI sound effects to establish the virtual personalities of characters. The re-release of the old film “Shangganling” utilized AI for color restoration, with good results. The China Film Archive has many old film resources that can be revitalized through technology, achieving 2K or even 4K clarity.
Zhang Wensheng: Currently, everyone focuses on voice and image deep synthesis, but there are many more things that can be synthesized. For instance, an intelligent traffic system in a city is also a form of synthesis, merging our physical space with digital space.
Si Xiao: Elon Musk mentioned in an interview that humanity might be living in a simulated world because of the rapid advancement of our ability to simulate through technology. One day, we will be able to use a completely digital method to recreate or simulate a world that is indistinguishable from reality. Let’s think outside the box and boldly predict the future of deep synthesis.
Lu Wei: Current film simulation technology can already produce results indistinguishable from reality. In filmmaking, we use every means to immerse the audience in a virtually real world: the environment, lighting, air density, illumination, and color temperature, as well as the movements, embraces, and conversations of characters, must all feel real to the audience. Virtual anchors are also emerging, such as Luo Tianyi and Hatsune Miku, which represent version 1.0 of virtual characters. In the next 5 to 10 years, I believe the market for virtual stars will far exceed that of real-world stars.
I believe the future will see the concept of "no boundaries" emerge. Especially in the age of AI and VR, audiences will have not only sensory experiences such as sight, sound, and taste, but also emotional experiences that should feel even more real and heartfelt. Sometimes I think that in the future, in a truly immersive virtual world, the boundary between the virtual and the real will gradually blur. Current CG (computer graphics) technology and computing power are continuously expanding our understanding of the world.
Zhang Wensheng: Our requirements for the "depth" of deep synthesis should improve in step with aesthetic standards. For example, the AI anchors we see started as 2D, then combined voice and image, gradually perfecting lip-syncing. I think within the next three years this could evolve into fully three-dimensional scenes.
As long as there is demand and application scenarios, this technology will have a market. Taking film production as an example, when we shoot dangerous scenes, we can use deep synthesis technology to complete them, reducing the risk of injury to actors. For large-scale scenes, the cost of practical shooting is very high, and sometimes it is impossible to achieve, such as wanting to show thousands of sheep running on a mountain. But with deep synthesis, we can even depict details like the sheep’s whiskers. Additionally, some imaginative content, like attaching animal movements to human bodies, can also be achieved through technology. As long as there are these demands, significant breakthroughs in technology will occur. Ten years ago, we were working on glasses-free 3D, and now, if glasses-free 3D technology is combined with deep synthesis technology, I believe the market will be even larger.
Si Xiao: Where do we see the gaps between these synthesized actions and real human actions? What bottlenecks currently exist?
Yu Yang: The most important aspect that comes to mind is emotions. People watch movies or TV shows and sometimes encounter scenarios where a character’s loved one passes away, and they might seek a witch or wizard to bring back the soul of their loved one for one last conversation. Now, technology allows us to have this kind of magic. In the future, there will certainly be people with such ideas and needs, which will pose new demands on technology.
Liu Wei: The technology we currently call “deep synthesis” uses deep neural networks, whereas previous synthesis methods relied on statistical approaches, which strictly speaking cannot be called “deep synthesis.” Statistical methods involve collecting samples similar to what we want to synthesize and using statistical techniques to “guess” and piece these samples together. Since the invention of GAN in 2014, the development of related applications has accelerated rapidly.
However, in terms of image generation, despite the significant progress in technology over the past six years, we are still far from the level of high automation and high freedom we envision. What does high automation mean? For example, to generate a highly realistic human face image, the resolution of that image needs to be as high as possible. But with higher resolution, the computational power required becomes immense, making it impossible to achieve on a mobile device; it must be done in the cloud using multiple GPUs.
Similarly, in terms of freedom, our current synthesis technology has significant limitations. In simple terms, freedom means being able to generate exactly what we want, which is still very difficult. In the early stages of GAN development, we used noise to generate a human face, but the output result was uncontrollable. Only this year have we been able to specify the attributes of the generated human face. The neural network model parameters we need to train are enormous, and the training methods are also very complex.
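The contrast Liu Wei draws between uncontrollable noise-driven generation and attribute-specified generation can be illustrated with a toy sketch. Everything below is invented for illustration (the dimensions, the linear "generator" maps, the attribute vectors); a real GAN generator is a deep network trained adversarially against a discriminator, not a fixed random matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for trained generators: fixed random linear maps from a
# noise vector to a flat 8x8 "image". A real generator would be a deep
# convolutional network learned through adversarial training.
NOISE_DIM, ATTR_DIM, IMG_DIM = 16, 4, 64
W_uncond = rng.normal(size=(IMG_DIM, NOISE_DIM))
W_cond = rng.normal(size=(IMG_DIM, NOISE_DIM + ATTR_DIM))

def generate(noise):
    """Unconditional generation: output is determined by noise alone,
    so the caller cannot steer what comes out."""
    return np.tanh(W_uncond @ noise)

def generate_conditional(noise, attributes):
    """Conditional generation: an attribute vector (e.g. encoding age or
    pose) is appended to the noise, letting the caller specify
    properties of the output."""
    return np.tanh(W_cond @ np.concatenate([noise, attributes]))

z = rng.normal(size=NOISE_DIM)
img_a = generate(z)                                           # uncontrollable
img_b = generate_conditional(z, np.array([1.0, 0.0, 0.0, 0.0]))
img_c = generate_conditional(z, np.array([0.0, 1.0, 0.0, 0.0]))
# Same noise but different attributes yields different outputs,
# which is the "freedom" being discussed.
print(img_a.shape, np.allclose(img_b, img_c))
```

The point of the sketch is only the interface difference: `generate` gives no handle on the result, while `generate_conditional` exposes one, which is what "specifying the attributes of the generated face" amounts to.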
Despite the various technical challenges we face, I firmly believe that the images generated using artificial intelligence technology will continue to improve and approach perfection.

“Deep synthesis itself should be an empowering technology”

Si Xiao: Currently, deep synthesis technology faces stigmatization, with the issue of “deep forgery” being prominent. How can we ensure that this technology is not used for fraud or deception?
Yu Yang: Can fake images or sounds deceive people? Can they deceive machines? These are actually two separate questions. In some scenarios, humans are easier to deceive than machines. For example, if a scammer calls you pretending to be your classmate, you might notice the accent and think, "How can my classmate from Shandong have a Fujian accent?" But in the future, that might change. The scammer may first call your classmate, record their voice, and then synthesize it. After a decade of not seeing each other, you might think, "That sounds just like my old classmate," and believe it.
Regarding whether machines can be deceived, from a technical perspective, it’s a battle between machine forgery and machine detection. Neither side is likely to achieve a hundred percent advantage, but when it comes to specific business applications, sometimes compromises need to be made. This can actually provide convenience for fraudsters.
Liu Wei: In China, Tencent has made significant breakthroughs in digital forgery detection, or, in more academic terms, information countermeasure techniques. Our accuracy rates for voice, image, and video detection are quite high. Of course, as synthesis technology improves, our defensive capabilities must also improve, as in a duel.
For example, with current image forgery detection technology, we have a process where we first use adversarial generation technology to create highly realistic samples and then feed the data to computers for classification, allowing them to learn through multiple iterations. I remain optimistic about the development of forgery detection technology. Even in international digital forgery detection competitions, the methods used by the winning teams are generally similar. The cost of forgery is far greater than that of detection, meaning that even if someone goes to great lengths to forge something, we can detect it quickly.
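The detection pipeline Liu Wei describes, generate realistic fakes, label them, and train a classifier over repeated rounds, can be sketched in miniature. This is a minimal toy, not Tencent's system: the "real" and "forged" samples here are just Gaussian feature vectors with different means, and the detector is plain logistic regression trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(42)
D = 8  # toy feature dimension

# Stand-ins for the pipeline: features of genuine media vs. features of
# synthesized media. In practice these would come from real images or
# audio and from an adversarial generator, not from Gaussians.
def real_batch(n):
    return rng.normal(0.0, 1.0, size=(n, D))

def fake_batch(n):
    return rng.normal(1.0, 1.0, size=(n, D))

# A simple logistic-regression detector, trained over many iterations
# on freshly generated labeled batches (label 1 = forged).
w, b, lr = np.zeros(D), 0.0, 0.1
for _ in range(300):
    X = np.vstack([real_batch(64), fake_batch(64)])
    y = np.concatenate([np.zeros(64), np.ones(64)])
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted forgery probability
    w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on weights
    b -= lr * np.mean(p - y)                 # gradient step on bias

# Held-out evaluation: how often the detector labels samples correctly.
X_test = np.vstack([real_batch(500), fake_batch(500)])
y_test = np.concatenate([np.zeros(500), np.ones(500)])
pred = (1.0 / (1.0 + np.exp(-(X_test @ w + b)))) > 0.5
accuracy = np.mean(pred == y_test)
print(f"detector accuracy: {accuracy:.2f}")
```

The design mirrors the "duel" framing: each training round consumes newly generated fakes, so as the generator improves, the same loop can keep retraining the detector on harder samples.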
Zhang Wensheng: This also involves business models, specifically how to economically incentivize the development of anti-fraud technologies. Additionally, from the perspective of combating fraud, adding semantic verification may make detection easier and less costly. For instance, in the case of phone scams, if the scammer uses a synthesized voice, I could immediately ask a question involving private knowledge, which the other party would not be able to answer; that is the semantic layer.
Si Xiao: What further actions should our industry take to ensure that deep synthesis technology operates rapidly in a positive direction?
Yu Yang: Looking at the entire history of human development, it is essentially a process where the power of individuals—be it constructive or destructive—has been increasing. This is unavoidable. In any case, rashly suppressing technological development is not right. Laws should not prohibit the development of technology but rather regulate how it is used positively. I believe this is the key.
Liu Wei: I personally hope to see policies that encourage innovation in artificial intelligence technology, especially in the application of film and television content, to improve efficiency and reduce production costs. I also hope for innovations in cultural education, including colorizing and restoring old photos and films, as well as applications in online education. All of these are correct directions for guidance. With the gradual improvement of legislation and detection technologies, I believe that technological forgery should be a low-probability event, and overall, deep synthesis technology should be directed positively.
Zhang Wensheng: Artificial intelligence itself is an empowering technology. Our work in technology revolves around improving people’s lives and societal progress. Deep synthesis will develop into a new industry, where technical personnel worldwide will contribute their efforts. At the same time, we need relevant regulations to constrain the illegal and socially detrimental uses of technology. Those who create technology must consider how to update it in this “duel” situation, helping governments and citizens use technology to solve problems.
Lu Wei: For the film industry, the core concern regarding deep synthesis is copyright protection. The ownership of virtual assets and characters needs to be defined legally. Beyond technological development, adding copyright management and respect will lead to a healthier industry.

“Using the warmth of humanity to promote AI to continuously do good”

Si Xiao: What expectations do you all have for the development of deep synthesis?
Yu Yang: Before modern technology, most of humanity lived in a small world. However, modern technology, represented by the internet, allows each of us to connect with the entire world. As technology progresses, deep synthesis technologies can take us further. This technology can create beauty that does not exist in the physical world, allowing us to experience it, and it will inevitably elevate the beauty of human life to the next level.
Liu Wei: I look forward to deep synthesis technology facilitating cultural education and creative industries, nurturing better digital detection technologies. Everything we discussed today, whether synthesis or detection, is ultimately driven by the human brain and creativity. I hope AI will embody more warmth so that we can continuously promote AI to do good.
Zhang Wensheng: I hope technology companies can turn deep synthesis technology into a business model that attracts more technical talent to create better products. Technology companies also have a responsibility to work with the government to monitor the development of deep synthesis technology, ensuring it operates on the right, positive path.
Lu Wei: I hope that through deep synthesis and computer networks, making films, videos, and virtual worlds will become increasingly simple, making our works more heartfelt and allowing more people to scan themselves into the virtual world. I hope that in this world, the brilliance of humanity shines in more corners.
