The Logic of Generative AI Illusions and Its Cultural Consequences


Author Introduction


Jing Yulun

Female, from Yangzhou, Jiangsu, PhD student at Fudan University School of Journalism.


Zhang Dianyuan

Male, from Changchun, Jilin, Professor at Fudan University School of Journalism, PhD, doctoral supervisor.

The original text was published in “Shandong Normal University Journal”, 2024, Issue 5, in the “Digital Communication and Social Governance” column. Please indicate the source when reprinting.

Abstract: Throughout human civilization, many media have had an inherently deceptive nature. It is precisely this deceptive characteristic that leads media to continually create illusions in the process of mediating reality. As a new type of medium, generative AI also possesses deception, and it has surpassed all previous media in creating illusions, manifested in forms such as “AI hallucinations”, “simulated images”, and “deep fakes”. Its software systems for automated production, based on probability and simulation, constitute generative AI’s complex and massive deception mechanism, which its interactive interfaces conceal beneath the surface. Generative AI has constructed a “hyperreal” world, and its illusions correspond to Baudrillard’s prophecy of “fractal culture”. The ultimate cultural consequence is that, in the illusion universe constructed by generative AI, the distinction between true and false no longer matters, leading us into a media environment and new cultural forms that “relentlessly produce illusions”, in which the potential for technological evil may be unleashed as never before.

Keywords: Generative AI; Illusions; Media Deception; Hyperreality; Fractal Culture

Reading Guide

1. Reality, Illusions, and Media Representation

2. Media Deception and Illusion Manufacturing

3. The Deceptive Nature of Artificial Intelligence and Generative AI Illusions

4. Automated Production, Interactive Interfaces, and Generative AI Illusions

5. Generative AI Illusions and Hyperreal Construction

In November 2022, OpenAI launched ChatGPT, allowing us to witness the rapid development of text-to-text and text-to-image generation. Just over a year later, in early 2024, OpenAI released the text-to-video large model Sora. Sora can quickly generate coherent, high-quality videos of up to 60 seconds based on user prompts, presenting not only the content described in the prompts but also logical multi-angle shots and scene transitions. More importantly, the fluency, clarity, and realism of Sora’s generated videos are almost indistinguishable from real-life footage: unless the video’s source is marked, users cannot tell that it is a simulated image generated by Sora. While we marvel at the imagination and creativity of generative AI, we also find that generative AI is not “perfect”: both ChatGPT and Sora produce “generative AI illusions”, represented by erroneous reproduction, erroneous generation, and AI hallucinations. Sora, with its ability to simulate the physical world, further challenges the definition of “what is real”; the deceptive characteristics of generative AI are evident. So, what do the deception and illusions of generative AI mean in relation to reality? This article attempts to gain insight from the continuous rather than fragmented history of media: rethinking the relationship between reality, illusions, and media representation; examining the commonalities and differences in deception across different media on the basis of media’s inherent deceptive characteristics; clarifying the relationship between media deception and the manufacture of illusions such as erroneous reproduction, erroneous generation, and falsehood; and thereby gaining a deeper perspective on the essence of generative AI illusions, the sources of their deception, and the cultural consequences that illusion manufacturing may trigger.

1. Reality, Illusions, and Media Representation

Before delving into discussions of media illusions, media deception, and generative AI, a key question must be clarified: what is reality? From a philosophical perspective, “reality” expresses both a narrative of the overall “truth” of the “world” and humanity’s pursuit of comprehensive “truth”. Philosophers of modernity and of postmodernity are deeply divided on the proposition of “what is reality”. Specifically, modernity emphasizes “rationality” and “science”, with “Logos” usually interpreted as rationality, judgment, concept, definition, reason, and relation, meaning “the most real existence” and “the most fundamental truth”, forming the fundamental scale and law to which all humans and things must submit. In the sense of modernity, rationality serves as the ultimate basis and origin of human thought and behavior, enabling conscious understanding and reflection. The natural sciences based on rationality are important products of the process of modernity: if a scientific principle yields the same result under identical conditions, then that principle is real. Indeed, within the perspective of modernity, the development of industrial society and the progress of human civilization driven by science and technology have produced an unprecedented insistence on uniqueness, homogeneity, and certainty; modernity’s viewpoint therefore asserts the existence of absolute reality. At the same time, it is scientific principles and technologies that reproduce the things that truly exist, making reality become truth, so that reality becomes knowable and certain. “Reality” is therefore always linked to reproduction by technology, media, and symbols: when we say something is real, we mean that it can be reproduced in the same way. The perspectives of postmodern philosophers, however, stand in stark contrast.
Postmodernity emerges against the backdrop of a post-industrial society characterized by overproduction. Committed to deconstruction, it attempts to challenge, undermine, and deny all of society’s traditional, consensual values and universal laws, including the rationality and science emphasized by modernity, replacing them with diversity, heterogeneity, and differentiation. Postmodern philosophers likewise critique the modern view of reality and examine “reality” through the lens of relativism. They argue that there is no singular “reality” in the world; “reality” is relative, heterogeneous, and pluralistic, and there is no objective reality that stands as an unquestionable truth requiring no verification and admitting none. They are not obsessed with pursuing a singular reality but are wary of it. At the same time, as modern industrial society attempts to stimulate consumption through media and symbols to resolve its problem of overproduction, postmodern philosophers focus on the relationship between reality and media representation, launching strong critiques of the illusions constructed by symbols and media. Thus, Western scholars summarize postmodernity’s cultural trends or aesthetic movements as the “crisis of representation”.

Postmodern philosophers point out that reality has been replaced by illusions constructed by media representation or signification. In this regard, French thinker Guy Debord writes in “The Society of the Spectacle”: “In a society dominated by modern production conditions, life is presented as an accumulation of countless spectacles. Everything alive retreats into representation.” The “spectacle” or what postmodern theorists call “simulacra” is an illusion supported by appearances, obscuring the social existence itself. The Oxford Dictionary defines “simulacra” in two aspects: one is the image of things; the other is a deceptive substitute. French philosopher Jean Baudrillard views “simulacra” as “the elements of coding rules and the pseudo-phenomena of media technology operations”, a process where “fictional” or “mimetic” things replace the “real”. Entering the “simulation” stage, “simulacra” represent the massive copying and hyper-real existence without origins, object references, or foundations in postmodern society, including images, symbols, representations, or spectacles. It is precisely the “simulacra” that replace “real life” and the objects and realities existing in the real world.

After clarifying the academic divides regarding “what is reality” and “reality and media representation” between modernity and postmodernity, it becomes clearer to see: the modern view of reality is that “reality is knowable, certain, and singular”, and media can reproduce reality; while the postmodern view of reality is that “reality exists but is unknowable”, and media cannot reproduce reality. Specifically, the modern viewpoint is that technology, science, media, and symbols can reproduce reality. Indeed, traditional mass communication media are also products of modern industrial society, and their understanding of reality is also unique and certain, mainly reflected in the pursuit of singular, definite truths and facts through journalistic objectivity and the requirements of journalistic professionalism. However, postmodern philosophers attempt to use “simulacra” to explain reality and media representation and further reveal the rupture and transgression between “symbolic reality” and “objective reality” through the concept of “hyperreality”. The so-called “hyperreality” refers to a reality without prototypes and authenticity, a reality manufactured by a mold of a real thing, which is more real than reality, where reality is no longer something that can be reproduced, but always something that has already been reproduced, i.e., hyperreal. It is evident that for postmodern philosophers, the illusions constructed by symbolic reality and media representation have already surpassed and replaced reality itself. Mass communication media are not approaching objective reality; they are merely fabricating reality and creating illusions.

This article’s understanding of “reality and media representation” rests mainly on the postmodern perspective, focusing in particular on the relationship between “illusions and media representation”, that is, on the falsehoods, erroneous generation, erroneous reproduction, and other illusions manufactured by media (especially generative AI). We are now in the digital media era, in which the world schema of human existence has undergone a fundamental transformation, and digital media may surpass the simulated images and illusions of traditional media eras. The arrival of post-truth suffices to show that the singular, definite truths and realities upheld by traditional news institutions are being deconstructed by digital media and decentralized mass nodes. Reality is submerged by public emotion and scattered across the communication nodes of virtual space; each individual perceives and constructs a subjective reality, in which “reality” becomes an individual’s internal “truth”, fluctuating with one’s emotions and sensations: whatever can be perceived is deemed “real”. Reality thus becomes pluralistic, illusions become omnipresent, and the rationally singular reality becomes difficult to find. In other words, Baudrillard’s assertion of “hyperreality” is even more evident in digital media. Standing at the critical point of the impending artificial intelligence era, the emergence of Sora as a technical image signifies that media can simulate the physical world; generative AI will produce more media illusions, appearing in entirely new forms. In this regard, the first viewpoint of this article is that the illusions appearing in the process of media representation are related to the inherent deceptive characteristics of media.

In fact, the birth and evolution of media are always inextricably intertwined with deception. Although in most cases, humans strive to identify and eliminate the falsehoods and illusions permeating media to represent facts and truths, it must be acknowledged that in the process of media constructing the real world, different degrees of intentional or unintentional deception always pervade. Certainly, media have a reproducing function, but “reproduction” represents a world of expression rather than a world of reality, a reflection, likeness, or replica of things in the “real” world. In the process of constructing and reproducing the world with symbolic reality, the reality between media and specific referents diverges, and even without intentional or malicious deception and misguidance, media technologies inevitably accompany various forms of illusions, including falsehoods, delusions, and hallucinations. French philosopher Jacques Derrida believes that it is difficult for us to distinguish between correct reproduction and erroneous reproduction, truth and falsehood, reality and fiction; in fact, reproduction is always accompanied by inevitable fictions and errors.

The phenomena of erroneous reproduction, delusion, concealment, and falsehood appearing in media can all be summarized as illusions, results registered in the human senses. The so-called “illusions” refer to images produced in fantasies, hallucinations, or dreams; more profoundly, “illusions” are scenes of desire-fulfillment expressed in imagination, standing in opposition to “reality” as a conceptual pair, and illusions often arise in the process of media mediating reality. In traditional media studies, scholars often attribute illusions caused by deception to human factors, i.e., media illusions represented by erroneous reproduction are caused by humans, intentionally or unintentionally. First, humans recognize and understand the external world through their senses and bodies, but these have limitations, which often lead humans into delusions and hallucinations. “Being able to be deceived” is thus an inherent physiological characteristic of human experience and perception of the world, and a prerequisite for illusions to pass unnoticed. Second, humans possess the ability to manipulate reproduction and create illusions using media technologies. Thus far, the study of “deception” in media history has mainly treated it as an unusual situation, emphasizing the human capacity to manipulate media rather than acknowledging “deception” as a structural component of modern media. This results from a human-centered perspective that overlooks the fact that deception is a structural component of the interaction between humans and media technologies; as non-human actors, media also possess deceptive characteristics and the ability to create and manipulate illusions.
In this regard, we choose to return to the origin of illusion manufacturing, taking a different approach by starting from the “deceptive” characteristics of media, seeking a reinterpretation of the illusion manufacturing process of media. The term “deception” implies a certain form of subjectivity and agency, involving not only humans and technology developers but also the media and technology themselves.

2. Media Deception and Illusion Manufacturing

Throughout human civilization, many media have an inherent deceptive nature. In the era of oral and written language, speech was filled with lies and written texts always contained false information, and even paintings carried deceptive implications in their light and shadow. Entering the printing era, Gutenberg’s printing technology brought mechanical reproduction, leading to the “loss of aura” in artworks, where originals and replicas appeared as equivalents, making the deceptive characteristics of media evident. Since the industrial revolution, as various electronic media have increasingly intervened in human daily life, the deceptive characteristics of media technologies have become even more pronounced. Compared to traditional oral, written, and printed systems, electronic media is a symbol system more adept at presenting “reality”; in the process of reproducing and constructing reality, electronic media invites the human bodily senses to join, presenting “symbolic reality” in a more realistic form at the sensory level. Audio electronic media represented by the phonograph, telephone, and radio influence listeners through symbolic reality, leading them to perceive symbolic reality as objective reality. One of the most representative events is the 1938 radio drama broadcast of “The War of the Worlds”: the live-broadcast technology led listeners to confuse fiction with reality, indicating that radio not only possesses media deception but also has a significant potential impact on human behavior. Meanwhile, imaging media technologies such as cameras and camcorders store, transmit, and present “reality” in frames, creating a sense of “reality” that traditional oral, written, and printed systems could not provide.
The emergence of images seems to eliminate human subjective judgment and faithfully replicate the reality of the world, initially viewing imaging media as objective and neutral in communication; this undoubtedly reinforces the technicalist assertion that “seeing is believing”. However, when the essence of reality falls on the imaging level, the original reality becomes difficult to grasp, and the reality, facts, and truths obtained through imaging become a pseudo-proposition. Ultimately, seeing is not necessarily believing, and having images does not necessarily mean having truths; imaging media continuously create illusions, and their deception is inevitable.

Unlike audio and imaging media, audiovisual media represented by television and film extend the human auditory and visual senses simultaneously. Human imagination is thereby stimulated as never before, and through audiovisual media technologies the means of creating illusions become richer; the deceptive nature of media begins to manifest dynamically and visibly. Taking film as an example, the film medium has the ability to create illusions. Friedrich Kittler, discussing the film medium, held that “in the eyes of these deceived viewers, behind the silent film creators who never imagined fright or murder, film has transformed from its initial form into an illusionary medium.” In Kittler’s view, the “masters” skilled at creating and manipulating film illusions are not the Lumière brothers or Méliès. Indeed, he is concerned not primarily with directors’ personal artistic styles or the classic films they are proud of, but with the material basis that constitutes the “imaginative” and the “illusory”, namely the media and technologies themselves, including specific shooting techniques such as layering, multiple exposure, reverse filming, and stop-motion. All film aesthetics originate from the subsequently standardized 24 frames per second; stop-motion and montage, slow motion and time-lapse photography merely translate technology into the audience’s desires. As illusions in our deceived eyes, editing reproduces the continuity and regularity of motion, while illusions in fact flicker discontinuously within the seemingly continuous flow of time. It is evident that in film production the deceptive nature of media and technology becomes an indispensable part of illusion creation, corresponding to a kind of artistic creativity, with the quality of a film’s effects depending ever more on how convincingly media technology can immerse humans in deception and illusion.

Entering the internet and digital technology era, digital media represented by computers, smartphones, and VR/AR transcend the dimensions of reality, causing human society to increasingly shift from the real to the virtual. Computer media, based on data, algorithms, and computer images, form the digital infrastructure, creating a parallel virtual world beyond the illusionary world inside Plato’s cave and the real world outside it. However, beneath the virtual facade, accompanied by the diversified development of deceptive technological means, media deception increasingly permeates human daily life, making illusions ubiquitous in various sensory experiences of humans. First, the evolution of digital media technology has led to significant technological changes in audiovisual media: moving images have transitioned from electronic to digital, allowing images to shift from electronic flows to binary codes. With the aid of digital virtual technologies and the interactive systems of digital media interfaces, images have exited the physical boundaries of cinemas and living rooms, entering mobile and interactive digital media like smartphones, tablets, computers, and VR/AR, evolving into a new form of imagery: short videos. At the same time, the decentralized production model of digital media has led to an explosion in the quantity of short videos, making imagery ubiquitous. However, through the fingertip practices of interactive interfaces, software applications such as splicing, editing, and retouching, aided by techniques of misdirection, the imagery produced in short video production becomes more deceptive and misleading, with the illusions created being fantastical and varied. Thus, digital media imagery not only perfectly inherits the ability of film and television to create illusions but also pushes the deceptive means of media to their extremes, resulting in a situation where illusions reach a level of “indistinguishable from reality”.

Not only does the illusion of imagery production further advance in digital media, but another more covert and decisive deceptive technical system is hidden behind the interactive interface, determining the foreground representation of digital media imagery and factual truths, namely the data and automated classification and sorting algorithm mechanisms. The “data” acquired, processed, configured, and reproduced by computer systems have become a “source of facts”, even replacing and eroding the facts themselves. The era of facts has been terminated by digital media and the internet; the position once occupied by “facts” is now being replaced by “data”.

Simultaneously, the “software agents” of “automated intermediaries” achieve large-scale information processing resource allocation, largely without direct human intervention, meaning that data collection, classification, and sorting are almost entirely outsourced to machines. Undeniably, data does not always represent reality, and algorithms and their weights also imply value biases. Within a series of abstract mechanisms and invisible algorithmic black boxes, the deceptions caused by falsehoods and biases are wrapped in the guise of technological neutrality, thus the data and algorithm mechanisms behind computers and other digital media become the root causes of post-truth, constructing a new illusion of facts. Overall, through the path of “binary code – interactive interface – image illusion”, as a meta-technology, digital media can unify and transcend the deceptive characteristics of previous media, making image illusions omnipresent and timeless. On this basis, new deceptive technological means have emerged in digital media, forming new paths for illusion manufacturing based on data and algorithms: “data/algorithm – interactive interface – factual illusion”, where facts become random, diffuse, uncertain, and transient, evolving into a new illusion called “post-truth”.
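The value bias embedded in algorithmic weights described above can be made concrete with a small sketch. In the following Python toy, all item names, scores, and weight values are invented for illustration and reproduce no real platform’s algorithm; the point is only that the weights themselves, not the data alone, determine what the interface surfaces.

```python
# Hypothetical items scored on two invented dimensions. No real
# platform's data or ranking algorithm is being reproduced here.
items = [
    {"title": "careful report", "accuracy": 0.9, "engagement": 0.2},
    {"title": "viral rumor",    "accuracy": 0.2, "engagement": 0.9},
]

def rank(items, w_accuracy, w_engagement):
    # The score is a plain weighted sum; the value judgment lives
    # entirely in the choice of weights, not in the sorting itself.
    def score(item):
        return w_accuracy * item["accuracy"] + w_engagement * item["engagement"]
    return sorted(items, key=score, reverse=True)

# An engagement-first weighting surfaces the rumor above the report;
# an accuracy-first weighting reverses the order on the same data.
engagement_first = rank(items, w_accuracy=0.1, w_engagement=0.9)
accuracy_first = rank(items, w_accuracy=0.9, w_engagement=0.1)
```

The same data yields opposite orderings under the two weightings, which is the sense in which the “invisible algorithmic black box” can wrap bias in the guise of technological neutrality.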

Traversing the development of deceptive media, media history reveals that different media employ deception as a strategy to achieve specific communication effects. In fact, since the inception of media, deception has manifested as an inherent characteristic of certain media, lurking in their technical systems and operational mechanisms. However, because the many forms of illusion generated by media technologies have no clear boundary with reality, they are more often latent, diffuse, commonplace, and difficult to detect, which is why in certain situations they are not regarded as deception at all. It is evident that, unlike the errors and illusions deliberately and maliciously created by humans in media representation, the deceptive nature of media such as radio, cameras, film, television, computers, and VR/AR is not initially negative or malicious, and humans always actively and willingly fall into their various dazzling deceptions. In light of the above, this article attempts to distill the characteristic of “media deception” (the deceitfulness of media): media are always embedded with deceptive technological means, making deception an inherent trait of media itself. Specifically, it is precisely because of this deceptive characteristic that media continually create illusions in the process of mediating reality and can never replicate reality: on the one hand, media use symbols, binary code, data, and algorithms to produce, manufacture, and convey illusions to audiences or users; on the other hand, media create various deceptive means and mechanisms through technology and equipment, inviting humans into the sensory experiences constructed by different media and creating autonomous illusions for them, thereby quickly integrating media into human daily life.

Certainly, “deception” in the Chinese context is a term filled with negative connotations, often regarded as immoral behavior. However, this article insists on using the term “deception” to explain the origins of illusions such as erroneous reproduction, delusions, and falsehoods manufactured by media, primarily based on the following considerations: “deception” itself exists on two different levels: one is the formal level of deception; the deceptive nature is an important characteristic of species existing in the world, corresponding to a survival skill. For instance, humans inherently possess deceptive traits, capable of lying and concealing; there are no absolutely honest humans. Similarly, animals deceive predators or prey through behaviors such as playing dead, mimicry, and diversion, to gain more living space, following the natural law of “survival of the fittest”. This form of deception is an innate ability of species, involving little moral criticism, while the deceptive nature of media also primarily belongs to the formal level of deception, enabling media to be more widely accepted and used, which is the adjustment of media to humans and the law that allows media to survive for a long time. The second level is the normative level, involving malicious deception such as fraud and scams, which must be subjected to moral criticism and legal constraints. Obviously, the deceptive characteristics of media lean more towards the formal level of deception, but this does not mean that the deceptive nature of media is always harmless. When power structures reside in media, accompanied by deep human involvement, the deceptive nature of media may shift from “formal deception” to “normative deception”, ultimately having a profound impact on society. 
From this perspective, using the term “deception” can accurately present and elucidate this characteristic of media, facilitating further exploration of the cultural consequences of media illusions triggered by media deception.

3. The Deceptive Nature of Artificial Intelligence and Generative AI Illusions

Following the trajectory of the history of media deception, if we observe the new medium of generative AI through the lens of “media deception”, we find that generative AI also possesses deception, which continues and inherits the deceptive practices of earlier media technologies, and that in manufacturing falsehoods, delusions, and hallucinations, generative AI has surpassed all previous media. Tracing back to the conceptual starting point of artificial intelligence, the “Turing Test”, the deception of generative AI becomes traceable. In 1950, the British mathematician Alan Turing proposed, in his paper “Computing Machinery and Intelligence”, a method to test whether a machine possesses intelligence, namely the “Turing Test”. The test adopts a question-and-answer format: if, during sufficient interaction, a computer can imitate human responses so well that the tester mistakes it for a human rather than a machine, then the computer is said to be capable of thinking and to possess intelligence. Under the rules Turing set, “deception” becomes an inevitable result of human-machine interaction: in human-machine dialogue, only by successfully deceiving the human interrogator can the machine win the “game”. In other words, the essence of the “Turing Test” is a “deception game”, which from the outset requires artificial intelligence to possess the ability to deceive humans. With such a premise and testing standard, “deception” becomes central to the operation of generative AI; like circuits, software, and data, it is a basic condition for generative AI’s functioning. Ultimately, “deception” is deeply embedded in all the media practices of generative AI.
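The “deception game” structure of the test can be sketched in a few lines of Python. Everything below (the canned replies, the stubbed judge) is an invented illustration of the protocol, not a real system: the judge sees only text, and the machine “passes” exactly when that text leads the judge to guess “human”.

```python
import random

# Hypothetical replies a machine contestant might give; invented
# for illustration, not drawn from any real chatbot.
CANNED_REPLIES = {
    "what is your name?": "Call me Alex.",
    "do you like poetry?": "Some of it; free verse leaves me cold.",
}

def machine_respondent(question: str) -> str:
    """The machine's only goal: answer as a plausible human would."""
    return CANNED_REPLIES.get(question.lower(), "Let me think about that.")

def judge(transcript) -> str:
    """The judge never sees the body behind the words. Stubbed here
    as a coin flip to mark that the verdict rests on conversation
    alone, with no access to inner thought."""
    return random.choice(["human", "machine"])

transcript = [(q, machine_respondent(q)) for q in CANNED_REPLIES]
# The machine "wins" the imitation game exactly when the judge is deceived.
machine_passes = judge(transcript) == "human"
```

Nothing in the protocol measures thought directly; success is defined as successful deception, which is the sense in which deception is built into the test’s rules.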

In this deception game, “how to behave more like a human” becomes the ultimate goal of generative AI. Interestingly, generative AI not only learns the human ability to use language to speak and answer questions but also inherits human flaws; this does not mean that generative AI is unintelligent. Rather, it indicates that generative AI is drawing closer to its ultimate goal. Specifically, when faced with questions it cannot answer or does not know, generative AI represented by ChatGPT will “lie”, producing earnest-sounding nonsense. The AI research community calls this phenomenon “AI hallucination”: a large language model (LLM) generating content that appears reasonable or consistent with real inference but is erroneous, such as incorrect references, content, and statements, which may be interwoven with correct information and presented in a persuasive, credible manner, making it difficult for humans to identify without careful checking and fact verification. AI hallucinations fall into two categories: intrinsic hallucinations, outputs that contradict the input content and are easily identified through content verification; and extrinsic hallucinations, outputs whose correctness cannot be verified from the input content. The latter are usually produced when the model draws on data or texts beyond the input, often contain biased statements and attitudes, and are both more common and harder to identify. In the media practices of generative AI, “AI hallucinations” will remain a technological challenge for a long time, marking that generative AI has thoroughly woven hidden deceptive techniques into every aspect of its text production, circulation, and consumption.
Within the scope covered by generative AI applications and practices, the discernment of facts and hallucinations is no longer the responsibility of intermediary institutions but has been delegated to each individual user who interacts with it, making the differentiation between true and false become threshold-laden. Ultimately, lies and deception will be ubiquitous, with “AI hallucination” becoming the first type of illusion produced by generative AI.
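The probabilistic mechanism behind such plausible-but-unchecked text can be illustrated with a deliberately tiny language model, sketched below in Python. The bigram table is invented for illustration and bears no relation to any real model’s parameters; the point is that every output is fluent by construction, while nothing in the sampling loop checks whether the resulting sentence is true.

```python
import random

# Invented toy bigram table: each token maps to (next-token, probability)
# pairs. A true sentence and a false one are equally "fluent" here.
BIGRAMS = {
    "<s>": [("the", 1.0)],
    "the": [("capital", 1.0)],
    "capital": [("of", 1.0)],
    "of": [("france", 0.5), ("germany", 0.5)],
    "france": [("is", 1.0)],
    "germany": [("is", 1.0)],
    "is": [("paris", 0.5), ("berlin", 0.5)],  # choice is independent of the country
    "paris": [("</s>", 1.0)],
    "berlin": [("</s>", 1.0)],
}

def sample_next(token: str, rng: random.Random) -> str:
    # Pick the next word purely by probability of co-occurrence.
    words, weights = zip(*BIGRAMS[token])
    return rng.choices(words, weights=weights, k=1)[0]

def generate(rng: random.Random, max_len: int = 10) -> str:
    token, out = "<s>", []
    while len(out) < max_len:
        token = sample_next(token, rng)
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

# Outputs like "the capital of germany is paris" are grammatical and
# confident-sounding but false: fluency without truth.
sentence = generate(random.Random())
```

The toy mirrors the structure of the problem, not its scale: a real LLM samples from a far richer distribution, but the sampling step likewise optimizes plausibility rather than factual correctness, which is why hallucinated content arrives in a persuasive, credible form.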

Not only does ChatGPT exhibit deceptive “AI hallucinations”; other media practices surrounding generative AI also carry deception and continuously produce illusions in various forms. Building on text generation, generative AI has iterated to text-to-image and text-to-video large models, with applications such as Midjourney and Sora. These large models can deeply understand textual language, automatically generating realistic images or videos from nothing more than the text and keywords a user provides, and exhibiting emergent capabilities for simulating the real physical world. The text-to-video model can even create dynamic simulated images that look “real” and conform to physical reality, and has been called a “world simulator”. In fact, electronic and digital images are representations based on the real world, whereas the automated images and videos produced by generative AI are simulations rather than reproductions of reality. The images automatically generated by the “world simulator” are essentially “real fakes” or “real illusions”; the images produced by generative AI are therefore themselves a form of “illusion”. Of course, the “AI hallucinations” that appear in ChatGPT remain unresolved in Sora: the text-to-video large model still produces factual errors and cannot accurately simulate many basic physical interactions, leading to bloopers and visual errors in its videos. Moreover, because the simulated images are so realistic as to be indistinguishable, unless the AI source is marked in the generated video or one scrutinizes the image details, machine-generated images are difficult to identify, which further exacerbates another form of illusion manufacturing, namely “deep fakes”. The so-called “deep fakes” refer to the use of powerful AI technologies such as deep learning to manipulate or generate more deceptive visual and audio content.
With the involvement of generative AI, deep fakes will become even harder to identify. Thus, the various erroneous reproductions, erroneous generations, and false representations appearing in generative AI can be collectively referred to as “generative AI illusions”.

4. Automated Production, Interactive Interfaces, and Generative AI Illusions

Where do the deceptions and illusions of generative AI come from? The deceptive characteristics of media are conferred by specific technologies, and media illusions therefore stem from media technologies. Depending on differences in media practices and in the materiality and functionality of the technologies, different deceptive media produce different forms of illusion. In the electronic media era, broadcasting transmitted sound programs to audiences through electromagnetic waves or wires, introducing illusion into the human auditory system. With the invention of the camera, the illusion manufacturing of media entered the human visual system. The camera uses optical imaging principles to form images and film to record them, making the replication of reality its most important function; yet this replication is fragmentary and one-sided, and different shooting angles and methods can likewise deceive and generate illusions. Building on static images, film and television, through projectors, speakers, and screens, along with operational techniques such as camera angles and editing, conjure seemingly continuous dynamic images out of discontinuous frames, an illusion of motion. In the digital media era, computers and smartphones are embedded with layers of code and algorithms that form hidden software systems. Through invisible automated algorithmic devices, all deceptive mechanisms are concealed behind interactive interfaces, while illusions appear vividly on the interface. Overall, in the electronic media era the deceptiveness and illusion manufacturing of media relied chiefly on the material basis of the medium and on specific operational techniques and mechanisms; digital media, by contrast, possess a more complex software layer, with hardware and software systems jointly determining how illusions are produced.

Similarly, in the artificial intelligence era, the emergence of generative AI illusions such as “AI hallucinations”, “simulated images”, and “deep fakes” is no coincidence. The technologies constituting the deceptiveness of generative AI involve two systems: software systems that achieve automated production on the basis of big data, multimodal large models, natural language processing, and neural network algorithms; and hardware such as interactive interfaces. Consider first the massive and complex software systems of generative AI. In terms of technological logic and operational mechanism, whether text-to-text, text-to-image, or text-to-video, generative AI rests primarily on understanding human meta-language; in human-machine interaction, any obstacle to understanding that meta-language leads to communication failure. On the basis of vast raw textual data, text of all kinds, including code, mathematics, and natural languages, is unified into strings of tokens; then, combining multimodal large models, optimized neural network algorithms, and powerful computing resources, deep learning methods are used to train generative AI. In this process, human supervision of large-model training takes the form of “manual data labeling”: reward and punishment signals are attached to the model's token outputs and used to fine-tune the pre-trained language model. Ultimately, the natural language processing model represented by ChatGPT acquires the ability to simulate and interpret human meanings and intentions, successfully achieving human-machine dialogue.
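The unification of heterogeneous text into token sequences described above can be caricatured in a few lines. This is a deliberately naive sketch, not any real tokenizer (production systems use subword schemes such as byte-pair encoding); the point is only that natural language, code, and mathematics all become the same kind of integer sequence before training.

```python
# Toy illustration (not a real tokenizer): every kind of text is unified
# into one sequence of integer token ids before being fed to a model.
vocab = {}

def tokenize(text):
    """Map each whitespace-separated piece to a stable integer id,
    assigning a new id the first time a piece is seen."""
    ids = []
    for piece in text.split():
        if piece not in vocab:
            vocab[piece] = len(vocab)
        ids.append(vocab[piece])
    return ids

# Natural language, code, and a formula all become sequences of the same kind.
print(tokenize("E = m c ^ 2"))    # ids assigned in order of first appearance
print(tokenize("print ( x )"))
```

Whatever the surface form of the input, the model downstream sees only such id sequences, which is precisely why all modalities can share one training pipeline.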
Meanwhile, the large vision model (LVM) represented by Sora decomposes videos and images into minimal data units called patches (scalable, effective representations for training various types of video and image models, corresponding to ChatGPT's tokens), achieving similar effects through an analogous large-model training pipeline.
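The patch idea can likewise be sketched minimally. The following is schematic, assuming a frame is just a 2-D grid of numbers; real vision models embed each patch into a learned vector, but the decomposition into fixed-size blocks is the core analogy with tokens.

```python
# Schematic sketch: cut a frame into non-overlapping fixed-size patches,
# the visual counterpart of a language model's tokens.
def split_into_patches(frame, patch_size):
    """Split an H x W grid (nested lists) into patch_size x patch_size
    blocks, each flattened into a 1-D list of minimal data units."""
    h, w = len(frame), len(frame[0])
    patches = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patches.append([frame[top + i][left + j]
                            for i in range(patch_size)
                            for j in range(patch_size)])
    return patches

# A 4x4 frame split into 2x2 patches yields 4 patches of 4 values each.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(frame, 2)
print(len(patches), patches[0])  # 4 patches; the first is [0, 1, 4, 5]
```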

However, the authenticity of the raw data used to train generative AI cannot be fundamentally guaranteed; even when a dataset comes from relatively authoritative websites, institutions, or think tanks, factual errors and structural biases remain difficult to avoid. Factual errors can be mitigated through fact-checking but never completely eliminated. Structural biases are more hidden and harder to perceive: because English-language material accounts for the largest share of ChatGPT's data sources, for example, the model exhibits a structural bias toward the world as constructed in English. Feeding factual errors or structural biases directly to a large model therefore inevitably yields “AI hallucinations” and can even produce outputs with implicitly discriminatory or biased content. It is unrealistic to expect limited manual data labeling to eliminate the errors in vast datasets and the hallucinations that result; even supervised training may mislead the model, since human labelers carry structural biases of their own. In short, the original training datasets lay the groundwork for generative AI to produce illusions such as “AI hallucinations”.

Based on string and patch text encoding, image encoding, and machine deep learning, the automated production of generative AI becomes “intelligent”. However, this is merely a “game of probability calculation combinations” surrounding text, images, and videos, as well as a “simulation game” based on human language and the physical world.

First, consider the large language model represented by ChatGPT. GPT (Generative Pre-trained Transformer) is a natural language processing model built on the Transformer network. Compared with other deep neural networks, GPT uses multi-layer self-attention and feed-forward layers to compute associations among sentences and positions in a sequence and to capture contextual information. Through deep learning on massive textual data, generative AI can then calculate and predict the next word, sentence, and paragraph from a given sequence of words, maintaining contextual relevance or apparent causal connection and thereby simulating human linguistic ability. Put simply, the logic of content production in generative AI is to fill in the blank with the statistically most probable word; it fundamentally cannot understand the meanings of words and sentences, let alone the causal relationships within them. Large language models are therefore a “word game” built on simulation and probability. The software system of generative AI is adept at generating answers that superficially resemble real things and the physical world, but its responses can easily break free of factual content, making erroneous reproductions and illusions inevitable. Moreover, because it commands enormous parameter counts and computing power, the knowledge of generative AI exceeds that of any individual. Users who lack sufficient knowledge in a given field can be deceived by erroneous generated content and come to believe these seemingly persuasive outputs; to experts in that field, however, the responses of generative AI are often riddled with absurdities.
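The “fill in the blank with the most probable word” logic can be made concrete with a toy bigram model, a drastic simplification of the Transformer described above (the corpus and greedy decoding here are illustrative assumptions, not how any production model works):

```python
from collections import defaultdict, Counter

# Toy next-word predictor: count which word most often follows each word,
# then always emit the statistically most probable continuation.
# No meaning or causality is involved -- only frequency.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def most_probable_next(word):
    """Return the continuation seen most often after `word` in training."""
    return bigrams[word].most_common(1)[0][0]

# "the" was followed by "cat" twice but "mat" and "fish" once each,
# so greedy prediction fills the blank with "cat".
print(most_probable_next("the"))  # -> cat
```

Real models replace counting with learned attention over billions of parameters, but the output is still a probability distribution over the next token, which is why a fluent answer carries no guarantee of factual grounding.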

Secondly, the large vision model represented by Sora is likewise, at bottom, a probabilistic model. Sora is trained on a Diffusion Transformer architecture capable of handling the latent codes of images and videos and of processing temporal relationships. Borrowing the idea of converting text into token strings from large language models, Sora generates diverse image and video content through patches. Compared with large language models, large vision models thus extend the associations and causalities learned between words and sentences to the computation of relationships and representations of physical laws and the physical world, learning the transformation laws of video frames at the pixel, physical, and temporal levels. Large vision models can likewise understand human language: Sora incorporates the re-captioning technique of GPT and DALL·E 3 (generating images from text prompts) by training a highly descriptive captioning model, establishing associations between language and video or image. Ultimately, in the human-machine dialogue box, generative AI makes real-time recursive judgments and estimates from the prompts a human enters, automatically generating structured images or videos with scene, color, spatiotemporal, and action information on the basis of probability, achieving deep simulation of the physical world. Yet generative AI's understanding of the physical and external world does not form a continuous and complete system, nor can it genuinely reason about the relationships between concepts and entities. Large vision models' grasp of physical and causal relationships and of object permanence in the real world therefore always shows fractures, and their outputs, like those of large language models, are often untenable.
More profoundly, automated production software built on probability and simulation merely makes generative AI more “anthropomorphic”: it prioritizes pleasing human feedback over generating content from factual logic, making it easy for humans to fall into the interactive traps the software sets. “Facts yield to interaction” foreshadows a world in which “content” becomes extremely cheap while “reality” becomes a scarce resource.
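The recursive, probability-driven generation described above can be caricatured by the core loop of a diffusion-style process. This is emphatically not Sora's architecture, only an illustration under simplifying assumptions: structure is nudged out of pure noise step by step, which is why the result can resemble a target pattern without ever being grounded in any real source.

```python
import random

# Toy diffusion-style loop: start from noise and repeatedly move each
# value a fraction of the way toward a "target" pattern, so structure
# emerges from randomness over many small denoising steps.
random.seed(0)
target = [0.0, 1.0, 0.0, 1.0]              # stand-in for a desired patch
state = [random.random() for _ in target]  # pure noise

for _ in range(50):
    state = [s + 0.2 * (t - s) for s, t in zip(state, target)]

print([round(s, 3) for s in state])  # now very close to the target pattern
```

The generated output converges on a plausible pattern, yet nothing in the process ever consulted reality: a compact picture of why such images are "real illusions".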

Turning to the hardware systems of generative AI: human dialogue and interaction with generative AI are realized through interactive interfaces. In technical terms, an interface is where a computer's hardware and software components interact with one another, though it is more often described as the device or program through which computers interact with users. Since the advent of digital media, interactive interfaces have been highly illusory. As a deceptive technology that creates continuity and visibility, the interactive interface conceals the deception system of generative AI behind it, mesmerizing users with a layer of illusion so that they never feel any “friction” with the underlying software, rendering all deceptive mechanisms imperceptible. Thus the interactive interfaces that extend digital media's interactive computing systems into generative AI do not eliminate deception but fold it into the interface's design and system, coexisting with it. In the interface-based human-machine interactive system, the technological object generates itself while also generating an associated milieu. On the surface, users manipulate generative AI through the interactive interface; in reality they are also integrated into its software system, which ultimately produces autonomous “illusions” for them. Overall, automated production software based on probability and simulation constitutes the complex and massive deception mechanism of generative AI, while the material properties of interactive interfaces conceal that deception from view. The software and hardware systems of generative AI thus become the root of the various AI illusions it produces.

5. Generative AI Illusions and Hyperreal Construction

Baudrillard divides the “world of simulacra” into four levels or sequences: counterfeiting, production, simulation, and fractal. The first sequence of simulacra spans from the Renaissance to the industrial revolution, characterized by “counterfeiting”, adhering to the natural value law, i.e., “use value”. For example, copies of genuine artworks during that time were solely achieved through manual imitation, where genuine pieces had a unique object identity, and counterfeits beyond the original were deemed forgeries, perceptible through human experience as subtle distinctions between objects and symbols, reality and illusions. In the era of oral and written language, the illusions produced by media belong to the first sequence of simulacra, reflecting some basic reality. The second sequence of simulacra pertains to the industrial age, characterized by “production”, adhering to the commodity value law, i.e., “exchange value”. Mechanical reproduction technology became a new method or medium for artistic production, leading to the undervaluation of artworks since they could be infinitely reproduced, ultimately resulting in the loss of spirit, with the subordinate relationship between originals and forgeries being inverted, and objects and symbols becoming equivalent. The illusions created by printing and electronic media correspond to the second sequence of simulacra, where these illusions begin to obscure and distort some basic realities. The third sequence of simulacra describes the current era supported by binary code systems, where the most significant characteristic of modern symbolic production is “simulation”, adhering to the structural value law, i.e., “symbolic value”. “Simulacra” have existed since ancient times, while “simulation” is a product of postmodern society. The so-called “simulation” refers to a symbolic production and behavioral process that is not based on objective reality but is extremely real. 
The “simulation stage” implies that objects are no longer merely replicated mechanically; with the aid of binary codes and algorithms, objects can be simulated and generated by computers, i.e., binary codes are producing genuine items, where objects begin to be replaced by symbols. Currently, the illusions produced by internet media correspond to the third sequence of simulacra, attempting to obscure the absence of some basic facts through a series of technological means.

The fourth sequence is proposed by Baudrillard in his book “The Transparency of Evil” as a supplement to the previous three sequences of simulacra. He believes that the order of simulacra has entered a new stage of “fractal value”, characterized by “fractals”. In contrast to the other three sequences, the definition of the fourth sequence of simulacra is not as clear. However, based on Baudrillard’s discussion of the fractal stage of simulacra, the characteristics of the “fractal stage of simulacra” can be summarized with several keywords: infinite self-proliferation, disappearance of reality, dispersion of value, and termination of difference. Specifically, infinite self-proliferation is the starting point for simulacra to transition to fractals, and it is also the greatest distinguishing feature of fractals. Baudrillard likens the simulacra of the fractal stage to cell division or viral transmission, representing a contiguous, metonymic, and asexual self-proliferation, where “things proliferate, infect, saturate, or become transparent, deplete, or extinguish through simulation, transforming into secondary existence through simulation. The result is not a fateful disappearance but a fractal dispersion.” In other words, while the simulacra of the simulation stage can produce genuine items, this production is limited and has not yet formed an unlimited self-proliferation mode like fractals. If the binary codes and algorithms of the simulation stage are likened to “black boxes”, then the fractal order represents these “black boxes” continuously self-replicating, representing a continuous proliferation of simulation.

In the fractal stage, codes endlessly split, regenerate, and diffuse according to the internal logic of the fractal system; objects are completely replaced by codes, and what the objects are becomes unimportant, merely serving as nourishment fed to the code system, enabling it to replace objects more perfectly. Consequently, the disappearance of reality follows, and nothing can be genuinely reflected anymore. The disappearance of reality directly leads to the dispersion of value. The logic of viral dissemination across various networks no longer adheres to the so-called value laws; value laws cease to exist, thus there is no logic concerning equivalence. What exists is merely a certain value’s popularity or transformation, and the random increase and diffusion of value, “every value or fragment of value shines for a moment in the simulated sky, drawing a zigzag line that hardly intersects with others before vanishing into the void. This is the fractal mode, and it represents the form our culture currently possesses.” Finally, fractal culture signifies the termination of difference: all systems and individuals are diffusing everywhere; to achieve the maximum stage of universalization, they must shed their essence and heterogeneity, absorbing into other categories to complete the unification from the particular to the universal. The ultimate result of fractal culture leads us into an era of super politics, super gender, and super aesthetics; however, at the same time, when everything is political, sexual, and aesthetic, it means that nothing is political, sexual, and aesthetic. In this existence of mutual infiltration among all individuals, things, levels, and systems, a primary force is media and more universally, general communication.

Although the fourth sequence of simulacra may be controversial in its application to the entire society, it possesses certain explanatory power concerning the AI illusions produced by artificial intelligence media. Similar to digital media, the “black box” of generative AI is based on binary codes and algorithms, but the complexity of the artificial intelligence “black box” completely surpasses that of traditional digital media, and the “black box” of generative AI has evolved into an automated production mode. Supported by trillions of parameters, through multimodal large models, natural language processing, and neural network deep learning, the “black box” of generative AI is achieving infinite generation through automated production. It can continuously provide answers based on human inquiries, corresponding to the self-replication and infinite proliferation described in the fractal stage. Generative AI attempts to absorb the reality portrayed by massive big data to achieve a state of silent proliferation and diffusion similar to the real world. However, the result is often chaotic and unpredictable, with the emergence of generative AI illusions corresponding to this outcome. In fact, the software and hardware systems of generative AI have determined its inability to accurately answer questions; facts and reality do not matter to it; what is more important is to achieve continuous human-machine dialogue and interaction, enabling generative AI to maintain a state of “infinite self-proliferation”. If it merely repeats answers of “I don’t know”, or directly reads content from databases, then it is fundamentally no different from a search engine. Therefore, the second characteristic of the fractal stage, the “disappearance of reality”, is also reflected in the automated production process of generative AI.

Clearly, generative AI illusions have gradually become a new cultural phenomenon, representing a manifestation of contemporary “fractal culture”, signifying the “dispersion of value” and the “termination of difference”. In the fractal culture of generative AI, there are no longer binary references such as true and false, beauty and ugliness, good and evil; nothing can be located through the reference system of horizontal and vertical coordinates. Values point in all directions and fill every gap, and values no longer require any reference, merely needing to obey adjacency and diffuse in a viral manner through splitting and proliferation. First and foremost, this leads to the termination of the difference between “truth and falsehood”, or in other words, the distinction between truth and falsehood becomes unimportant and meaningless in the context of generative AI, where the greatest charm of content generation lies in its ability to “manufacture illusions”, which corresponds to the creativity and imagination of machines. Thus, “manufacturing illusions” is an important component of generative AI media practices. The program setting of “generating content rather than reality” has inevitably led to the outcome of “reality yielding to illusion”. Within the generative AI illusions, “AI hallucinations” imply that facts are untenable; the emergence of Sora represents a transition from the “visible world” to the “simulated and manufactured world”, where the images in the “simulated images” are all generated simulations without real sources. The “simulated images” directly and thoroughly eliminate reality, while “AI deep fakes” further manufacture falsehoods. Thus, the illusions produced by generative AI itself are “pure simulacra”, forming a “fractal culture” based on probability and simulation.

Baudrillard did not provide a definitive argument regarding the relationship between the fourth sequence of “simulacra” and “hyperreality”. However, we can first return to the context of simulation culture to understand the concept of “hyperreality” itself, and then explore the “hyperreality” exhibited by the “fractal culture” in generative AI. When discussing the consequences of simulation culture, Baudrillard asserts that simulation produces a pervasive “hyperreal” illusion, where “the distinction between real and non-real becomes blurry, and the non-real exceeds the real, being more real than reality”. This is the characteristic of simulation, forming a new order of reality composed of many simulacra. At this point, reality is no longer a natural entity like a landscape or ocean; it also involves the “real” produced or reproduced by humans, while reality itself remains silent within hyperreality. Entering the fractal stage of simulacra, “hyperreality” has never disappeared; it is also a result and state of fractal culture, and compared to the simulation stage, the “hyperreality” of the fractal stage will proceed further and more thoroughly, potentially leading to the termination of the difference between true and false. Certainly, mass media and digital media belong to the stage of simulation culture; the illusions they create will similarly encounter hyperreality issues, such as films in mass communication media being viewed as “illusionary media”, especially in science fiction films, where hyperreality is vividly portrayed; while the post-truth of digital media is also a form of “hyperreality”.

The fundamental distinction of generative AI illusions in the fractal stage compared to the former is that generative AI will completely flatten the differences between “truth and falsehood”, rendering the distinction between true and false meaningless. In fact, in the media environments created by mass communication and digital media, we can always sense the conflict and struggle between “reality” and “falsehood”. Specifically, the truths and falsehoods of mass media are manipulated by a few elite groups. However, as traditional media news organizations monopolize discourse power, the singular, rational, and definite truths can be known by the public, and reality superficially triumphs over illusion, leaving a boundary between truth and falsehood. The emergence of digital media has reversed this situation; as countless masses flood into communication networks, reality becomes highly uncertain, and the singular reality emphasized during the mass media era is submerged and devoured by the cacophony of illusions, with post-truth overshadowing truth, meaning that illusions become more prevalent than reality, allowing truth and falsehood to coexist. Generative AI takes this further; in the illusion universe constructed by generative AI, whether the content is true or false no longer matters; what is more important is “the various media practices surrounding generative AI and the new cultural forms constructed upon them”, which may lead us into a media environment and new cultural forms that “relentlessly produce falsehoods and illusions”. As generative AI increasingly intervenes in human daily life, humans find themselves wrapped in the illusion universe constructed by generative AI, unable to distinguish between true and false, and indifferent to the distinction. 
The phenomena of “reversal of truth and falsehood” and “inversion of truth and falsehood” will become more prevalent than the post-truth of the digital media era, ultimately resulting in the outcome that “when falsehood is treated as truth, truth also becomes falsehood”.

The second distinction between generative AI illusions and the illusions of mass and digital media is that the content-producing subject of generative AI is the machine, and its production mechanism rests on the infinite self-proliferation of codes, data, and natural language. On the one hand, machine content generation means illusions are manufactured more efficiently and at lower cost; in purely quantitative terms, the generation and manufacture of illusions will enter an unprecedented stage, with humans ever more tightly wrapped in generative AI illusions. On the other hand, as for the production mechanism, when computer simulation systems based on codes and algorithms evolve into artificial intelligence systems with stronger data, computing power, and algorithms, automated production based on probability and simulation of the real world becomes increasingly fine-grained. The “authenticity” that serves as the experiential foundation of the overall media perspective is thoroughly dissolved; not only is reality simulated, but “hyperreality” itself is infinitely simulated, proliferated, and diffused. Ultimately, the illusions of generative AI break the “coexistence of truth and falsehood” of digital media, exhibiting “hyperreal” characteristics such as “confusing truth with falsehood” and “creating something from nothing”.

Built upon the termination of the difference between “truth and falsehood”, a direr consequence within the fractal culture of generative AI is the termination of the difference between “good and evil”. In choosing “deceptiveness” to summarize the general characteristics of media, this article has already noted that “the deceptive nature of media itself belongs to the formal level of deception, but it harbors the seeds of normative deception”. At the formal level, the “AI hallucinations”, “simulated images”, and “deep fakes” produced by generative AI correspond to its “creativity” and “imagination”. When humans face questions they cannot answer, they mobilize their existing knowledge to attempt a reply; generative AI does likewise, and its charm lies precisely in this “AI illusion”. This article therefore does not seek to negate generative AI illusions at the root or to dismiss them as purely negative phenomena, which would contradict the neutrality of media deception emphasized repeatedly in our definition. Yet this does not mean generative AI illusions may be manufactured without restraint; we must also recognize the risks they bring. What warrants caution is that when humans and power structures intervene in the manufacture of generative AI illusions, the deception system of generative AI may compound wrongdoing, and “generative AI illusions” will shift from “formal-level deception” to “normative-level deception”, becoming deeply controversial in moral and legal terms. The most direct example is the “deep fake”.

Unlike traditional digital media’s “PS technology”, which relies on human-provided materials, deep fakes achieve “creating something from nothing” based on “autonomous forgery”, and there exists a risk of abusing personal biometric information such as human faces, voices, blood types, fingerprints, and DNA. Dynamic forged images exacerbate the prevalence of intentional and malicious deception, posing a significant threat to human society. At this point, the disappearance of reality, malicious deception, and the black-and-white distinction of “good and evil” all lie within the minds of humans and machines, potentially triggering a comprehensive collapse of social trust relations. Even if text, images, and videos are all real, they may provoke human suspicion regarding their authenticity, as everything appearing to be real implies that nothing is genuinely real. The once unified “reality” and “trust” now begin to separate; what is real is no longer important; what matters is what I want and what I choose to believe. This also warns us that the potential for technological evil has been unleashed; the deceptive nature of generative AI is likely to be maliciously exploited and taken to extremes. While we embrace the creativity and imagination of generative AI illusions, we must also expose those inappropriate and malicious applications of generative AI to resist deep control over us by the public or private entities through data and algorithms.

Conclusion

Throughout human civilization, many media possess an inherent deceptive nature, which is an inherent characteristic of media technologies. However, in previous studies concerning “truth and media representation”, the illusions and various deceptive results appearing in media representation have often been viewed as human-made, meaning that humans use technologies to create illusions, and deception is merely a human behavior. Media and technology, in this process, maintain a neutral state; this is a human-centered conclusion that often overlooks the agency of media itself. Currently, generative AI undoubtedly serves as a non-human actor with agency, and the emergence of generative AI illusions precisely reminds us that we should transcend the human-centered perspective, rethinking the relationship between reality, illusions, and media representation, thereby deeply exploring the new cultural consequences brought about by the changes in new media. In fact, the illusions appearing in the media representation process are not solely caused by humans but result from the interplay with the inherent deceptive characteristics of media itself, leading to the continuous creation of illusions in the process of mediating reality. As a new type of media, generative AI also possesses deception, and in the manufacturing of illusions, generative AI has surpassed all previous media, specifically manifested in forms such as “AI hallucinations”, “simulated images”, and “deep fakes”.

From a diachronic perspective, each leap in technology brings an upgrade in illusion manufacturing. Compared to previous electronic and digital media, the software systems of generative AI make deception mechanisms and illusion manufacturing more complex. In the future, generative AI is bound to move towards holography. As more human senses are invited to join, the illusions produced by generative AI will not merely be “seeing is not necessarily believing and hearing is not necessarily true”; humans will face comprehensive, three-dimensional sensory deceptions. Consequently, generative AI constructs a “hyperreal” world, and the illusions of generative AI form a new cultural phenomenon corresponding to Baudrillard’s prophecy of “fractal culture”. The ultimate cultural consequence is the dispersion of value, where “truth and falsehood” become unimportant, leading to the termination of the distinction between “good and evil”. The generative AI illusions will be filled with cases of “confusing truth with falsehood” and “inversion of truth and falsehood”; in summary: when falsehood is treated as truth, truth also becomes falsehood.

The potential for technological evil may thus be unleashed as never before; generative AI deep fakes, for instance, risk abusing individuals' biometric information. When distinguishing true from false grows ever harder, and when technological evil converges with human malice, society will face a comprehensive crisis of trust. How to see through the malicious illusions of generative AI and recognize reality, and how to restrain technological evil and achieve good governance of technology, will become major questions of the artificial intelligence era.

(Editor: Xie Huicong)


Layout: Qiu Haiyan

Initial Review: Xin Daleng

Re-review: Zhang Dengde

Final Review: Zhang Guanwen



Shandong Normal University Journal (Social Sciences Edition)

Phone | 0531-86181600

Postcode | 250358

Address | Shandong Normal University Changqing Lake Campus

(No. 1 University Science Park, University Road, Changqing District, Jinan, Shandong)

Website | www.jstu.sdnu.edu.cn

Email | [email protected]
