Abstract: In international communication practice, generative artificial intelligence, represented by Sora, has become a new technological actor. While it continuously cultivates new communication models and forms, it also drives the evolution of political, economic, cultural, and other communication mechanisms. Specifically, as a foundation for big data analysis, artificial intelligence can enhance the adaptability and precision of international communication; combined with multimodal content generation and virtual reality technology, it can also create a cross-cultural “third space” in which cultural flows are released through empathetic experience. However, as artificial intelligence changes the ways information is produced, disseminated, and perceived, ethical issues related to “deepfakes” and the “knowledge interruption” of human-created content have emerged, while the narrowness of algorithm training data can exacerbate the spread of extreme ideologies and stereotypes. In the future, therefore, we must further expand the participation of diverse global subjects in co-governance, break the monopolies over cultural values and technology design, and integrate technological evolution into the action framework of exchanges and mutual learning among human civilizations.
Keywords: International Communication, Artificial Intelligence, Technological Culture, Sora
1. Introduction: The Role of AI in International Communication
On April 27, 2024, Tsinghua University, in collaboration with Beijing Shengshu Technology Co., Ltd., officially released Vidu, China’s domestically developed text-to-video large model. Building on the research team’s long-term work in Bayesian machine learning and multimodal large models, Vidu can generate a 16-second video in a single pass and render complex dynamics such as long takes, focus tracking, and transitions. More importantly, as a domestically developed large model, Vidu can better understand and generate distinctively Chinese elements such as pandas and dragons, breaking the limitations of Western-centric aesthetics. As a cutting-edge form of generative artificial intelligence, the text-to-video model is updating our imagination of global communication at remarkable speed.
In the digital age, technological innovation often spreads exponentially on a global scale, and generative artificial intelligence is a typical example of this diffusion mechanism. Large language models (LLMs), represented by ChatGPT, have been successfully applied to information production and dissemination. This has drawn governments, high-tech enterprises, investment institutions, and media organizations in many countries into the iterative upgrading of generative AI and its supporting industries, as they continue to explore the practical value of artificial intelligence in promoting productivity growth and upgrading market consumption. The philosopher of technology Gilbert Simondon holds that technical objects are open systems that tend toward stability by continuously assembling external elements and individuals. In the field of international communication, generative artificial intelligence has become a key actor in reshaping the associated technological, cultural, economic, and political environments, continuously attracting different content resources and channels to bind with it. This trend stimulates our imagination of new means, interfaces, and application scenarios for international communication, and calls for the exploration of more diverse and efficient communication forms and products. International communication therefore urgently needs to explore new meanings and expressive spaces in its interaction with artificial intelligence technology, so as to better adapt to and lead the changes under way in the global digital society.
However, at the same time, the development of generative artificial intelligence has also led to the “automation dilemma,” where machine logic gradually replaces humanism as the dominant epistemology of future information civilization. This may reinforce the existing conflict structures in the field of international communication and create barriers for cross-cultural exchanges. The powerful ability of artificial intelligence to generate images, audio, or video through deep learning technology prompts us to continuously examine the question of “what is real.” As a media actor that can significantly influence the ways information is produced, circulated, and perceived, generative artificial intelligence is defining the existence and action mechanisms of national narratives, local cultural values, and global audience trust in international communication practices.
Therefore, we need to go beyond mere technological analysis and incorporate generative artificial intelligence into the realm of cultural reflection and critique. By revealing the infrastructural and simulated characteristics of artificial intelligence, we can correctly understand its complex affordances and actively engage in its management and application, promoting benevolent international communication practices.
2. Generative AI as Infrastructure for International Communication
Generative artificial intelligence large models can efficiently automate the analysis of internet users and content, as well as awaken and cultivate shared emotions among global media users, thus possessing the potential to become an important infrastructure for constructing new types of international communication content and methods.
Compared with traditional manual statistics, large-model data analysis targets a much larger and more diverse body of data, including social media texts, images, videos, and sensor data. Accurate analysis of this multimedia content usually requires complex techniques such as deep learning, natural language processing, and graph neural networks. International communication actors can therefore use artificial intelligence technology for more effective public opinion analysis, user analysis, and “outbound” content review, supporting the overall deployment of national external communication and enhancing the effectiveness of international communication in the “diverse content arena.” Specifically, Chinese international communication actors can use natural language processing (NLP) and topic modeling techniques (such as LDA) to analyze daily hot topics on overseas social platforms, anticipate which international topics are suitable for dissemination on global social platforms, and actively lead the international communication agenda. Meanwhile, to make communication more precisely targeted, big data analysis can identify users interested in China-related content and analyze the multimodal content on their homepages, refining user profiles and discovering the content types that better stimulate positive opinions and expressions from overseas users. Utilizing artificial intelligence for “outbound” content review is also an important task. 
On the one hand, this is because the field of international communication contains numerous information sources that are difficult to distinguish, which makes verification by artificial intelligence technology itself necessary; on the other hand, because almost all global social media platforms have established stable regulatory models that rely on artificial intelligence for auditing, strengthening support for AI auditing technology helps Chinese “outbound” platforms and content integrate internationally. For instance, in February 2024 Meta set a common technical standard for identifying AI content, under which identified AI videos, audio, and images are labeled “Made with AI.” Content filtering technology based on copyright information databases is also widely applied on foreign social websites such as YouTube. Intelligent auditing that complies with the content dissemination policies of target countries therefore becomes crucial for the legitimacy, compliance, and value recognition of Chinese platforms and content “going abroad.” Chinese internet platforms already have rich, cutting-edge experience in developing artificial intelligence auditing technology. For instance, Toutiao launched its CID system in 2017 to protect short video copyrights; the video-sharing website Bilibili can process 300 screenshots per second in pornography review, where each suspicious image is read, transmitted, computed on, and returned with a result, with an average response time of about 2.2 seconds. The main challenge now is to incorporate the cultural management systems of different countries and the user policies of different platforms into the technical standards for “outbound” content review, establishing a long-term mechanism for regulating international communication content on the basis of artificial intelligence infrastructure.
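The topic-modeling step mentioned above (LDA applied to posts from overseas social platforms) can be illustrated with a minimal sketch. It is not a method described in this article: it assumes posts have already been collected and tokenized, it implements LDA with a simple collapsed Gibbs sampler using only the Python standard library, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy posts are all hypothetical. A production system would more plausibly rely on an established topic-modeling library and far larger corpora.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (phi, theta): phi[k] maps each word to its probability under
    topic k, and theta[d] lists the topic proportions of document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]        # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # total tokens in topic k
    assign = []                                          # topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in vocab}
        for k in range(num_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return phi, theta


# Hypothetical, already-tokenized posts (illustrative only).
posts = [
    ["panda", "zoo", "bamboo", "panda", "cute"],
    ["tariff", "trade", "export", "tariff"],
    ["bamboo", "panda", "zoo", "cute"],
    ["export", "trade", "port", "tariff"],
]
phi, theta = lda_gibbs(posts, num_topics=2)
for k, dist in enumerate(phi):
    top_words = sorted(dist, key=dist.get, reverse=True)[:3]
    print(f"topic {k}: {top_words}")
```

The highest-probability words per topic give an analyst a rough picture of what is trending, and `theta` indicates which posts belong to which topic. Deciding which topics are "suitable for dissemination" remains an editorial judgment that the model does not make.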
Furthermore, generative artificial intelligence also serves as an important “emotional machine” in international communication practices. The digital technology revolution has brought about changes in information exchange methods, continuously dissolving traditional geographical boundaries and related identity constraints, allowing cultures and lifestyles from different regions to achieve widespread dissemination, thus promoting the formation of a networked community ecology bound by emotions and shared experiences. Compared to the past, international communication in the digital age significantly appeals to users’ emotions and feelings, and the ability to successfully awaken emotions and create immersive emotional experiences has become a crucial dimension for measuring communication effectiveness. In practical terms, due to the potential ambiguities and polysemy that complex texts may cause during cross-cultural translation, direct and intuitive emotional expressions often attract international audiences more effectively, which is also why short videos have become a dominant global content product in recent years. Currently, platforms like TikTok and Kwai have had a tremendous impact overseas, drawing significant attention from Western governments such as the United States. Short videos, characterized by their brevity and rapid content delivery, intensively output emotional narratives in just a few minutes or even seconds, narrowing the gap between users and the characters and life experiences depicted in the videos. This also fosters interest aggregation among users, forming emotional clusters that are easier for algorithms to detect and summarize. Artificial intelligence algorithms, leveraging their powerful data processing, analytical capabilities, and learning optimization abilities, can analyze users’ emotions, content preferences, and potential needs, achieving “collaborative filtering” of users. 
In this process, the accuracy of match analysis and personalized recommendations continues to improve. Thus, interpersonal emotional communication, under the intervention of artificial intelligence technology, opens a long cycle across diverse cultural backgrounds. In fact, the successful “outbound” of Chinese short video platforms relies on the continuous “feeding” of domestic user data, promoting the optimization of automated recommendation algorithms and quickly establishing collaborative filtering mechanisms that aggregate the emotional tendencies and interests of overseas users, ensuring that the information accessed by overseas users aligns more closely with their personalized preferences and emotional needs. In this process, artificial intelligence becomes an important infrastructure for connecting cross-cultural consumer groups and a mediating network actor that creates emotional connections. Compared to seeking “value alignment” of technology and social civilization within the framework of policy regulations, technology platforms are more likely to proactively utilize artificial intelligence infrastructure to create “emotional alignment” between individuals and groups, quantifying and aggregating users’ emotional needs, and enhancing user content experiences and platform usage habits through the efficient preference analysis and prediction capabilities of artificial intelligence technology. This also creates conditions for international communication aiming to evoke “global empathy.”
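The “collaborative filtering” described above can be made concrete with a small sketch. It shows user-based collaborative filtering with cosine similarity, one common variant of the technique, and is not a description of any platform’s actual recommendation system. The interaction scores, user names, and the functions `cosine` and `recommend` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse {item: score} profiles."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank items the user has not seen by similarity-weighted scores."""
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        if sim <= 0:
            continue  # ignore users with no overlapping taste
        for item, score in their.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Hypothetical engagement scores (1 to 5) standing in for likes, watch time, etc.
ratings = {
    "domestic_user_1": {"opera_clip": 5, "street_food": 4, "panda_vlog": 5},
    "domestic_user_2": {"street_food": 5, "dance_challenge": 4},
    "overseas_user": {"opera_clip": 5, "street_food": 4},
}
print(recommend(ratings, "overseas_user"))
```

In this toy data the overseas user’s profile overlaps most with `domestic_user_1`, so that user’s unseen item ranks first. This mirrors the point made above: interaction data accumulated from existing users lets the system infer the likely preferences of new overseas users.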
3. Generative AI as a “World Simulator”
Digital technology is not only changing traditional cultural production methods but also carving out broader space for global cultural dissemination. Currently, generative artificial intelligence is driving the cross-cultural adaptation and large-scale “outbound” of multimodal cultural products, enabling local cultural products to connect more efficiently with global markets, becoming a key technological engine for generating and evolving global cross-cultural virtual spaces.
AI-generated content (AIGC) refers to a new mode of content production that uses generative artificial intelligence to produce content, which can be categorized by modality into images, text, audio, video, and so on. According to statistics from the investment firm a16z, the global annual revenue of AIGC products in image generation, copywriting, and code writing has exceeded $100 million. Reports from various authoritative research institutions agree that AIGC has immense potential to cover every application scenario in international communication, from multimedia content production to overseas dissemination, and that it can drive the innovative participation of diverse stakeholders in the industry on the basis of intelligent industrial clusters. In content production for cross-cultural communication, mainstream media in China have begun to explore innovative applications of generative artificial intelligence in video production. Since the beginning of 2024, China Media Group has successively launched AI video works such as “Qianqiu Shisong.” AI-generated virtual anchors have also become an innovative content production practice worldwide, such as the multilingual AI anchor Sana developed by the India Today Group, and the realistic AI anchors and original digital human anchors launched by Chinese media. In addition, artificial intelligence can simplify the distribution of content to overseas audiences and raise production efficiency through AIGC. With the expansion of the global online literature and web drama market, domestically produced original cultural content is rapidly gaining global popularity, and artificial intelligence translation has become an indispensable support for this trend. 
For instance, WebNovel, a portal operated by the Chinese company Yuewen, has begun using artificial intelligence to translate works into multiple languages, improving its handling of traditional Chinese cultural vocabulary and idioms through continuous training and vocabulary accumulation. According to the 2023 China Online Literature “Outbound” Trend Report, with the assistance of generative artificial intelligence, multilingual translation can break through the constraints of productivity and cost, increasing the translation efficiency of online literature nearly a hundredfold and reducing costs by over 90%. Besides saving labor costs, artificial intelligence applications also arouse considerable curiosity among international audiences as a new productive force. From the perspective of the long tail effect, however, international communication still needs to use generative artificial intelligence to create more diverse experiences for users and further deepen their emotional engagement.
In fact, even before generative artificial intelligence gained widespread attention, we had already seen how virtual reality and sensory immersion significantly impact individual empathy. The foundational logic is to shape specific virtual environments, allowing individuals to deeply enter different cultural worlds and gain new perceptions and aesthetic experiences. Artificial intelligence technology can not only assist virtual videos in achieving better audiovisual effects, thus being widely applied in the upgrading of media formats such as games, movies, and animations, but it can also use low-threshold AIGC technology to create virtual scenes for individuals, supporting the storytelling of digital narratives and stimulating public participation, thereby mobilizing a broader range of international communication actors and invigorating the emergence of richer cultural scenes, enhancing the vitality of cross-cultural content production. In this regard, Edward W. Soja’s concept of “Third Space” offers epistemological insights. The so-called “Third Space” refers to a heterogeneous space that merges reality and imagination, serving as a flexible presentation strategy. As a transitive concept, the essence of “Third Space” is its ability to re-coordinate the integration of cultural representations, continually self-replicating to ensure the inclusion of different heterogeneous “others,” making discussions and renegotiations about boundaries and cultural identity possible. Following this definition, the metaverse virtual space constructed by generative artificial intelligence has the potential to become a “Third Space” that transcends national and community boundaries, promoting the coexistence and integration of heterogeneous cultures.
For example, by using extended reality (XR) technology based on artificial intelligence algorithms, actors can create more immersive scenarios for cross-cultural communication. In the massively multiplayer online virtual reality platform VRChat, individuals can communicate in the first person with players from different cultural backgrounds through 3D avatars, building closeness and understanding. By combining augmented reality (AR) with the creative arts, digital technology can organically integrate artificial virtual scenes with real-world settings, evoking different cultural and emotional experiences and creating a “parallel world” that transcends the boundary between reality and imagination. At present these technologies are mostly applied in cultural dissemination scenarios such as AR tour guides that add dynamic effects to cultural relics. As devices like Apple’s Vision Pro develop, however, cross-cultural communication actors will be able to create collectively in more open spaces, constructing more integrated experiences in which virtual and real cultures fuse.
It is important to emphasize that, in the multimodal communication field driven by AIGC, video generation technology pushes sensory experience furthest and has broad prospects in international communication for evoking empathy among cross-cultural groups. Currently, the most representative video generation technology is the text-to-video model Sora, which OpenAI unveiled in early 2024. In terms of technical feasibility, it is expected to combine with virtual technologies such as VR and AR in the future and to connect to wearable devices, directly generating tangible 3D virtual events. We can envision users entering virtual spaces and interacting with scenes and events generated in real time, turning virtual viewing into practical action and aiding effective information production and news dissemination. Scenes constructed and refined by AI are also more conducive to stimulating user empathy, allowing audiences to continuously form new understandings of themselves and of external social power structures through experiences that blur the line between the real and the virtual. On its official website, OpenAI describes video generation models such as Sora as a promising path toward building “general purpose simulators of the physical world,” which suggests that Soja’s vision of the “Third Space” is on a solid path of technological evolution toward becoming a reality. The potential of generative artificial intelligence to simulate and interact with the world can become a new starting point for innovation in the concepts and practices of international communication, helping us explore how to construct virtual parallel worlds and advance global cultural integration.
However, the basic logic of political economy still plays a crucial role. Stanford University’s “AI Index Report” points out that in 2023, 61 notable AI models originated from American institutions, surpassing the 21 from the European Union and the 15 from China, and that private investment in artificial intelligence in the U.S. was nearly nine times that of China. This poses significant obstacles to eliminating the implicit biases of artificial intelligence and the conservative power structures in cultural dissemination. Judging by the speed of domestic technological iteration, though, every modality is still at a low level of maturity and iterating quickly, with breakthroughs in text and image generation occurring almost every one to two months, as the aforementioned Vidu large model shows. Therefore, although Sora has not yet been publicly deployed and very little information about its algorithms has been disclosed, China can still break down technological barriers and localize data training in order to proactively lay out the application concepts and exploratory practices of AI audiovisual generation technology and its related industries. As technological conditions mature, the blueprint for building a virtual “Third Space” can be transformed into a more influential cultural undertaking that attracts a wide range of international communication actors.
4. Reflection: The Significance and Practice of Co-Governance by Diverse Subjects
News communication is a human endeavor, following the value principle centered on human subjectivity. International communication is based on the mutual observation and understanding of cross-cultural human individuals, representing a cultural practice that seeks the coordination of human subjectivity. In the current context where generative artificial intelligence is increasingly becoming a key actor in the global digital media ecosystem, how to construct an international communication structure based on the principles of human subjectivity and participation and co-governance has become an urgent issue.
In reshaping international communication practices, generative artificial intelligence has given rise to or intensified a series of value-related issues. One example is the ethical problem associated with automated content generation and “deepfakes.” We must pay close attention to deepfakes in international communication practice not only because they generate false information but also because they can create a mechanical, systematic AI illusion that leads people to lose their trust in authenticity and their sense of history. When multimodal AIGC can be seamlessly combined with immersive virtual experiences, investing in and believing in the virtual world may become an escapist choice for individuals in the real world. Under such profound sensory anesthesia, how can we ensure that humanity maintains its concern for, and its ability to understand, reality? Furthermore, the powerful text learning and multimodal generation capabilities of generative artificial intelligence can easily lead to the problem of “knowledge interruption” in complex audiovisual content production. When artificial intelligence can fully replace us in deciding which images and angles to present in a given scene, which sound and lighting elements to arrange, and what logic to follow in editing, the primary creative activities that express human subjectivity will stagnate, and the willingness of many amateur or semi-professional creators (who are also media actors with significant cultural potential) to learn and use digital production technologies will continue to decline. For international communication, this carries the risk of losing cultural self-awareness and may perpetuate cultural stereotypes.
Therefore, this article advocates the “human-in-the-loop” concept, promoting the active insertion of human subjectivity values in various aspects of cultural production and circulation, and encouraging human ethics to take on the responsibility of regulating the artificial constructs of intelligent technology. In practical terms, the starting point for implementing the “human-in-the-loop” concept is to place high importance on the “value alignment” of technology research and application, which means promoting the values achieved by artificial intelligence in task execution to remain consistent with the values that users care about. Currently, artificial intelligence technology companies such as OpenAI and DeepMind, acquired by Google, have established teams focused on solving the “value alignment” issue, and many third-party supervisory bodies, standards organizations, and government organizations are also involved. In simple terms, “value alignment” is a benign communication and collaboration mechanism dominated by human values between dynamic machines and dynamic users. Ideally, this collaborative mechanism should be co-designed by research institutions, enterprises, users, academic research institutes, and governments. However, it is important to note that the value principles established by different “value alignment” teams cannot be universally applicable across different contexts. In the complex global discourse environment, ethics and values themselves exist in different directions and scales, involving various factors such as different national political stances and transnational capital interests. Due to complex historical and political reasons, the cutting-edge design standards of generative artificial intelligence technology are primarily controlled by Western countries, while platforms like TikTok face political scrutiny overseas, revealing the significant limitations of transnational ideological contexts on media technology applications. 
This situation often renders the “value alignment” of artificial intelligence merely a realignment with the development perspective of “Western centrism,” exacerbating the unfair circumstances of international marginalized groups and creating a global trust crisis. Sora has raised concerns about cultural diversity due to its undisclosed training data sources: the “Western-centric” training data for large models may lead domestic communication actors using this technology into dilemmas of cultural adaptation and expression. The realization of “human-in-the-loop” requires breaking the existing monopoly of discourse power and providing space for showcasing the cultural differences and characteristics of different countries in ethical regulation. As Simondon pointed out, “Humanity should always be the eternal organizer of a technological object society, like a conductor in an orchestra.” International communication practices should always regard the technological system related to generative artificial intelligence as a basic framework for repairing cross-cultural interaction relationships, ensuring the joint progress of social culture, ethical values, policy systems, legal regulations, and technological systems.
The Czech-born Brazilian media theorist Vilém Flusser argues that images before the birth of writing were depictions of nature, whereas the emergence of linear text created conceptual thinking and made subsequent images a visualization of texts. The visualization performed by technical images therefore differs fundamentally from the depiction of objects and reflects the omnipresent discursive power behind modern visual representation. Applications like Sora, which generate multimodal AI content, lack embodied organs of perception and so cannot imitate nature directly; they merely re-virtualize digital codes. Their conceptual understanding of the world is thus not universally inclusive, and we need to be more vigilant about the data structures that underpin multimodal generation by artificial intelligence. If we cannot promote the cultural diversity of training databases, we will be unable to break the monopoly on cultural representation held by the traditional “second space,” and it will be difficult to construct an inclusive “third space” in which center and periphery can flow into each other in cross-national communication.
5. Conclusion
Generative artificial intelligence large models are important infrastructure for future international communication practices. Enterprises, academia, and governments should collaborate to build autonomous technology design standards that reflect the characteristics of each country, formulate relevant data collection policies and laws, and facilitate the implementation of inclusive global artificial intelligence norms through active cross-cultural dialogue. Breaking existing communication biases and power structures is a prerequisite for achieving greater autonomy in technology, applications, and models. China is currently promoting the independent research and development of artificial intelligence technology vigorously. For instance, Baidu has released a knowledge-enhanced large language model called “Wenxin Yiyan,” built on its PaddlePaddle platform, and Alibaba has launched a large language model called “Tongyi Qianwen.” Both companies say they will integrate large language models into a range of future projects. These R&D initiatives provide a solid foundation for the autonomous construction of artificial intelligence technology norms in China and highlight the significance of diverse social subjects participating in the co-construction of artificial intelligence infrastructure.
At the same time, under the new system of international communication driven by generative artificial intelligence, we should conceptually transcend the simplistic binary of technological accelerationism and technological threat theory. In practice, we must strive to gain international first-mover advantages while preventing automated systems from undermining human freedom, comprehensive development, and the diverse evolution of human civilization. International communication in the digital age not only needs to broaden the “outbound” paths of local cultures but also emphasizes refining the production and dissemination needs of cultural creativity, establishing a cross-field and cross-disciplinary platform for theoretical resource exchange, consciously guiding the foundational research and industry applications of artificial intelligence, and actively facilitating the development of diverse patterns of international communication. Ultimately, we should coordinate the goals of constructing a global community of shared human destiny with advancing the economic development of artificial intelligence, creating a broad “third space” for human civilization exchanges and mutual learning in both real and virtual dimensions.