How ChatGPT Changes Knowledge Production

ChatGPT, developed by the American artificial intelligence research company OpenAI, has attracted worldwide attention since its launch, sparking extensive reflection and discussion. ChatGPT is a large language model (LLM) that learns autonomously from data and can intelligently generate complex text. At present, however, ChatGPT and other large language models cannot guarantee the accuracy of the text they generate, and may even distort scientific facts and spread misinformation. Nevertheless, it is foreseeable that this type of artificial intelligence-generated content (AIGC) technology will profoundly affect how knowledge is defined, how it is produced, and the status of technical personnel. Reporters interviewed scholars on these topics.

ChatGPT Crosses New Thresholds in AI Technology

Sam Illingworth, Assistant Professor in the Department of Learning and Teaching Enhancement at Edinburgh Napier University, said in an interview that it is difficult to predict what impact ChatGPT will have; it depends entirely on how people choose to use it. Besides teaching people how to use ChatGPT to work more effectively, we should also carefully consider the technology's limitations to ensure it is used ethically. In Illingworth's view, ChatGPT is a form of weak artificial intelligence: an intelligent machine incapable of genuine reasoning and problem-solving. Such technology will remain an important part of people's lives and work in the future. Humans need not overreact, however, as weak AI is meant to assist rather than "replace our jobs."

Kentaro Toyama, Professor at the School of Information at the University of Michigan, believes that the series of AI technologies represented by ChatGPT is double-edged. On the positive side, machines may take over much of human work; on the negative side, ChatGPT may cause multiple harms. Specifically, it may amplify various kinds of fraud, lead to more unemployment, and disrupt everyday interpersonal relationships. Toyama argues that powerful technologies require strict regulation, and that innovation should be slowed to allow time for reflection and for formulating appropriate regulatory policies.

Technicians Are Replacing Experts in Creating Authoritative Knowledge

Blayne Haggart, Assistant Professor of Political Science at Brock University in Canada, notes that for centuries people have equated knowledge with science, granting reason and theory a superior status. A theory is a set of ideas about how the world works, and it shapes the environment that influences human behavior. Science aims to test and refine these ideas against the real world in order to understand it. Theories are always constrained by limited human perception and thus struggle to overcome their biases and limitations; yet by critically examining theories and methods, humanity hopes to deepen its understanding of the world. In recent years, with the advance of digitization, expertise in data collection and processing has come to represent the cutting edge of many industries. ChatGPT continues this trend and further confirms the rise of a notion: if enough data can be collected and sufficiently powerful computing is available, authoritative knowledge can be "created." Under this notion, technicians rather than scientists are seen as the most knowledgeable individuals, and the authority of knowledge no longer belongs to experts in various fields but to those who can create and manipulate digital data. In short, knowledge itself is being redefined.

In Haggart's view, in contrast to scientific thinking, with its emphasis on theory building and context-specific knowledge, ChatGPT and the thinking underlying it equate knowledge not with scientific understanding but with correlation. It reflects the ideas of technicians rather than those of scientists. Gaining knowledge through correlation points to an underlying tendency of big data and AI technologies: given enough data and sufficient computing power, computers can identify all manner of correlations without any theory. Like all machine learning models, ChatGPT breaks words, sentences, paragraphs, and texts down into data, seeking patterns of words and sentences that co-occur in specific contexts. The model is adaptive and can produce complex outputs, representing a victory of technology and funding. Fundamentally, however, ChatGPT remains a pattern recognition technology: it uses computational methods to categorize samples based on their characteristics.
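To make this point concrete, the Python sketch below (an illustration only, not OpenAI's actual training pipeline; the toy corpus and window size are assumptions chosen for demonstration) shows the kind of theory-free, correlational pattern finding Haggart describes: it simply counts which words co-occur near one another in a small body of text.

from collections import Counter

# Toy corpus standing in for the vast text data a language model is trained on.
corpus = [
    "data is the new oil",
    "data drives modern science",
    "science tests theory against data",
]

def cooccurrences(sentences, window=2):
    """Count pairs of words appearing within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                counts[tuple(sorted((left, right)))] += 1
    return counts

# The result is pure correlation: the program "knows" which words tend to
# appear together, but holds no theory about what any of them mean.
for pair, count in cooccurrences(corpus).most_common(5):
    print(pair, count)

Real large language models operate at vastly greater scale and learn statistical representations rather than raw counts, but the epistemic point stands: the output is driven by patterns of co-occurrence, not by a tested theory of the world.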

The Real Threat Lies in Dataism

José van Dijck, Professor of Media and Digital Society at Utrecht University in the Netherlands, calls the belief that data can explain everything "dataism." Businesses, scholars, and governments firmly believe that data provides an objective, neutral, even "revolutionary" way to increase profits, understand society, and conduct national affairs. Van Dijck notes that data has never existed independently: everything about data, from its selection and collection to its storage and use, is shaped by human limitations and by the environment in which it is collected and used. Data generated by humans carries inherent limitations, which also means that the scientific understanding computers appear to exhibit is an illusion. In scientific thinking, the legitimacy of knowledge depends on whether scientists draw conclusions according to agreed-upon methods and advance theories; this is the process of creating knowledge. By contrast, the process of machine learning is so complex that even those who operate it often find its inner workings a mystery.

Machine learning itself is not the problem; dataism is the real threat. Illingworth states that the worldview dataism promotes is very dangerous. Inaccuracies in data generation and collection have already led to many scandals. Until far more capable AI emerges, the methods of data collection and analysis will always be entangled with human involvement, meaning that data errors and misuse remain possible. The emergence of ChatGPT is a timely reminder that we need to think more critically about how data is collected, analyzed, and used, and that we should not treat content generated from any dataset as absolute fact.

Haggart believes that ChatGPT turns all manner of material into data and uses that data without the explicit informed consent of its creators. Large language models like ChatGPT raise two problems. First, non-experts find it difficult to evaluate the output. Second, too much trust is placed in correlation: when ChatGPT relies on correlation to produce knowledge, it ultimately generates conventional wisdom and popular opinion, which fall far short of accuracy.

Identifying the Power that Controls Knowledge Production

Of course, science will not disappear in a world dominated by dataism, but the hierarchy of knowledge will change. Haggart states that we need to examine which groups are regarded as the important creators and owners of knowledge. Different forms of knowledge position different groups accordingly, and different types of knowledge workers, such as scientists and technicians, define and use knowledge in different ways. In a data-driven world, the power to control knowledge lies with those who can mobilize the resources to collect and deploy data and computing power and to create the algorithms machine learning requires. Their authority rests on the dataist belief that data and its collection are neutral and that the machines they build will produce authoritative, useful knowledge, reflecting the dataist view of knowledge.

The simple interface of ChatGPT creates an illusion: its output seems to appear out of thin air, supplied by a neutral machine. Yet algorithms, computer programs, and machine learning processes are all explicitly designed by people, and the power to design knowledge production is an ultimate power. Mary L. Gray, a researcher at Harvard University's Berkman Klein Center for Internet and Society, warns that so-called AI systems always involve behind-the-scenes workers who make decisions about how content or data is evaluated; that is the view from inside. Seen from outside, the absence of a scientific validation process and the reliance on technology turn people into passive recipients of AI, feeling awe rather than understanding toward this form of knowledge. In effect, this deprives individuals of the ability to understand, question, and challenge it.

Lacking the ability to assess the accuracy of the knowledge production process, people can only evaluate the output. To do so, they need a form of knowledge that goes beyond correlation. Haggart argues that the danger of dataism lies in the unexamined assumption that algorithms determine everything and that algorithmic programs will produce sound knowledge. This is not the case. In 2016, Microsoft launched the AI chatbot Tay, which was taken offline after just one day for making racist comments. The output of ChatGPT depends entirely on the choices of its team of "trainers," who decide which data and datasets matter and design algorithms to follow the correlation rules they have set. In short, whoever controls the design process of the chatbot shapes the final product.

Making Machine Learning Serve the Public

ChatGPT does not merely automate writing or research; it creates an entirely new form of knowledge, granting legitimacy to correlation while burying the assessment of that correlation's validity in behind-the-scenes programming decisions. This approach relegates scientific understanding to a secondary position, where science can at best play an evaluative role. Such entrenched interests pose significant obstacles to ensuring that machine learning develops in line with the public interest. Haggart suggests guarding against dataism and cultivating the habit of making machine learning serve the people.

First, given the complexity and opacity of the machine learning process, humans must serve as the directly responsible decision-makers at every stage. Both decision-makers and those affected by machine learning must be able to understand and explain any decision made in the automated process. Machine learning should be a supplement to, not a replacement for, human functions.

Second, discussions about data rights in the creation of large datasets need to go beyond concerns about personal identity data. The data rights of artists, writers, and ordinary people should be respected, since their works and expressions form the foundation of large language models, and the emergence of ChatGPT directly affects this group.

Finally, companies like OpenAI must be prevented from treating the public as test subjects for their marketing campaigns. Articles about ChatGPT are currently flooding every platform, and even academic experts struggle to grasp the technology's implications. Scrutinizing these companies now is essential to guard against "malicious" behavior.

For centuries, people have viewed science as the union of knowledge formation and rational thinking, committed to rigor and transparency in producing and verifying knowledge. The current spread of dataism, however, seems to break with these long-standing methods of knowledge production and with the scientific ideal. Whether people are willing to maintain their commitment to science and make technology serve scientific understanding is the core question in the debate over artificial intelligence. ChatGPT cannot answer this question, but the answer lies hidden in people's attitudes toward it.

Source: China Social Sciences Today
