█ What Exactly Is AI?

AI is an abbreviation for artificial intelligence. “Artificial” is sometimes misread as an adjective related to art; it actually means “man-made,” the opposite of natural. “Intelligence” is harder to misinterpret: it denotes the capacity to perceive, reason, and learn. (Coincidentally, its first five letters spell the name of Intel Corporation, although Intel officially derives its name from “integrated electronics.”) Combined, artificial intelligence is intelligence created by human means.

There are many definitions of AI in the industry. A more academic one states: AI is a comprehensive science that studies and develops theories, methods, technologies, and application systems used to simulate, extend, and enhance human intelligent behavior. This definition is quite convoluted, so let’s break it down. First, AI is in essence a science, a technical field; it draws on many disciplines, including computer science, mathematics, statistics, philosophy, and psychology, but overall it falls under computer science. Second, the research goal of AI is to give a “system” intelligence, where the system can be a software program, a computer, or even a robot. Third, what level of capability counts as true intelligence? This is the crux of the issue. Currently, being able to perceive, understand, think, judge, and make decisions like a human is considered the realization of artificial intelligence; coupled with a physical carrier such as a robot or a mechanical arm, AI can also move and act. With these three points in hand, the definition becomes much easier to understand.

█ What Is the Difference Between AI and Ordinary Computers?

AI still rests on the fundamental operations of computers, using semiconductor chip technology (hence the label “silicon-based”) and conventional computer systems and platforms. So how does it differ from traditional computer programs?

A traditional program is essentially a set of rules. Programmers state the rules in code, and the computer processes input data according to those rules. Take the classic “if…else…” statement: “If over 65 years old, then retire; otherwise, continue working.” The program then judges every input age against this rule. In real life, however, many inputs (such as images and sounds) are so complex and varied that fixed rules cannot deliver highly accurate judgments. Consider recognizing a dog in a photo: there are many breeds, each with different colors, sizes, and facial features; dogs strike different expressions and postures at different moments; and they appear against different backgrounds.
As a result, the dog images a camera can capture are effectively infinite, and no finite set of hand-written rules can cover them all. To give computers human-like intelligence, simple rule-driven approaches are inadequate; instead, we should teach them the way we teach children: keep feeding them data together with answers, and let them extract the features and form judgment rules of their own.
In other words, in classic program design, people input rules (i.e., programs) and data, and the system outputs answers.
In contrast, AI computation consists of two steps. In the first step, we input data together with the expected answers, and the system outputs rules. In the second step, we apply those learned rules to new data and obtain answers. A minimal sketch of the two paradigms follows.
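Here is a toy illustration of the difference, assuming scikit-learn is installed; the retirement rule and the numbers are invented for the example.

```python
# Classic programming: the programmer writes the rule by hand.
def retirement_rule(age):
    return "retire" if age > 65 else "continue working"

# Machine learning: step 1 ("training") takes data plus expected
# answers and produces the rules (a model).
from sklearn.tree import DecisionTreeClassifier

ages = [[40], [58], [66], [70], [30], [68]]          # input data
answers = [0, 0, 1, 1, 0, 1]                         # expected answers (1 = retire)
model = DecisionTreeClassifier().fit(ages, answers)  # rules are learned, not written

# Step 2 ("working"): apply the learned rules to new data.
print(retirement_rule(67))      # rule supplied by a human
print(model.predict([[67]]))    # rule discovered from data -> [1]
```

In the first paradigm a human wrote the rule; in the second, the decision tree recovered an equivalent rule purely from examples.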
The first step is what we call “training”; the second step is the actual “working” (inference) phase. This is a typical difference between traditional programs and today’s mainstream AI. (Note the word “mainstream”: some historical and non-mainstream approaches to AI work differently and cannot be generalized this way.)

█ What Categories of AI Exist?

As mentioned earlier, artificial intelligence is a vast scientific field. Since its formal inception in the 1950s, many scientists have researched AI extensively and produced remarkable results. Based on their theoretical directions, these efforts are grouped into schools, the best known being the symbolist, connectionist, and behaviorist schools. No school is simply right or wrong, and they overlap in places. In the early decades (roughly the 1960s to the 1990s), symbolic AI, represented by expert systems and knowledge graphs, was the mainstream; from the 1980s onward, connectionism, represented by neural networks, rose to prominence and has dominated ever since.
In the future, new technologies may emerge and form new schools. Besides research direction, we can also classify AI by level of intelligence and by application domain. By level of intelligence, AI is commonly divided into weak AI, strong AI, and super AI. Weak AI specializes in a single task or a group of related tasks and lacks general intelligence; this is the stage we are at today. Strong AI is more advanced, possessing a degree of general intelligence that lets it understand, learn, and apply itself across varied tasks; it remains theoretical and has not been realized. Super AI is the most powerful of all, surpassing human intelligence in almost every respect, including creativity and social skills; it is the hypothetical ultimate form AI might reach in the future. Classification by application domain is discussed later.

█ What Is Machine Learning?

We already touched on machine learning when discussing how systems summarize rules. Its core idea is to build a model that learns from data and then to use that model for predictions or decisions. Machine learning is not one specific model or algorithm; it encompasses several styles of learning:

- Supervised learning: the algorithm learns from a labeled dataset, in which every training sample has a known outcome.
- Unsupervised learning: the algorithm learns from an unlabeled dataset.
- Semi-supervised learning: a small amount of labeled data is combined with a large amount of unlabeled data for training.
- Reinforcement learning: through trial and error, the algorithm learns which actions bring rewards and which bring penalties.

A small sketch contrasting the first two styles appears below.
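Again a toy sketch, assuming scikit-learn is installed; the six data points are invented.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = [[1.0], [1.2], [0.9], [8.0], [8.3], [7.9]]  # six one-dimensional samples

# Supervised: every sample arrives with a known label.
y = [0, 0, 0, 1, 1, 1]
clf = LogisticRegression().fit(X, y)
print(clf.predict([[8.1]]))   # -> [1]

# Unsupervised: no labels; the algorithm groups the data on its own.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)             # two clusters discovered from the data alone
```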
█ What Is Deep Learning?

Deep learning refers, specifically, to learning with deep neural networks, and it is an important branch of machine learning: within machine learning there is a “neural network” route, and deep learning is the enhanced version of it. Neural networks are the banner of connectionism; as the name suggests, this route imitates the working principles of the human brain, building models of the connections between neurons to perform artificial neural computation.

The “depth” in deep learning refers to the number of hidden layers in the network. Classic machine learning algorithms used neural networks with an input layer, one or two hidden layers, and an output layer; deep learning algorithms stack many more hidden layers, sometimes hundreds. This greatly increases their capability, letting neural networks tackle far more complex tasks. [Diagram omitted: deep learning is a subset of neural network methods, which are in turn a subset of machine learning.] The sketch below shows what “adding depth” looks like in code.
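A minimal sketch, assuming PyTorch is installed; the layer widths (784, 128, 10) are arbitrary illustrative choices.

```python
import torch.nn as nn

# A classic, shallow network: input -> one hidden layer -> output.
shallow = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),   # the single hidden layer
    nn.Linear(64, 10),
)

# A "deep" network is the same idea with many more hidden layers.
layers = [nn.Linear(784, 128), nn.ReLU()]
for _ in range(20):                  # 20 hidden layers here; real models go far deeper
    layers += [nn.Linear(128, 128), nn.ReLU()]
layers += [nn.Linear(128, 10)]
deep = nn.Sequential(*layers)

print(sum(p.numel() for p in shallow.parameters()))  # tens of thousands of parameters
print(sum(p.numel() for p in deep.parameters()))     # hundreds of thousands
```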
█ What Are Convolutional Neural Networks and Recurrent Neural Networks?

Since neural networks rose to prominence in the 1980s, many models and algorithms have been developed, each with its own characteristics and uses. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are well-known models that emerged around the 1990s. Their inner workings are complex, so just remember this: CNNs process data with a grid-like structure (such as images and video), so they are typically used in computer vision for tasks such as image recognition and image classification. RNNs process sequential data, such as language and time series, so they are often applied in natural language processing and speech recognition.

█ What Is a Transformer?

The Transformer is also a neural network model. It is younger (introduced by a Google research team in 2017) and more powerful than CNNs and RNNs. As a non-specialist, you need not dig into its working principles; it is enough to know the following (a toy version of point 2 is sketched after the list):

1. It is a deep learning model.
2. It employs a mechanism called self-attention.
3. It effectively overcomes the bottlenecks of CNNs and RNNs.
4. It is well suited to natural language processing (NLP) tasks. Compared with RNNs, its computation can be highly parallelized, which simplifies the model architecture and greatly improves training efficiency.
5. It has also spread to other fields, such as computer vision and speech recognition.
6. Most of the large models we talk about today are based on Transformers.
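For the curious, here is a caricature of self-attention in plain numpy: every position in a sequence looks at every other position at once, which is exactly why the computation parallelizes so well. All dimensions and weights here are made up.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # how much each token attends to every other
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    return weights @ V                            # each output mixes info from all positions

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                      # a "sentence" of 5 tokens, 16 dims each
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)        # -> (5, 16)
```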
There are many more types of neural networks. [Diagram omitted: an overview chart of neural network architectures, found online.]
█ What Is a Large Model?

The recent AI boom is largely driven by the rise of large models. So what exactly is a large model? It is a machine learning model with a massive number of parameters and a complex computational structure. Parameters are the variables the model learns and adjusts during training; they determine the model’s behavior and performance, as well as its cost and its demand for computational resources. In simple terms, parameters are the parts the model uses to make predictions or decisions. Large models typically have parameters numbering in the billions or more; models with far fewer parameters are considered small models, and for some niche fields or scenarios a small model is entirely sufficient. Large models require large-scale data for training and consume considerable computational resources.

There are many kinds of large models. The ones usually referred to are large language models (trained on text), but there are also large vision models (trained on images) and multimodal large models (trained on both text and images). The core structure of most large models is the Transformer and its variants. The sketch below gives a rough feel for what “billions of parameters” means.
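A back-of-the-envelope sketch; the widths and layer counts are illustrative, loosely inspired by published GPT-3 dimensions (width 12288, 96 layers), not an exact accounting.

```python
def linear_params(d_in, d_out):
    # One fully connected layer stores a weight matrix plus a bias vector.
    return d_in * d_out + d_out

# A hypothetical small model: a dozen modest layers.
small = 12 * linear_params(512, 512)
print(f"{small:,}")    # about 3 million parameters

# A hypothetical large model: 96 blocks of wide layers, several matrices per block.
large = 96 * 4 * linear_params(12288, 12288)
print(f"{large:,}")    # tens of billions of parameters
```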
By application domain, large models divide into general-purpose and industry-specific models. General-purpose models are trained on broader datasets covering a more comprehensive range of fields; industry-specific models, as the name suggests, are trained on data from a particular industry and applied in specialized areas such as finance, healthcare, law, or manufacturing.

█ What Is the Essence of GPT?

GPT-1, GPT-2, GPT-3, GPT-4, and so on are large language models released by the American company OpenAI, all based on the Transformer architecture. GPT stands for Generative Pre-trained Transformer. “Generative” means the model can generate coherent, logical text: completing conversations, telling stories, writing code, or composing poems and songs. Here it is worth mentioning the term AIGC, AI Generated Content, which covers text, images, audio, video, and more. The GPT series focuses on text; Google has launched a well-known counterpart, BERT. For text-to-image generation, notable examples include DALL·E (also from OpenAI), Midjourney (widely known), and Stable Diffusion (open source). For text-to-audio (music) there are Suno, Stable Audio Open (open-sourced by Stability AI), and Audiobox (Meta). For text-to-video there are Sora (OpenAI), Stable Video Diffusion (open-sourced by Stability AI), and Soya (open source). Images can also be turned into video, as with Tencent’s Follow-Your-Click.
AIGC is a definition from the “application dimension”; it is not a specific technology or model. Its emergence expands what AI can do, breaking the earlier pattern in which AI was used mainly for recognition and broadening its application scenarios.

Now for the second letter of GPT: P, for Pre-trained. It indicates that the model is first trained on a large-scale unlabeled text corpus to learn the statistical regularities and underlying structure of language. Through pre-training the model acquires a degree of generality, and the larger the training data (web text, news, and so on), the stronger its capabilities. The toy sketch below caricatures this idea of extracting statistical patterns from raw text.
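Real pre-training teaches a neural network to predict the next token across trillions of words; this pure-Python bigram counter is only a cartoon of the same idea, with an invented twelve-word corpus.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Pre-training": count which word tends to follow which.
counts = defaultdict(Counter)
for w, nxt in zip(corpus, corpus[1:]):
    counts[w][nxt] += 1

# "Generation": repeatedly emit the most likely next word.
word, sentence = "the", ["the"]
for _ in range(4):
    word = counts[word].most_common(1)[0][0]
    sentence.append(word)
print(" ".join(sentence))   # -> "the cat sat on the"
```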
Public attention to AI stems mainly from the explosive popularity of ChatGPT in early 2023. The “Chat” in ChatGPT means exactly that: chatting. ChatGPT is an AI conversation service that OpenAI built on the GPT model (roughly speaking, on GPT-3.5). Through this service, people could personally experience the power of the GPT model, which proved an excellent way to promote the technology. OpenAI’s strategy clearly succeeded: ChatGPT captured the public’s attention and propelled the current wave of AI development.

█ What Can AI Do?

The applications of AI are extremely broad. Compared with traditional computer systems, the expanded capabilities AI provides boil down to image recognition, speech recognition, natural language processing, and embodied intelligence.

Image recognition, often classified under computer vision (CV), enables computers to understand and process images and video. Common applications include smart cameras, industrial quality inspection, and facial recognition. A minimal sketch follows.
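For a taste of how accessible this has become, here is a sketch using a pretrained CNN, assuming PyTorch and torchvision are installed; the file name dog.jpg is a hypothetical stand-in for any photo.

```python
import torch
from PIL import Image
from torchvision import models
from torchvision.models import ResNet18_Weights

weights = ResNet18_Weights.DEFAULT          # downloads pretrained weights on first run
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()           # matching resize/crop/normalize pipeline

img = preprocess(Image.open("dog.jpg")).unsqueeze(0)   # hypothetical input photo
with torch.no_grad():
    probs = model(img).softmax(dim=1)
idx = int(probs.argmax())
print(weights.meta["categories"][idx], float(probs[0, idx]))  # e.g. a dog breed label
```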
Speech recognition means understanding and processing audio to extract the information it contains. Common applications include mobile voice assistants, telephone call centers, and voice-controlled smart homes, mostly in interactive scenarios.

Natural language processing, introduced earlier, enables computers to understand and work with human language, knowing exactly what we are saying. It is hugely popular and widely used in creative work: writing news articles, drafting documents, video production, game development, and music creation. The sketch below shows how little code a basic NLP task takes today.
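A minimal sketch, assuming the Hugging Face transformers library is installed; it downloads a small default English sentiment model on first use.

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")   # loads a default pretrained model
print(classifier("AI makes drafting reports so much faster!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```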
Embodied intelligence refers to AI embedded in a physical form (a “body”) that interacts with the environment to acquire and demonstrate intelligence. Robots equipped with AI fall under embodied intelligence. “Mobile ALOHA,” introduced by Stanford University in early 2024, is a typical household embodied robot: it can cook, brew coffee, and even play with cats, and it went viral online. It is worth noting that not all robots are humanoid, and not all robots use AI.

[Image omitted: humanoid robots]

AI excels at processing vast amounts of data, learning and training on massive datasets and accomplishing tasks that humans cannot; in other words, it finds latent patterns in data at scale. Today AI is applied across vertical industries, mostly by extending the capabilities described above. Some common examples follow.

In healthcare, AI can analyze X-ray, CT, and MRI images to help flag abnormal regions and even suggest diagnoses. It can spot cellular variations in tissue samples to assist pathologists with cancer screening and other diagnoses, analyze a patient’s genomic data to identify the most suitable treatment plan, and help predict disease progression from medical histories and physiological indicators. In drug development, AI can help simulate chemical interactions, shortening the development cycle. During major public health events, it can analyze epidemiological data to predict how a disease will spread.

In finance, AI can monitor market dynamics in real time, identify potential risks, and formulate hedging strategies. It can assess credit risk by analyzing a borrower’s credit record, income, and spending behavior across many dimensions, and it can recommend investment portfolios suited to an investor’s financial situation, risk appetite, and return objectives.

Similar examples are innumerable. In industrial manufacturing, education and tourism, retail, agriculture and forestry, public safety, government governance, and nearly every other field, AI already has practical applications and cases. It is changing society and transforming how each of us works and lives.

█ How Should We View AI?

The commercial and social value of AI is undeniable, and its rise is unstoppable. For enterprises, AI can automate repetitive, tedious tasks, improving production efficiency and quality while cutting production and labor costs; for manufacturing and service industries this bears directly on competitiveness, even survival. For governments, AI can improve governance efficiency and spawn new business models, products, and services that stimulate the economy. Powerful AI is also a form of national competitiveness: in technological rivalry and defense, falling behind in AI could have severe consequences. For individuals, AI can take over certain tasks and improve quality of life. For humanity as a whole, AI can play a vital role in treating disease, predicting disasters, forecasting the climate, and fighting poverty.
However, everything has two sides, and AI as a tool has drawbacks as well as advantages. The most immediate is the threat to jobs: according to McKinsey’s research, roughly half of today’s work activities could gradually be automated by AI between 2030 and 2060, with knowledge workers especially affected.
[Cartoon omitted. Image source: The New Yorker]
In addition, AI can be misused: to wage war, to commit fraud (voice imitation and deepfake scams), and to infringe on citizens’ rights (excessive data collection and invasion of privacy). If only a few companies possess advanced AI technology, social inequality may worsen, and bias in AI algorithms can produce unfair outcomes. As AI grows more capable, people may become dependent on it and lose the ability to think independently and solve problems, while AI’s creative power could sap human motivation and confidence in our own creativity. Around AI development there are further issues of safety (data breaches, system failures) and ethics.
We currently do not have reliable solutions to any of these issues. Therefore, we can only explore, reflect, and solve them gradually as we develop AI. A sense of vigilance and precaution regarding AI is essential. For ordinary people, the most practical approach is to first understand and learn about it. Start by learning to use common AI tools and platforms to enhance work efficiency and improve quality of life. As the saying goes: “In the future, it is not AI that will eliminate you, but those who master AI.” Instead of being anxious, it is better to face it boldly and embrace it actively, taking control as soon as possible. Well, that concludes today’s article. For an ordinary person, knowing these AI basics is the first step to embracing AI. At least when chatting with others about AI, you won’t be completely lost. Thank you all for your patient reading, and see you next time!
【Image source】 Fresh Date Classroom
– END –
Image editor: Feng Yiqi
Editor in charge: Jiang Keping