Source | Asynchronous | Book Giveaway at the End
Where to Start Learning Artificial Intelligence?
Artificial intelligence has become a fundamental discipline of this era, and the emergence of ChatGPT has pushed AI technology further toward general-purpose use. The question is no longer whether to learn artificial intelligence, but how to learn it well.
As an interdisciplinary field, artificial intelligence encompasses knowledge from computer science, statistics, mathematics, and more. This can often confuse beginners, as there seems to be so much to grasp. Where should one start? The book Artificial Intelligence (3rd Edition), rated 9.4 on Douban, provides a comprehensive explanation.
Learning artificial intelligence can be divided into three levels:
· The first level is to understand the necessary mathematics and algorithm knowledge;
· The second level is to master the principles of machine learning, artificial neural networks, and deep learning, which are the core areas of modern artificial intelligence;
· The third level is to understand specific applications of artificial intelligence, such as natural language processing, computer vision, robotics, etc.
For serious learners, mastering a programming language is also essential, and Python is the first choice. It is easy to learn and use, and third-party libraries such as NumPy, Pandas, SciPy, and TensorFlow provide rich tools for data processing, scientific computing, and machine learning, which has made Python the most popular programming language in the field of artificial intelligence.
We will start learning with mathematics and algorithm knowledge.
Building a Foundation: Mathematics and Algorithm Knowledge
Mathematical Knowledge plays a foundational and supporting role in the field of artificial intelligence, providing mathematical tools and methods for analysis, modeling, and problem-solving. Therefore, the first step is to be familiar with those foundational mathematical concepts.
Probability Theory and Statistics: probability theory is used to describe uncertainty and randomness, while statistics is used to infer models and parameters from data.
In machine learning and data analysis, probability theory and statistics are widely used for modeling, prediction, classification, and other tasks. Understanding the basic concepts and methods of probability theory and statistics can help us understand and apply various machine learning algorithms.
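To make the division of labor concrete, here is a minimal sketch using only the Python standard library. The parameter values (`true_mean`, `true_std`) and the sample size are assumptions chosen for illustration: a probabilistic model generates random data, and statistics then recovers the parameters from that data alone.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Probability side: a model with known parameters generates random data.
true_mean, true_std = 5.0, 2.0  # illustrative values, not from any real dataset
samples = [random.gauss(true_mean, true_std) for _ in range(10_000)]

# Statistics side: infer the parameters back from the data alone.
estimated_mean = statistics.fmean(samples)
estimated_std = statistics.stdev(samples)

print(f"estimated mean = {estimated_mean:.2f}, estimated std = {estimated_std:.2f}")
```

With enough samples the estimates land close to the true values, which is the same idea a machine learning algorithm relies on when it fits model parameters to training data.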
Linear Algebra involves concepts such as matrices, vectors, and linear equations, providing mathematical tools for handling high-dimensional data and spatial transformations. Many algorithms and models in artificial intelligence utilize matrix operations and linear transformations.
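As a small illustration of a linear transformation, the sketch below rotates a 2D point by multiplying it with a rotation matrix. The helper names `mat_vec` and `rotation` are our own, written in plain Python for clarity; in practice a library such as NumPy would handle this.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rotation(theta):
    """Return the 2x2 matrix that rotates points counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

point = [1.0, 0.0]
rotated = mat_vec(rotation(math.pi / 2), point)  # quarter turn
print([round(x, 6) for x in rotated])
```

The point on the x-axis ends up on the y-axis, up to floating-point rounding. Neural network layers apply the same kind of matrix-vector product, only with many more dimensions.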
Calculus is the mathematical discipline that studies changes and limits. In artificial intelligence, calculus is used to describe and optimize functions, such as minimizing loss functions in machine learning to optimize models. Additionally, calculus involves important concepts such as gradient calculation and optimization algorithms, which are crucial for understanding and implementing machine learning algorithms.
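The link between gradients and optimization can be sketched with gradient descent on a toy loss. The loss function `(w - 3)^2`, the learning rate, and the step count are assumptions picked for illustration, not values from any particular model.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

def gradient_descent(w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

w_final = gradient_descent(w0=0.0)
print(w_final, loss(w_final))
```

Each step moves `w` a little in the direction that lowers the loss, so `w_final` ends up very close to 3. Training a machine learning model follows the same pattern, with the gradient computed over many parameters at once.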
In addition, discrete mathematics, graph theory, and information theory are also closely related to artificial intelligence and are worth further study. Some foundational algorithms also need to be mastered; two important search algorithms are introduced below.
Depth-First Search (DFS) is an algorithm that follows one path as deep as it can go and, when it can no longer continue, backtracks to the most recent branching point to try the next unexplored path.
DFS is typically used for graph traversal and state space search. Because it mainly needs to keep track of the path currently being explored, it can search within limited memory space. Understanding depth-first search helps us solve graph-related and state space search problems.
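A minimal sketch of depth-first search follows, assuming the graph is stored as a dictionary that maps each node to a list of its neighbors. The small example graph and the function name `dfs` are hypothetical.

```python
def dfs(graph, start):
    """Return the nodes in the order depth-first search first visits them."""
    visited = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()  # take the most recently added node: go deep first
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbors in reverse so the first-listed neighbor is explored first.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in seen:
                stack.append(neighbor)
    return visited

graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}
order = dfs(graph, "A")
print(order)  # ['A', 'B', 'D', 'C']
```

The search follows A, B, D to the end of that branch before it backtracks and visits C. The explicit stack plays the role that the call stack would play in a recursive version.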
Breadth-First Search (BFS) is an algorithm that finds solutions by expanding the search layer by layer.
BFS starts from the initial node, first visiting its neighboring nodes, then visiting the neighbors of those neighbors, and so on, until it reaches the target node or has traversed the entire graph. BFS is typically used for finding shortest paths in unweighted graphs and for checking graph connectivity.
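The layer-by-layer expansion can be sketched as follows, again assuming a dictionary-based graph. The example graph and the function name `bfs_shortest_path` are hypothetical, and "shortest" here means fewest edges, since the edges carry no weights.

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None if unreachable."""
    parents = {start: None}  # also serves as the set of discovered nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()  # take the oldest node: finish one layer before the next
        if node == goal:
            path = []
            while node is not None:  # walk back through the parents to rebuild the path
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": ["E"],
    "E": [],
}
path = bfs_shortest_path(graph, "A", "E")
print(path)  # ['A', 'C', 'E']
```

There are two routes from A to E in this graph, and BFS returns the two-edge route through C rather than the three-edge route through B and D, because it reaches E in an earlier layer.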
With a foundation of mathematics and algorithm knowledge, we can next explore the core of artificial intelligence.
The Core of Modern Artificial Intelligence: Machine Learning and Artificial Neural Networks
Since DeepMind’s AlphaGo defeated top professional Go player Lee Sedol in 2016, deep learning has become a hot topic in artificial intelligence research. However, for beginners, encountering terms like machine learning, artificial neural networks, neurons, and models can easily lead to confusion.
In fact, deep learning is a branch of machine learning, so to clarify these concepts, we will start with machine learning.
As early as the 1950s, “machine learning” was proposed alongside “artificial intelligence.” However, due to the limitations of computing power and data volume at that time, machine learning was not taken seriously. It wasn’t until the internet era, when Google held advantages in both computing power and data, that breakthroughs in machine learning technology began.
The principle of machine learning is to learn patterns from data through algorithms and models, and to make predictions or decisions based on the learned patterns. Its core concepts include supervised learning, unsupervised learning, and reinforcement learning; common algorithms and models include decision trees, support vector machines, and neural networks.
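As one small instance of supervised learning, the sketch below fits a straight line to labeled examples by least squares and then predicts an unseen input. The training data is made up for illustration (it follows `y = 2x + 1` exactly), and `fit_line` is our own helper rather than a library function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training data: each input x comes with its known output y.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)      # learn the pattern from data
prediction = slope * 5.0 + intercept     # apply it to an input not seen in training
print(slope, intercept, prediction)
```

Learning recovers the slope and intercept from the examples, and prediction reuses them on new input. Decision trees, support vector machines, and neural networks follow the same learn-then-predict pattern with richer models.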
Artificial Neural Networks
Artificial neural networks are computational models inspired by biological neural systems, designed to simulate the information transmission and processing between neurons in the human brain. They consist of a large number of artificial neurons (nodes) that are interconnected through connection weights, with structures such as input layers, hidden layers, and output layers.
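The structure described above can be sketched as a forward pass through a tiny network with two inputs, two hidden neurons, and one output neuron. All weights and biases here are arbitrary illustrative numbers, and the sigmoid activation is one common choice among several.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Each neuron computes a weighted sum of its inputs plus a bias, then an activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Connection weights and biases (hypothetical values, normally learned from data).
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]  # two hidden neurons, two inputs each
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -1.0]]              # one output neuron, two hidden inputs
output_biases = [0.0]

inputs = [1.0, 1.0]                                      # input layer
hidden = layer(inputs, hidden_weights, hidden_biases)    # hidden layer
output = layer(hidden, output_weights, output_biases)    # output layer
print(hidden, output)
```

Training a network means adjusting these weights and biases so that the output matches the desired result, typically with gradient-based optimization.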
Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) are two widely used types of artificial neural networks.
CNN is suitable for processing grid-structured data, such as images, to capture local features. RNN is suitable for processing sequential data, such as text and speech, to model temporal information.
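To show what "capturing local features" means, here is a minimal sketch of the sliding-window operation at the heart of a convolutional layer, written in plain Python. The tiny image and the hand-picked edge-detecting kernel are assumptions for illustration; in a real CNN the kernel values are learned.

```python
def conv2d(image, kernel):
    """2D sliding-window product without padding (the operation CNN layers apply)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A 3x4 image that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A kernel that responds where brightness increases from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output is nonzero only where the window covers the dark-to-bright boundary, so the same small kernel detects that local pattern wherever it appears in the image.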
Deep learning uses artificial neural networks with multiple hidden layers to learn representations of data, with CNNs and RNNs being commonly used neural network structures. The key to deep learning is to extract and transform data features through multiple layers of nonlinear transformations, enabling the learning and recognition of complex patterns and high-dimensional data.
Deep learning has achieved many groundbreaking results in image recognition, speech recognition, and natural language processing, and is widely applied across various fields.
By mastering the concepts and connections of machine learning, artificial neural networks, and deep learning, we can further explore the successful applications of artificial intelligence.
The Revolution of Human-Computer Interaction: Natural Language Processing
Natural Language Processing (NLP) aims to enable computers to understand and process natural language like humans, including speech and text. This involves technologies and tasks such as text analysis, speech recognition, semantic understanding, and language generation.
ChatGPT is a typical application of NLP, based on the GPT (Generative Pre-trained Transformer) model, which itself is built on the Transformer architecture proposed by Google, a framework that uses attention mechanisms to process sequential data.
At the algorithmic level, the GPT model uses the decoder part of the Transformer rather than the full encoder-decoder structure. The core of this model is the multi-head self-attention mechanism, which allows the model to attend to different positions in the sequence in parallel while processing the input, thereby better capturing long-distance dependencies in the text.
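The attention computation can be sketched for a single head as follows. This is a simplified illustration, not GPT's actual implementation: the three 2-dimensional token vectors are made up, the same vectors are reused as queries, keys, and values (a real model derives them through learned projections), and the causal restriction, where each position attends only to itself and earlier positions, reflects how decoder-style language models are commonly described.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention; position i sees positions 0..i only."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Similarity of this query to each allowed key, scaled by sqrt(d_k).
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the allowed value vectors.
        out = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3-token sequence
outputs, weights = causal_self_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

A multi-head layer runs several such heads side by side, each with its own projections, and combines their outputs, which is what lets the model look at different positions for different reasons at the same time.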
The GPT model also includes components such as positional encoding and feed-forward neural networks, which together form the basic structure of the GPT model. During the training process, the GPT model is pre-trained on a large corpus of text, learning the statistical patterns and semantic information within the text data.
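For intuition about positional encoding, the sketch below computes the fixed sinusoidal encoding used in the original Transformer design. This is an assumption made for illustration: GPT models are generally described as learning their position embeddings rather than using this fixed formula, but the purpose is the same, namely giving each position a distinct vector that is added to the token representation. The function name and the sizes are hypothetical.

```python
import math

def positional_encoding(num_positions, d_model):
    """Sinusoidal table: even dimensions use sine, odd dimensions use cosine."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one wavelength.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

table = positional_encoding(num_positions=3, d_model=4)
for row in table:
    print([round(v, 3) for v in row])
```

Because every position produces a different row, the model can tell identical tokens apart by where they occur, which attention alone cannot do.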
During the inference phase, the GPT model can generate coherent and semantically meaningful text outputs, making it highly suitable for dialogue systems, text generation, and other natural language processing tasks. ChatGPT has been specifically fine-tuned and optimized for dialogue generation tasks, performing excellently in conversational interaction scenarios.
With a solid foundation in mathematics and a grasp of the core concepts of artificial intelligence, one can quickly understand the underlying principles of GPT technology. If you want to delve deeper and explore more areas of artificial intelligence, open Artificial Intelligence (3rd Edition), which covers the foundational knowledge of the field.

All the Artificial Intelligence Knowledge You Need to Know Is Here
The book Artificial Intelligence (3rd Edition) is a classic work in this field, hailed as the “Encyclopedia of Artificial Intelligence”. This should be the first book every beginner reads to enter the world of artificial intelligence.
Let’s take a look at the content covered in this book: the history of artificial intelligence, debates on thinking and intelligence, the Turing Test, search, games, knowledge representation, production systems, expert systems, machine learning, deep learning, natural language processing, automated planning, genetic algorithms, fuzzy control, security, etc. It also introduces some new technologies and applications, such as robotics and advanced computer games.
This book provides detailed introductions from foundational knowledge to domain applications, and unlike other technical books, it also includes numerous introductions to important researchers in the history of artificial intelligence development.
Because the author team of this book believes that artificial intelligence should be human-centered, they propose: “Artificial intelligence consists of objects such as people, ideas, methods, machines, and outcomes.”
The three authors of this book are all senior scholars in artificial intelligence who have spent years on in-depth research in the field and have trained numerous artificial intelligence professionals for academia and industry.
One author holds a PhD from the Graduate Center of the City University of New York and teaches computer science at the City College of New York, having published several articles in high-performance computing and artificial intelligence.
Another holds a PhD and is a professor at Prairie View A&M University, and has authored several books including Computational Nanophotonics (CRC Press) and Finite Element Analysis (MLI).
The third was a co-author of the 2nd edition of this book, taught at Brooklyn College, authored several books, and was also an international chess master.
The translation team for this book is also strong, led by Dr. Wang Bin, director of Xiaomi’s AI Lab and chief scientist in natural language processing, who, along with Dr. Wang Shuxin and Dr. Wang Pengming, has contributed to this outstanding translation.
To summarize the features of this book:
· Uses clear language and diagrams to help readers easily understand complex concepts and algorithms;
· Practical, containing rich case studies covering a variety of application scenarios such as computer game playing, medical diagnosis, etc.;
· Readable and engaging, introducing many outstanding figures in the field of artificial intelligence, recounting their experiences and contributions;
· Rich supporting resources, with numerous exercises to help readers review knowledge and solidify foundations.
Open Artificial Intelligence (3rd Edition) to unlock a broader world of artificial intelligence!

Join the discussion in the comment section, click "View", and share this activity to your Moments. We will select one reader to receive the e-book edition; the deadline is December 15.