Artificial intelligence is one of the most discussed topics today, and the rapid development of computer and internet technologies has pushed AI research to new heights. Artificial intelligence is an emerging technological science that studies the theories, methods, and applications of simulating and extending human intelligence. Machine learning, one of its core research areas, aims to give computer systems human-like learning capabilities as a path toward artificial intelligence.
So, what is machine learning?
Machine learning is the discipline of positing a hypothesis (a model) for a problem under study, using computers to learn the model's parameters from training data, and finally using the learned model to predict and analyze new data.
Uses of Machine Learning
Machine learning is a general-purpose data processing technology encompassing a large number of learning algorithms. Different algorithms exhibit different performance and strengths across industries and applications. Machine learning has already been applied successfully in the following fields:
Internet Field – Speech recognition, search engines, machine translation, spam filtering, natural language processing, etc.
Biological Field – Gene sequence analysis, DNA sequence prediction, protein structure prediction, etc.
Automation Field – Facial recognition, autonomous driving technology, image processing, signal processing, etc.
Financial Field – Securities market analysis, credit card fraud detection, etc.
Medical Field – Disease identification/diagnosis, epidemic outbreak prediction, etc.
Criminal Investigation Field – Potential crime identification and prediction, simulating artificial intelligence detectives, etc.
News Field – News recommendation systems, etc.
Gaming Field – Game strategy planning, etc.
As the applications above show, machine learning has become an analytical tool used routinely across industries, especially amid today's explosion of data in every field. Every industry hopes to extract valuable information from its data through processing and analysis, so as to clarify customer needs and guide enterprise development.
A Brief History of Machine Learning Development
It can be said that machine learning is a relatively young research branch of artificial intelligence, and its development can broadly be divided into four stages.
Stage 1: From the mid-1950s to the mid-1960s, a period of great enthusiasm for machine learning. Research in this stage focused on "learning without knowledge," that is, learning from a blank slate, and was aimed at various self-organizing and adaptive systems.
Stage 2: From the mid-1960s to the mid-1970s, a quieter period for machine learning. The research goal was to simulate the human process of concept learning, using logical or graph structures as the machine's internal representation. Machines could describe concepts with symbols and propose various hypotheses about the concepts being learned.
Stage 3: From the mid-1970s to the mid-1980s, known as the renaissance of machine learning. Researchers expanded from learning single concepts to learning multiple concepts and explored different learning strategies and methods. Learning systems began to be integrated with real applications, and the resulting successes greatly advanced the field.
Stage 4: From 1986 onward. On one hand, the resurgence of neural network research caused connectionist learning methods to flourish, bringing a new worldwide wave of machine learning research and strengthening the field's basic theory and overall systems. On the other hand, experimental and applied research received unprecedented attention, and the rapid development of artificial intelligence and computer technology provided machine learning with new and more powerful research tools and environments.
With the development of computer and internet technologies, machine learning has advanced rapidly on the back of modern hardware. Since 2010 in particular, international IT giants such as Google and Microsoft have accelerated their machine learning research and realized significant commercial value, and Chinese companies such as Alibaba, Baidu, and Tencent have followed suit with increased research efforts. Machine learning has already produced remarkable results, such as AlphaGo defeating the world Go champion and Microsoft's AI matching or surpassing human performance on certain language understanding benchmarks, marking the technology's entry into a mature application stage.
The Basic Process of Machine Learning Applications
When developing applications using machine learning, the following steps are typically followed.
Step 1: Establish the Mathematical Model
Modeling is not easy. It begins with collecting data by various methods and means; once obtained, the data must be entered and preprocessed into an appropriate format and saved as data files for later use.
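As a simple illustration of this step, here is a minimal MATLAB sketch; the file name train.csv and the layout (one sample per row, label in the last column) are assumptions for illustration, not conventions from the book:

```matlab
% Minimal sketch: load a data file and standardize the features.
% Assumes 'train.csv' holds one sample per row, label in the last column.
data = readmatrix('train.csv');   % requires MATLAB R2019a or later
X = data(:, 1:end-1);             % feature matrix
y = data(:, end);                 % label / target vector

mu = mean(X);                     % per-feature means
sigma = std(X);                   % per-feature standard deviations
Xn = (X - mu) ./ sigma;           % z-score standardization (implicit expansion)
```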
Step 2: Select an Algorithm
There are a great many machine learning algorithms, several of which may solve the same problem, and new ones continually emerge. Since different algorithms can differ in effectiveness and efficiency on a given problem, choosing an appropriate algorithm is particularly important.
Step 3: Establish the Optimization Objective
It can be said that every machine learning problem ultimately reduces to an optimization problem, such as minimizing the mean squared error or maximizing the likelihood function.
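In standard notation (introduced here for illustration), with training samples $(x_i, y_i)$, $i = 1, \dots, m$, and a model $f(x;\theta)$ with parameters $\theta$, these two objectives read:

$$\min_{\theta}\ \frac{1}{m}\sum_{i=1}^{m}\bigl(y_i - f(x_i;\theta)\bigr)^2 \qquad\text{and}\qquad \max_{\theta}\ \sum_{i=1}^{m}\log p(y_i \mid x_i;\theta).$$

The first is the mean squared error typically used for regression; the second is the log-likelihood used for probabilistic models.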
Step 4: Learning Iteration
The learning algorithm is applied to the data files produced in Step 1 so that the machine learns on its own and produces a model. In this step, an optimization algorithm (such as gradient descent) is usually employed to iteratively update the model parameters, gradually approaching the optimum of the objective function.
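To make the iteration concrete, here is a minimal MATLAB sketch of batch gradient descent for a linear regression model on synthetic data; the learning rate and iteration count are illustrative choices, not values from the book:

```matlab
% Minimal sketch: batch gradient descent for linear least squares.
rng(0);                                  % reproducible synthetic data
m = 100;                                 % number of training samples
X = [ones(m,1), randn(m,1)];             % design matrix with intercept column
thetaTrue = [2; -3];                     % ground-truth parameters
y = X*thetaTrue + 0.1*randn(m,1);        % noisy linear targets

theta = zeros(2,1);                      % initial parameter guess
alpha = 0.1;                             % learning rate (illustrative)
for iter = 1:500
    grad  = (2/m) * X' * (X*theta - y);  % gradient of the mean squared error
    theta = theta - alpha*grad;          % move against the gradient
end
disp(theta')                             % should be close to thetaTrue'
```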
Step 5: Performance Evaluation
Return to the actual problem and test the algorithm's performance. If the results are unsatisfactory, it may be necessary to return to Step 4, improve the algorithm further, and retest. If the problem lies in data collection and preparation, it may be necessary to return to Step 1 and reconsider how the data are screened and preprocessed.
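One common form of this evaluation is a holdout test: set part of the data aside, fit on the remainder, and measure the error on the unseen part. A minimal MATLAB sketch, reusing the standardized features Xn and labels y assumed in the Step 1 sketch:

```matlab
% Minimal sketch: holdout evaluation with a random 80/20 split.
n = size(Xn, 1);
idx = randperm(n);                        % random permutation of sample indices
nTrain = round(0.8*n);                    % 80/20 split (illustrative ratio)
tr = idx(1:nTrain);                       % training indices
te = idx(nTrain+1:end);                   % test indices

theta = Xn(tr,:) \ y(tr);                 % least-squares fit on the training set
yhat = Xn(te,:) * theta;                  % predictions on the held-out set
testMSE = mean((y(te) - yhat).^2);        % test mean squared error
```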
Step 6: Use the Algorithm
Turn the machine learning algorithm into an application and verify that it performs its task properly in practice.
Realistically, the development of machine learning is primarily the development of its algorithms; a detailed overview of the current mainstream algorithms appears in Section 1.3 of the textbook. Generally speaking, when establishing the optimization objective, probabilistic problems usually maximize the log-likelihood function; regression problems generally take the squared error as the loss function and minimize it; and classification problems often use cross-entropy as the loss function. Whether one maximizes a log-likelihood or minimizes a loss, many optimization methods are available, and we introduce them in subsequent chapters.
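For completeness, the binary cross-entropy loss mentioned above, with predicted probability $\hat{p}_i = p(y_i = 1 \mid x_i;\theta)$, takes the standard form:

$$\min_{\theta}\ -\frac{1}{m}\sum_{i=1}^{m}\Bigl[y_i\log\hat{p}_i + (1-y_i)\log(1-\hat{p}_i)\Bigr].$$

Minimizing it is equivalent to maximizing the Bernoulli log-likelihood, which is why logistic regression (Chapter 3) can be derived from either viewpoint.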
It is worth noting that model evaluation and the objective function may or may not coincide. Model evaluation is oriented toward the actual problem, whereas the objective function must balance the tractability of the optimization against fidelity to the practical problem. For classification, for example, cross-entropy is usually the objective function while model evaluation may use the AUC (Area Under the Curve) metric. AUC could in principle also serve as a training objective, but doing so complicates the optimization algorithm, which is clearly not worth the cost.
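To see why AUC suits evaluation better than training, note that it depends on the classifier's scores only through their ordering, so it is piecewise constant in the model parameters and offers no useful gradient. It can be computed from ranks alone; here is a minimal sketch in base MATLAB (the function name and inputs are hypothetical, and ties among scores are ignored for simplicity):

```matlab
% Minimal sketch: AUC via the rank-sum (Mann-Whitney U) statistic.
% labels: vector of 0s and 1s; scores: classifier scores (higher = more positive).
function auc = simpleAUC(labels, scores)
    [~, order] = sort(scores);           % sort scores ascending
    ranks(order) = 1:numel(scores);      % rank of each score (ties ignored)
    nPos = sum(labels == 1);             % number of positive samples
    nNeg = sum(labels == 0);             % number of negative samples
    % U statistic: sum of positive ranks minus its minimum possible value
    auc = (sum(ranks(labels == 1)) - nPos*(nPos+1)/2) / (nPos*nNeg);
end
```

For production use, the perfcurve function in MATLAB's Statistics and Machine Learning Toolbox computes the full ROC curve and AUC.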
This article is excerpted, with omissions, from "Machine Learning Algorithms (MATLAB Edition)".

Content Summary
This book is an introductory textbook on machine learning, detailing its basic theories and methods. It consists of 15 chapters: an introduction to machine learning, mathematical foundations, linear models and logistic regression, support vector machines, artificial neural networks, decision tree algorithms, Bayesian algorithms, k-nearest neighbor algorithms, data dimensionality reduction algorithms, clustering algorithms, Gaussian mixture models and the EM algorithm, ensemble learning algorithms, maximum entropy algorithms, probabilistic graphical models, and reinforcement learning algorithms. Each algorithm is presented both through the theoretical derivation of its principles and through its MATLAB implementation.
The book maintains rigorous theoretical analysis while emphasizing the practicality of machine learning algorithms, highlighting the implementation of the ideas and principles of machine learning algorithms on computers. The content is appropriately selected, systematic, and written in a clear and fluent manner, making it highly readable.
All major algorithms in each chapter come with MATLAB programs and corresponding worked examples. To better support teaching, the author has prepared electronic courseware (PPT slides in PDF format) and the MATLAB programs for the major algorithms, which interested readers can obtain by scanning the QR codes at the end of each chapter and at the end of the book.
Target Audience
The recommended teaching time for this book is 60 class hours (including 12 hours of hands-on experiments). It can serve as a textbook or reference for undergraduates majoring in computer science and technology, information and computing science, statistics, and mathematics and applied mathematics, as well as for graduate students in science and engineering taking machine learning courses.
Table of Contents
Chapter 1 Introduction to Machine Learning
1.1 Basic definitions of machine learning
1.2 Basic terminology of machine learning
1.3 Classification of machine learning algorithms
1.3.1 Supervised learning and unsupervised learning
1.3.2 Classification problems and regression problems
1.3.3 Generative models and discriminative models
1.3.4 Reinforcement learning
1.4 Evaluation metrics for machine learning models
1.4.1 Generalization ability of models
1.4.2 Evaluation methods for models
1.4.3 Precision and recall
1.4.4 ROC curve and AUC
1.5 Selection of machine learning models
1.5.1 Regularization techniques
1.5.2 Bias-variance decomposition
1.6 Basic process of machine learning applications
1.7 Uses and development history of machine learning
Chapter 2 Mathematical Foundations
2.1 Matrices and Differentiation
2.1.1 Basic operations of matrices
2.1.2 Derivatives of matrices with respect to scalars
2.1.3 Derivatives of functions of matrix variables
2.1.4 Derivatives of vector functions
2.1.5 Differentiation of matrices and vectors
2.1.6 Eigenvalue decomposition and singular value decomposition
2.2 Optimization Methods
2.2.1 Unconstrained optimization methods
2.2.2 Constrained optimization and KKT conditions
2.2.3 Quadratic programming
2.2.4 Semidefinite programming
2.3 Probability and Statistics
2.3.1 Random variables and probabilities
2.3.2 Conditional probability and independence
2.3.3 Expectation, Markov’s inequality, and moment generating functions
2.3.4 Variance and Chebyshev’s inequality
2.3.5 Sample means and sample variances
2.3.6 Maximum likelihood estimation
2.3.7 Entropy and KL divergence
Chapter 3 Linear Models and Logistic Regression
3.1 Basic forms of linear models
3.1.1 Theoretical basis of linear regression models
3.1.2 MATLAB implementation of linear regression models
3.2 Logistic regression models
3.2.1 Basic principles of logistic regression
3.2.2 MATLAB implementation of logistic regression
3.3 Linear discriminant analysis
3.3.1 Basic principles of linear discriminant analysis
3.3.2 MATLAB implementation of linear discriminant analysis
Chapter 4 Support Vector Machines
4.1 Algorithm principles of support vector machines
4.1.1 Linearly separable problems
4.1.2 Non-linearly separable problems
4.2 Kernel mapping (kernel function) support vector machines
4.3 Principles and derivation of the SMO algorithm
4.3.1 Solving the subproblem
4.3.2 Selection of optimization variables
4.4 Support vector regression models
4.5 MATLAB implementation of support vector machines
Chapter 5 Artificial Neural Networks
5.1 Introduction to feedforward neural networks
5.1.1 M-P neurons
5.1.2 Perceptron model
5.1.3 Multilayer feedforward networks
5.2 Backpropagation algorithm
5.2.1 An example of a single hidden layer network
5.2.2 Backpropagation (BP) algorithm
5.3 Mathematical properties and implementation details of neural networks
5.3.1 Mathematical properties of neural networks
5.3.2 Global minima and local minima
5.3.3 Challenges and implementation details
5.4 MATLAB implementation of neural networks
Chapter 6 Decision Tree Algorithms
6.1 Basic principles of decision tree algorithms
6.1.1 Decision-making process of tree models
6.1.2 Basic framework of decision tree algorithms
6.1.3 Pruning of decision trees
6.2 Improvements to basic decision tree algorithms
6.2.1 Information gain and ID3 decision trees
6.2.2 Gain ratio and C4.5 decision trees
6.2.3 Gini index and CART decision trees
6.3 Handling continuous values and missing attributes
6.3.1 Processing of continuous values
6.3.2 Issues with missing attributes
6.4 MATLAB implementation of decision tree algorithms
Chapter 7 Bayesian Algorithms
7.1 Principles of Bayesian algorithms
7.1.1 Bayesian decision-making
7.1.2 Naive Bayesian algorithm
7.1.3 Normal Bayesian algorithm
7.2 Improvements to Bayesian algorithms
7.2.1 Semi-naive Bayesian algorithm
7.2.2 TAN Bayesian algorithm
7.2.3 Bayesian networks and naive Bayesian trees
7.2.4 Weighted naive Bayesian algorithm
7.3 MATLAB implementation of Bayesian algorithms
Chapter 8 k-Nearest Neighbor Algorithm
8.1 Principles of k-nearest neighbor algorithms
8.1.1 Process of k-nearest neighbor algorithms
8.1.2 Distance functions for k-nearest neighbors
8.2 Overview of k-nearest neighbor improvement algorithms
8.3 MATLAB implementation of k-nearest neighbor algorithms
Chapter 9 Data Dimensionality Reduction Algorithms
9.1 Principal Component Analysis
9.1.1 Basic principles of principal component analysis
9.1.2 Kernel principal component analysis algorithm
9.1.3 MATLAB implementation of PCA algorithm
9.1.4 Fast PCA algorithm and its implementation
9.2 Manifold learning algorithms
9.2.1 Local Linear Embedding and its MATLAB implementation
9.2.2 Isomap and MDS algorithms and their implementation
Chapter 10 Clustering Algorithms
10.1 Basic theory of clustering
10.1.1 Problem definition
10.1.2 Distance calculations
10.1.3 Performance metrics
10.2 k-Means Algorithm
10.2.1 Basic principles of k-means algorithm
10.2.2 MATLAB implementation of k-means algorithm
10.3 k-Center Algorithm
10.3.1 Basic principles of k-center algorithm
10.3.2 MATLAB implementation of k-center algorithm
10.4 Density-Based Clustering Algorithms
10.4.1 Basic principles of DBSCAN algorithm
10.4.2 MATLAB implementation of DBSCAN algorithm
10.5 Hierarchical Clustering Algorithms
10.5.1 Basic principles of AGNES algorithm
10.5.2 MATLAB implementation of AGNES algorithm
Chapter 11 Gaussian Mixture Models and EM Algorithm
11.1 Gaussian mixture models
11.2 Theoretical derivation of EM algorithm
11.3 Applications of EM algorithm
11.4 MATLAB implementation of GMM
11.4.1 Generation of Gaussian mixture models
11.4.2 Parameter fitting of GM models
11.4.3 Gaussian mixture clustering examples
11.5 Improved methods for Gaussian mixture clustering
11.5.1 Issues with setting initial values for fitting
11.5.2 Issues with selecting the number of clusters k
11.5.3 Regularization of Gaussian mixture clustering
Chapter 12 Ensemble Learning Algorithms
12.1 Overview of ensemble learning
12.1.1 Basic concepts of ensemble learning
12.1.2 Parallel generation of ensemble models
12.1.3 Serial generation of ensemble models
12.1.4 Combination strategies for ensemble models
12.2 Bagging and Random Forests
12.2.1 Bagging algorithm
12.2.2 Random Forest algorithm
12.2.3 MATLAB implementation of Random Forests
12.3 Boosting Algorithms
12.3.1 Basic principles of AdaBoost algorithm
12.3.2 MATLAB implementation of AdaBoost
Chapter 13 Maximum Entropy Algorithm
13.1 Origins and related definitions of entropy
13.2 Definition of maximum entropy model
13.2.1 Maximum entropy principle
13.2.2 Maximum entropy model
13.3 Learning algorithms for maximum entropy models
13.3.1 Basic principles of maximum entropy algorithms
13.3.2 Maximum likelihood estimation of maximum entropy models
13.4 Optimization methods for learning model parameters
13.4.1 Gradient descent and quasi-Newton methods
13.4.2 Improved iterative scaling methods
13.5 MATLAB implementation of maximum entropy models
Chapter 14 Probabilistic Graphical Algorithms
14.1 Hidden Markov Models
14.1.1 Definition of Hidden Markov Models
14.1.2 Calculation methods for observation sequence probabilities
14.1.3 Baum-Welch algorithm for calculating model parameters
14.1.4 Viterbi algorithm for predicting hidden state sequences
14.2 Conditional Random Field Models
14.2.1 Definition of Conditional Random Fields
14.2.2 Linear chain conditional random fields
14.2.3 Probability calculations for linear chain conditional random fields
14.2.4 Learning algorithms for linear chain conditional random fields
14.2.5 Decoding algorithms for linear chain conditional random fields
Chapter 15 Reinforcement Learning Algorithms
15.1 Model foundations of reinforcement learning
15.1.1 Basic features of reinforcement learning
15.1.2 Modeling of reinforcement learning
15.2 Theoretical models of reinforcement learning
15.2.1 Exploration and exploitation
15.2.2 ε-greedy algorithm
15.2.3 Softmax algorithm
15.3 Markov Decision Processes
15.4 Dynamic programming algorithms for solving optimal policies
15.4.1 Policy iteration algorithms
15.4.2 Value iteration algorithms
15.5 Monte Carlo methods for solving optimal policies
15.5.1 Definition of model-free reinforcement learning
15.5.2 Monte Carlo methods for solving prediction problems
15.5.3 Monte Carlo methods for solving control problems
15.6 Temporal Difference methods for solving optimal policies
15.6.1 Basic principles of temporal difference methods
15.6.2 SARSA algorithm and MATLAB implementation
15.6.3 Q-learning algorithm and MATLAB implementation
References