What Is Machine Learning?

Artificial intelligence is one of the hottest topics today, and the rapid development of computer and internet technologies has pushed its research to new heights. Artificial intelligence is an emerging technological science that studies theories, methods, and applications for simulating and extending human intelligence. Machine learning, one of its core research areas, aims to give computer systems human-like learning capabilities as a path to artificial intelligence.

So, what is machine learning?

Machine learning is the discipline of positing a hypothesis model for the problem under study, using a computer to learn the model's parameters from training data, and ultimately using the learned model to predict and analyze new data.

Uses of Machine Learning

Machine learning is a general data processing technology that encompasses a large number of learning algorithms. Different learning algorithms can exhibit varying performances and advantages across different industries and applications. Currently, machine learning has been successfully applied in the following fields:

Internet Field – Speech recognition, search engines, machine translation, spam filtering, natural language processing, etc.

Biological Field – Gene sequence analysis, DNA sequence prediction, protein structure prediction, etc.

Automation Field – Facial recognition, autonomous driving technology, image processing, signal processing, etc.

Financial Field – Securities market analysis, credit card fraud detection, etc.

Medical Field – Disease identification/diagnosis, epidemic outbreak prediction, etc.

Criminal Investigation Field – Identifying and predicting potential crimes, simulating detective reasoning with AI, etc.

News Field – News recommendation systems, etc.

Gaming Field – Game strategy planning, etc.

As the applications listed above show, machine learning is becoming an analytical tool used routinely across industries, especially now that every field faces an explosion of data. Organizations in all sectors hope to extract valuable information from their data through processing and analysis, in order to clarify customer needs and guide business development.

A Brief History of Machine Learning Development

Machine learning is a relatively young research branch of artificial intelligence, and its development can be broadly divided into four stages.

Stage 1 (mid-1950s to mid-1960s). This was a fervent period for machine learning. Research focused on "learning without knowledge," that is, learning without any built-in prior knowledge, and aimed at various self-organizing and adaptive systems.

Stage 2 (mid-1960s to mid-1970s). This was a calm period for machine learning. The research goal was to simulate the human process of concept learning, using logical or graph structures as the machine's internal description; machines could describe concepts with symbols and propose various hypotheses about the concepts being learned.

Stage 3 (mid-1970s to mid-1980s). This period is known as the renaissance of machine learning. Researchers expanded from learning a single concept to learning multiple concepts and explored a variety of learning strategies and methods. Learning systems began to be integrated with real applications, and the resulting successes greatly promoted the development of machine learning.

Stage 4 (1986 to the present). On one hand, the resurgence of neural network research made the study of connectionist learning methods flourish; machine learning research reached a new high worldwide, and its basic theory and comprehensive systems were strengthened and developed. On the other hand, experimental and applied research received unprecedented attention, while the rapid development of artificial intelligence and computer technology provided machine learning with new and more powerful research tools and environments.

Driven by advances in computer hardware and internet technology, machine learning has progressed rapidly. Since about 2010 in particular, international IT giants such as Google and Microsoft have accelerated their machine learning research and realized significant commercial value, and Chinese companies such as Alibaba, Baidu, and Tencent have followed suit with increased research investment. Machine learning has since achieved some remarkable results, such as AlphaGo defeating the world Go champion and Microsoft AI systems surpassing human performance on language-understanding benchmarks, marking the technology's entry into a mature application stage.

The Basic Process of Machine Learning Applications

When developing an application with machine learning, one typically proceeds through the following steps.

Step 1: Establish a Mathematical Model

Modeling is not easy: data must first be collected by various methods and means, and once obtained, it must be entered, preprocessed, and stored in an appropriate format as data files for later use.
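
As a minimal illustration of the data preparation this step involves (our sketch, not code from the book; the file name and column layout are assumptions), the following MATLAB fragment reads a data file and applies z-score normalization so that features measured on different scales become comparable:

    % Read a hypothetical data file: one sample per row, label in the last column.
    data = readmatrix('training_data.csv');  % assumed file name and layout
    X = data(:, 1:end-1);                    % feature matrix
    y = data(:, end);                        % label/target vector

    % Z-score normalization: zero mean and unit variance for each feature.
    mu    = mean(X, 1);
    sigma = std(X, 0, 1);
    Xn    = (X - mu) ./ sigma;               % implicit expansion (R2016b and later)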

Step 2: Select an Algorithm

There are many machine learning algorithms, several of which can often solve the same problem, and new algorithms continually emerge. Because different algorithms can differ in effectiveness and efficiency on a specific problem, choosing an appropriate machine learning algorithm is particularly important.

Step 3: Establish the Optimization Objective

Virtually every machine learning problem ultimately reduces to an optimization problem, such as minimizing a mean squared error or maximizing a likelihood function.
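
Written out concretely (notation ours, not the book's): for a model f(x; w) with training pairs (x_i, y_i), i = 1, ..., n, these two objectives take the forms

    \min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i; w) \bigr)^2
    \qquad \text{and} \qquad
    \max_{\theta} \; \sum_{i=1}^{n} \log p(y_i \mid x_i; \theta),

the first being the mean squared error to be minimized and the second the log-likelihood to be maximized.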

Step 4: Learning Iteration

The learning algorithm reads the data files generated in Step 1 and learns from them to produce a model. In this step, an optimization algorithm (such as gradient descent) is usually employed to update the model parameters iteratively, gradually approaching the optimum of the objective function.
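
As a minimal sketch of this step (our illustration, not code from the book; it assumes the mean-squared-error objective above, a linear model y ≈ Xw, and made-up data, learning rate, and iteration count), gradient descent repeatedly moves the parameters against the gradient of the objective:

    % Gradient descent for linear least squares: minimize (1/n)*||X*w - y||^2.
    rng(0);                              % reproducible, made-up demo data
    n = 100;  d = 3;
    X = randn(n, d);                     % n-by-d feature matrix
    y = X * [1; -2; 0.5] + 0.1*randn(n, 1);

    w     = zeros(d, 1);                 % initial parameters
    alpha = 0.1;                         % learning rate; needs tuning in practice
    for iter = 1:500
        grad = (2/n) * X' * (X*w - y);   % gradient of the mean squared error
        w    = w - alpha * grad;         % move against the gradient
    end
    disp(w')                             % should approach [1 -2 0.5]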

Step 5: Effect Evaluation

Return to the actual problem and test the algorithm's performance. If the output is unsatisfactory, it may be necessary to return to Step 4, improve the algorithm further, and retest. If the issue traces back to data collection and preparation, it may also be necessary to return to Step 1 and reconsider the data screening and preprocessing.
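
One simple way to run such a test (a sketch under our own assumptions, reusing the made-up X, y, and n from the gradient descent example above) is to hold out part of the data, fit the model on the rest, and measure the error on the held-out portion:

    % Holdout evaluation: train on 80% of the samples, test on the remaining 20%.
    idx     = randperm(n);
    nTrain  = round(0.8 * n);
    trainId = idx(1:nTrain);
    testId  = idx(nTrain+1:end);

    wHat    = X(trainId, :) \ y(trainId);    % least-squares fit on the training set
    yPred   = X(testId, :) * wHat;           % predictions on the held-out set
    testMSE = mean((y(testId) - yPred).^2);  % if too large, revisit Step 4 or Step 1
    fprintf('held-out MSE: %.4f\n', testMSE);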

Step 6: Use the Algorithm

Turn the machine learning algorithm into an application program and verify that it can perform its task properly in practice.

In reality, the development of machine learning is primarily the development of its algorithms; Section 1.3 of the book surveys the current mainstream algorithms in detail. Generally speaking, when establishing the optimization objective, the log-likelihood function is maximized for probabilistic problems; for regression problems, the squared error generally serves as the loss function, and minimizing it becomes the objective; for classification problems, the cross-entropy is often used as the loss function. Whether maximizing a log-likelihood or minimizing a loss, many optimization methods are available, and we introduce them in subsequent chapters.
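
For example, for a binary classifier that outputs a probability estimate \hat{p}_i = p(y_i = 1 \mid x_i) (notation ours, not the book's), the cross-entropy loss to be minimized is

    \min \; -\frac{1}{n} \sum_{i=1}^{n} \Bigl[ y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \Bigr], \qquad y_i \in \{0, 1\}.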

It is worth noting that model evaluation and the choice of objective function may or may not coincide. Model evaluation is aimed at the actual problem, while the objective function must balance the practical problem against the tractability of the resulting optimization. For classification problems, for example, cross-entropy is usually used as the objective function, while model evaluation may use the AUC (Area Under the Curve) metric. AUC could in principle be used as a training objective as well, but doing so greatly complicates the optimization, which is clearly not worth the cost.
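
To make the distinction concrete, the following MATLAB sketch (our illustration, with made-up scores and labels) computes AUC via its rank-statistic form, i.e., the fraction of positive-negative pairs that the classifier orders correctly; the Statistics and Machine Learning Toolbox function perfcurve computes the same quantity from the ROC curve.

    % AUC via the Wilcoxon/Mann-Whitney rank statistic (made-up demo data).
    scores = [0.9 0.8 0.7 0.6 0.55 0.4 0.3 0.2]';   % classifier output scores
    labels = [1   1   0   1   0    0   1   0  ]';   % true classes in {0,1}

    pos = scores(labels == 1);
    neg = scores(labels == 0);
    cmp = (pos > neg.') + 0.5 * (pos == neg.');     % pairwise wins; ties count half
    auc = mean(cmp(:));                             % 0.75 for this data
    fprintf('AUC = %.3f\n', auc);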

This article is excerpted, with omissions, from Machine Learning Algorithms (MATLAB Edition).

Content Summary

This book is an introductory textbook in the field of machine learning that presents its basic theories and methods in detail. The book consists of 15 chapters, covering an introduction to machine learning, mathematical foundations, linear models and logistic regression, support vector machines, artificial neural networks, decision tree algorithms, Bayesian algorithms, the k-nearest neighbor algorithm, data dimensionality reduction algorithms, clustering algorithms, Gaussian mixture models and the EM algorithm, ensemble learning algorithms, maximum entropy algorithms, probabilistic graphical models, and reinforcement learning algorithms. Each algorithm is presented both through the theoretical derivation of its principles and through its MATLAB implementation.

The book maintains rigorous theoretical analysis while emphasizing the practicality of machine learning algorithms, highlighting the implementation of the ideas and principles of machine learning algorithms on computers. The content is appropriately selected, systematic, and written in a clear and fluent manner, making it highly readable.

All major algorithms in the chapters are provided with MATLAB programs and corresponding computational examples. To better support teaching, the author has prepared electronic courseware (PPT slides in PDF format) and MATLAB programs for the major algorithms, which interested readers can obtain by scanning the QR codes at the end of each chapter and at the end of the book.

Target Audience

This book is recommended for 60 class hours (including 12 hours of hands-on experiments) and can serve as a textbook or reference book for undergraduate majors in computer science and technology, information and computing science, statistics, and mathematics and applied mathematics, as well as for graduate students in science and engineering taking machine learning courses.

Table of Contents

Chapter 1 Introduction to Machine Learning

1.1 Basic definitions of machine learning

1.2 Basic terminology of machine learning

1.3 Classification of machine learning algorithms

1.3.1 Supervised learning and unsupervised learning

1.3.2 Classification problems and regression problems

1.3.3 Generative models and discriminative models

1.3.4 Reinforcement learning

1.4 Evaluation metrics for machine learning models

1.4.1 Generalization ability of models

1.4.2 Evaluation methods for models

1.4.3 Precision and recall

1.4.4 ROC curve and AUC

1.5 Selection of machine learning models

1.5.1 Regularization techniques

1.5.2 Bias-variance decomposition

1.6 Basic process of machine learning applications

1.7 Uses and development history of machine learning

Chapter 2 Mathematical Foundations

2.1 Matrices and Differentiation

2.1.1 Basic operations of matrices

2.1.2 Derivatives of matrices with respect to scalars

2.1.3 Derivatives of functions of matrix variables

2.1.4 Derivatives of vector functions

2.1.5 Differentiation of matrices and vectors

2.1.6 Eigenvalue decomposition and singular value decomposition

2.2 Optimization Methods

2.2.1 Unconstrained optimization methods

2.2.2 Constrained optimization and KKT conditions

2.2.3 Quadratic programming

2.2.4 Semidefinite programming

2.3 Probability and Statistics

2.3.1 Random variables and probabilities

2.3.2 Conditional probability and independence

2.3.3 Expectation, Markov’s inequality, and moment generating functions

2.3.4 Variance and Chebyshev’s inequality

2.3.5 Sample means and sample variances

2.3.6 Maximum likelihood estimation

2.3.7 Entropy and KL divergence

Chapter 3 Linear Models and Logistic Regression

3.1 Basic forms of linear models

3.1.1 Theoretical basis of linear regression models

3.1.2 MATLAB implementation of linear regression models

3.2 Logistic regression models

3.2.1 Basic principles of logistic regression

3.2.2 MATLAB implementation of logistic regression

3.3 Linear discriminant analysis

3.3.1 Basic principles of linear discriminant analysis

3.3.2 MATLAB implementation of linear discriminant analysis

Chapter 4 Support Vector Machines

4.1 Algorithm principles of support vector machines

4.1.1 Linearly separable problems

4.1.2 Non-linearly separable problems

4.2 Kernel mapping (kernel function) support vector machines

4.3 Principles and derivation of the SMO algorithm

4.3.1 Solving the subproblem

4.3.2 Selection of optimization variables

4.4 Support vector regression models

4.5 MATLAB implementation of support vector machines

Chapter 5 Artificial Neural Networks

5.1 Introduction to feedforward neural networks

5.1.1 M-P neurons

5.1.2 Perceptron model

5.1.3 Multilayer feedforward networks

5.2 Backpropagation algorithm

5.2.1 An example of a single hidden layer network

5.2.2 Backpropagation (BP) algorithm

5.3 Mathematical properties and implementation details of neural networks

5.3.1 Mathematical properties of neural networks

5.3.2 Global minima and local minima

5.3.3 Challenges and implementation details

5.4 MATLAB implementation of neural networks

Chapter 6 Decision Tree Algorithms

6.1 Basic principles of decision tree algorithms

6.1.1 Decision-making process of tree models

6.1.2 Basic framework of decision tree algorithms

6.1.3 Pruning of decision trees

6.2 Improvements to basic decision tree algorithms

6.2.1 Information gain and ID3 decision trees

6.2.2 Gain ratio and C4.5 decision trees

6.2.3 Gini index and CART decision trees

6.3 Handling continuous values and missing attributes

6.3.1 Processing of continuous values

6.3.2 Issues with missing attributes

6.4 MATLAB implementation of decision tree algorithms

Chapter 7 Bayesian Algorithms

7.1 Principles of Bayesian algorithms

7.1.1 Bayesian decision-making

7.1.2 Naive Bayesian algorithm

7.1.3 Normal Bayesian algorithm

7.2 Improvements to Bayesian algorithms

7.2.1 Semi-naive Bayesian algorithm

7.2.2 TAN Bayesian algorithm

7.2.3 Bayesian networks and naive Bayesian trees

7.2.4 Weighted naive Bayesian algorithm

7.3 MATLAB implementation of Bayesian algorithms

Chapter 8 k-Nearest Neighbor Algorithm

8.1 Principles of k-nearest neighbor algorithms

8.1.1 Process of k-nearest neighbor algorithms

8.1.2 Distance functions for k-nearest neighbors

8.2 Overview of k-nearest neighbor improvement algorithms

8.3 MATLAB implementation of k-nearest neighbor algorithms

Chapter 9 Data Dimensionality Reduction Algorithms

9.1 Principal Component Analysis

9.1.1 Basic principles of principal component analysis

9.1.2 Kernel principal component analysis algorithm

9.1.3 MATLAB implementation of PCA algorithm

9.1.4 Fast PCA algorithm and its implementation

9.2 Manifold learning algorithms

9.2.1 Local Linear Embedding and its MATLAB implementation

9.2.2 Isomap and MDS algorithms and their implementation

Chapter 10 Clustering Algorithms

10.1 Basic theory of clustering

10.1.1 Problem definition

10.1.2 Distance calculations

10.1.3 Performance metrics

10.2 k-Means Algorithm

10.2.1 Basic principles of k-means algorithm

10.2.2 MATLAB implementation of k-means algorithm

10.3 k-Center Algorithm

10.3.1 Basic principles of k-center algorithm

10.3.2 MATLAB implementation of k-center algorithm

10.4 Density-Based Clustering Algorithms

10.4.1 Basic principles of DBSCAN algorithm

10.4.2 MATLAB implementation of DBSCAN algorithm

10.5 Hierarchical Clustering Algorithms

10.5.1 Basic principles of AGNES algorithm

10.5.2 MATLAB implementation of AGNES algorithm

Chapter 11 Gaussian Mixture Models and EM Algorithm

11.1 Gaussian mixture models

11.2 Theoretical derivation of EM algorithm

11.3 Applications of EM algorithm

11.4 MATLAB implementation of GMM

11.4.1 Generation of Gaussian mixture models

11.4.2 Parameter fitting of GM models

11.4.3 Gaussian mixture clustering examples

11.5 Improved methods for Gaussian mixture clustering

11.5.1 Issues with setting initial values for fitting

11.5.2 Issues with selecting the number of clusters k

11.5.3 Regularization of Gaussian mixture clustering

Chapter 12 Ensemble Learning Algorithms

12.1 Overview of ensemble learning

12.1.1 Basic concepts of ensemble learning

12.1.2 Parallel generation of ensemble models

12.1.3 Serial generation of ensemble models

12.1.4 Combination strategies for ensemble models

12.2 Bagging and Random Forests

12.2.1 Bagging algorithm

12.2.2 Random Forest algorithm

12.2.3 MATLAB implementation of Random Forests

12.3 Boosting Algorithms

12.3.1 Basic principles of AdaBoost algorithm

12.3.2 MATLAB implementation of AdaBoost

Chapter 13 Maximum Entropy Algorithm

13.1 Origins and related definitions of entropy

13.2 Definition of maximum entropy model

13.2.1 Maximum entropy principle

13.2.2 Maximum entropy model

13.3 Learning algorithms for maximum entropy models

13.3.1 Basic principles of maximum entropy algorithms

13.3.2 Maximum likelihood estimation of maximum entropy models

13.4 Optimization methods for learning model parameters

13.4.1 Gradient descent and quasi-Newton methods

13.4.2 Improved iterative scaling methods

13.5 MATLAB implementation of maximum entropy models

Chapter 14 Probabilistic Graphical Models

14.1 Hidden Markov Models

14.1.1 Definition of Hidden Markov Models

14.1.2 Calculation methods for observation sequence probabilities

14.1.3 Baum-Welch algorithm for calculating model parameters

14.1.4 Viterbi algorithm for predicting hidden state sequences

14.2 Conditional Random Field Models

14.2.1 Definition of Conditional Random Fields

14.2.2 Linear chain conditional random fields

14.2.3 Probability calculations for linear chain conditional random fields

14.2.4 Learning algorithms for linear chain conditional random fields

14.2.5 Decoding algorithms for linear chain conditional random fields

Chapter 15 Reinforcement Learning Algorithms

15.1 Model foundations of reinforcement learning

15.1.1 Basic features of reinforcement learning

15.1.2 Modeling of reinforcement learning

15.2 Theoretical models of reinforcement learning

15.2.1 Exploration and exploitation

15.2.2 ε-greedy algorithm

15.2.3 Softmax algorithm

15.3 Markov Decision Processes

15.4 Dynamic programming algorithms for solving optimal policies

15.4.1 Policy iteration algorithms

15.4.2 Value iteration algorithms

15.5 Monte Carlo methods for solving optimal policies

15.5.1 Definition of model-free reinforcement learning

15.5.2 Monte Carlo methods for solving prediction problems

15.5.3 Monte Carlo methods for solving control problems

15.6 Temporal Difference methods for solving optimal policies

15.6.1 Basic principles of temporal difference methods

15.6.2 SARSA algorithm and MATLAB implementation

15.6.3 Q-learning algorithm and MATLAB implementation

References

Source: Science Press Mathematics Education