Introduction to Machine Learning

1. Overview of Machine Learning

Machine learning is the process of acquiring “knowledge” from existing training data and then applying that “knowledge” to new data. Learning from training data can be divided into four steps: (1) extracting features from the training data; (2) selecting a learning model, such as logistic regression, a support vector machine, or a decision tree; (3) determining the cost function, whose minimum corresponds to the best model (different cost functions may yield different optimal models for the same training data); and (4) determining evaluation criteria for selecting the optimal parameters of that model (parameter selection). In essence, machine learning is applied statistical learning: a computer system improves its performance by using data and statistical methods.
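The remark in step (3), that different cost functions can give different optimal models for the same data, can be illustrated with the simplest possible model: predicting one constant for every sample. The sketch below is only an illustration; the data and the function names (`squared_cost`, `absolute_cost`) are made up for this example. Under the squared-error cost the best constant is the mean, while under the absolute-error cost it is the median.

```python
from statistics import mean, median

def squared_cost(c, ys):
    """Sum of squared errors when every sample is predicted as the constant c."""
    return sum((y - c) ** 2 for y in ys)

def absolute_cost(c, ys):
    """Sum of absolute errors when every sample is predicted as the constant c."""
    return sum(abs(y - c) for y in ys)

# Hypothetical training targets with one outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

best_under_squared = mean(ys)     # minimizes the squared-error cost
best_under_absolute = median(ys)  # minimizes the absolute-error cost

print("squared-error optimum:", best_under_squared)
print("absolute-error optimum:", best_under_absolute)
```

The outlier pulls the squared-error optimum far from most of the data, while the absolute-error optimum stays near the bulk of the samples, so the choice of cost function is itself a modeling decision.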

Most machine learning methods can be divided into supervised learning and unsupervised learning. The difference lies in whether the results (labels) of the training samples are known in advance. In supervised learning they are known: if the label is nominal, the task is classification; if it is numeric, the task is regression. In unsupervised learning they are not known.

2. Learning Algorithms

Model optimization means finding the minimum of the cost function; the learning algorithm is the procedure used to reach that minimum.

Recently, I have been following Andrew Ng’s online machine learning course, where the most commonly used learning algorithm is gradient descent (for example, he uses it to explain cost function minimization for linear regression, logistic regression, and support vector machines). A basic result from calculus says that the gradient of a function at a point is the direction in which the function increases fastest. The gradient descent algorithm therefore moves in the opposite direction of the gradient, taking one step per iteration whose size is set by the learning rate. The iteration ends when the set number of iterations is reached or when the change in the cost function value between two consecutive iterations falls below a certain threshold, as shown in the following figure. Gradient descent is the simplest cost function optimization algorithm among learning algorithms; subsequent articles on the public account will focus on the theoretical derivation of various learning algorithms and provide Python code.
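As a minimal sketch of the procedure described above, the following code fits a line y ≈ w·x + b by batch gradient descent on the mean squared error. It is not taken from the course; the function name `gradient_descent`, the toy data, and the default values for the learning rate, the iteration limit, and the threshold are all assumptions chosen for illustration. Both stopping conditions from the text appear: a maximum number of iterations and a threshold on the change in cost.

```python
def gradient_descent(xs, ys, learning_rate=0.05, max_iters=10000, tol=1e-12):
    """Fit y ≈ w*x + b by batch gradient descent on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0

    def cost(w, b):
        # Mean squared error with the conventional factor 1/2.
        return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

    previous_cost = cost(w, b)
    for _ in range(max_iters):
        # Partial derivatives of the cost with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n

        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

        # Stop early once the cost barely changes between iterations.
        current_cost = cost(w, b)
        if abs(previous_cost - current_cost) < tol:
            break
        previous_cost = current_cost
    return w, b

# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

If the learning rate is too large the iterates diverge instead of converging, and if it is too small the iteration limit is reached before the threshold, so in practice the step size usually has to be tuned for the data at hand.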

3. Machine Learning Tasks

The steps for building a machine learning model are broadly the same across tasks: extract features from the training samples, train the model on them, and then produce results for new input test data.

Classification

As shown in the figure below, each handwritten digit is a 28×28 pixel image, which can be represented by a 784-dimensional vector x of grayscale values. The input is the feature vector x, and the output is the corresponding digit from 0 to 9. A classification model is constructed by training on a large dataset of handwritten digits. When the vector x of a new, unknown handwritten digit is input, the model outputs a classification result from 0 to 9.
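The sketch below illustrates the two ideas in this paragraph: flattening a 28×28 image into a 784-dimensional vector, and assigning a label to a new vector. It uses a 1-nearest-neighbor rule purely because it is short; the article does not say which classifier is used. The "digits" are synthetic stand-ins (a vertical stroke for 1, a square outline for 0), and all names here are hypothetical.

```python
SIZE = 28  # images are SIZE x SIZE grayscale grids

def flatten(image):
    """Turn a 2-D grid of grayscale values into a single feature vector."""
    return [pixel for row in image for pixel in row]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict_1nn(train_vectors, train_labels, x):
    """Return the label of the training vector closest to x."""
    nearest = min(range(len(train_vectors)),
                  key=lambda i: squared_distance(train_vectors[i], x))
    return train_labels[nearest]

def vertical_bar(col):
    """Synthetic stand-in for the digit 1: a bright stroke in one column."""
    return [[255 if c == col else 0 for c in range(SIZE)] for _ in range(SIZE)]

def box(margin):
    """Synthetic stand-in for the digit 0: a bright square outline."""
    lo, hi = margin, SIZE - 1 - margin
    return [[255 if (lo <= r <= hi and lo <= c <= hi
                     and (r in (lo, hi) or c in (lo, hi))) else 0
             for c in range(SIZE)] for r in range(SIZE)]

# Tiny hypothetical training set.
train_images = [vertical_bar(13), vertical_bar(14), box(5), box(6)]
train_labels = [1, 1, 0, 0]
train_vectors = [flatten(image) for image in train_images]

# New inputs: slightly damaged versions of a stroke and an outline.
new_one = vertical_bar(13)
new_one[0][13] = 0
new_zero = box(6)
new_zero[6][6] = 0

x = flatten(new_one)
print(len(x), predict_1nn(train_vectors, train_labels, x))
print(predict_1nn(train_vectors, train_labels, flatten(new_zero)))
```

A real digit classifier would be trained on many thousands of labeled images and would typically use a stronger model, but the input and output have the same shape as here: a 784-dimensional vector in, one label out.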

Recommendation Systems

As shown in the figure below, the rows of the dataset represent users and the columns represent items, with 0 indicating that the user has not rated the item. The idea of a recommendation system is: (1) calculate the similarity between the item to be rated and the items the user has already rated; (2) estimate the score of the item to be rated from those similarities. Since the user rating matrix is sparse, singular value decomposition (SVD) can be used to map the data to a low-dimensional space, and the same two steps are then applied in that space to score the unrated items.
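The two steps above can be sketched on a small rating matrix. This is an illustration under several assumptions: the ratings are invented, cosine similarity is one of several possible similarity measures, and the SVD dimensionality reduction step is left out to keep the example short. The function names are hypothetical.

```python
from math import sqrt

# Rows are users, columns are items; 0 means "not rated" (invented data).
ratings = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [5, 0, 0, 1],
]

def item_similarity(ratings, item_a, item_b):
    """Cosine similarity of two items over the users who rated both."""
    pairs = [(row[item_a], row[item_b]) for row in ratings
             if row[item_a] > 0 and row[item_b] > 0]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_a = sqrt(sum(a * a for a, _ in pairs))
    norm_b = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_a * norm_b)

def estimate_rating(ratings, user, item):
    """Similarity-weighted average of the ratings the user has already given."""
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other_item, rating in enumerate(ratings[user]):
        if other_item == item or rating == 0:
            continue
        similarity = item_similarity(ratings, item, other_item)
        weighted_sum += similarity * rating
        similarity_sum += similarity
    return weighted_sum / similarity_sum if similarity_sum > 0 else 0.0

# The last user has not rated items 1 and 2; estimate both scores.
user = 4
estimate_item_1 = estimate_rating(ratings, user, 1)
estimate_item_2 = estimate_rating(ratings, user, 2)
print(estimate_item_1, estimate_item_2)
```

In this toy data, item 1 is rated much like item 0, which the last user scored highly, so item 1 receives the higher estimate and would be recommended first.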

Machine learning is applied very broadly, to tasks including regression, clustering, word labeling, and object detection. Applications can also be grouped into three major categories: image processing, text processing, and speech processing. Image tasks include image colorization, face detection, finding background images, recognizing objects in pictures, and describing an image. Text tasks include machine translation, text classification, sentiment analysis, text summarization, and reading comprehension. Speech tasks include speech recognition and speech generation.

4. Applications of Artificial Intelligence (AI) in the Medical Industry

Recently, I attended an intelligent medicine seminar held in Shenzhen Talent Park. Here I share two viewpoints on AI in the medical industry that I strongly agree with: (1) medical sample data is hard to obtain, so AI models in the medical field are often built on small samples, which challenges their generalization ability; (2) the application of AI in the medical industry is subject to many restrictions. For instance, an intelligent cancer screening system requires the test subjects to be assessed before diagnosis; those who do not meet the diagnostic criteria are not tested, which greatly limits the adoption of AI medical products in hospitals. For this reason, a professor from the imaging center of a hospital in Guangzhou stated that AI will not completely replace manual screening in disease screening systems in the future, but it can play an important role in time-consuming, repetitive tasks, freeing doctors to spend the saved time on academic research.
