Understanding Machine Learning Through Visuals

Source: DeepHub IMBA




This article is about 2300 words long and is recommended for an 8-minute read.
This article introduces the types of machine learning.



Machine Learning

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed.
Machine learning can be divided into three main types according to the kind of task:
  • Supervised Learning

  • Unsupervised Learning

  • Reinforcement Learning

Supervised Learning

Supervised learning is a type of machine learning in which a function that maps inputs to outputs is learned from example input-output pairs. (It requires labeled data: input -> output.)
In this type, machine learning algorithms are trained on labeled data. Although this method depends on accurately labeled data, supervised learning can be very effective when used appropriately.
At the start, the system receives both the input data and the corresponding output data. Its task is to learn rules that map the inputs to the outputs. Training should continue until performance is sufficiently high.
After training, the system should be able to assign an output to an input that it did not see during the training phase. Prediction at this stage is usually fast, and its accuracy depends on the quality of the training data and the model.
Types of supervised learning:
  • Regression: The output is a continuous value

  • Classification: The output is a discrete value

Regression
Regression is a supervised machine learning technique used to predict continuous values. For example, we can use it to predict the price of a product, such as the housing price in a city or the value of stocks.
Regression in machine learning consists of mathematical methods that allow data scientists to predict a continuous outcome (y) based on the values of one or more predictor variables (x). Linear regression is perhaps the most popular form of regression analysis because it is easy to use in predictions.
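As a rough illustration of predicting a continuous outcome (y) from one predictor (x), here is a minimal sketch of simple linear regression fitted by ordinary least squares. The function names and the housing data are made up for this example, and the data is deliberately perfectly linear so the result is easy to check.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single predictor.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a continuous value for a new input."""
    return slope * x + intercept


# Hypothetical data: floor area (square meters) -> price (thousands).
areas = [50.0, 60.0, 80.0, 100.0, 120.0]
prices = [150.0, 180.0, 240.0, 300.0, 360.0]  # exactly 3 * area, by construction

slope, intercept = fit_simple_linear_regression(areas, prices)
estimate = predict(90.0, slope, intercept)
print(slope, intercept, estimate)
```

Real data is noisy, so the fitted line would only approximate the points rather than pass through them.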
Classification
Classification is a technique for assigning inputs to categories. It predicts a discrete response value and divides data into “classes.” Examples include identifying the type of car in a photo, filtering spam emails, detecting emotions, and recognizing faces.
The three main types of classification are:
  • Binary Classification

It is the process or task of classification where the given data is divided into two classes. It is essentially a prediction about which of the two groups an item belongs to.
Suppose you receive two emails, one from an insurance company that constantly sends advertisements, and another from your bank regarding your credit card bill.
The email service provider will classify the two emails, sending the first to the spam folder and keeping the second in the main inbox. This process is called binary classification because there are two discrete classes: one is spam, and the other is not spam. So this is a binary classification problem.
Algorithms:
  • Logistic Regression

  • KNN

  • Decision Trees/Random Forests/Boosting Trees

  • Support Vector Machines (SVM)

  • Naive Bayes

  • Multi-Layer Perceptron
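To make the spam example concrete, the following is a minimal sketch of logistic regression trained with plain gradient descent. The two features (promotional word count and link count), the tiny dataset, and the hyperparameters are all assumptions chosen for illustration; a real spam filter would use far richer features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log loss w.r.t. the raw score
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_spam(x, weights, bias, threshold=0.5):
    """Return True for the 'spam' class, False for 'not spam'."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= threshold


# Hypothetical features per email: [promotional word count, link count].
emails = [[5, 3], [6, 4], [4, 5], [7, 2], [0, 1], [1, 0], [0, 0], [1, 1]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

weights, bias = train_logistic_regression(emails, labels)
print(predict_spam([8, 5], weights, bias), predict_spam([0, 0], weights, bias))
```

The model outputs a probability, and the threshold turns it into one of the two discrete classes.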

  • Multi-Class Classification

Multi-class classification refers to classification tasks with more than two class labels, where each input belongs to exactly one class.
Algorithms:
  • KNN

  • Decision Trees/Random Forests/Boosting Trees

  • Naive Bayes

  • Multi-Layer Perceptron

Note: SVM and logistic regression are omitted here because in their basic form they are binary classifiers, but they can be extended to multi-class problems (logistic regression also has a multinomial form that handles several classes directly). A common approach, known as one-vs-rest, builds one binary model per class. For example, in recognizing the digits 0-9, an SVM would train 10 binary models: one determining whether the input is a 0, one whether it is a 1, and so on.
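The idea of training one binary model per class can be sketched as follows. This is only an illustration: a simple perceptron stands in for the binary learner (an SVM or logistic regression would take its place in practice), and the three-class dataset and function names are invented for the example.

```python
def score(w, x):
    """Linear score; the last entry of w is the bias."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]


def train_perceptron(points, targets, max_epochs=1000):
    """Binary linear classifier; targets are +1 (this class) or -1 (the rest)."""
    dim = len(points[0])
    w = [0.0] * (dim + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(points, targets):
            if t * score(w, x) <= 0:  # misclassified: nudge the boundary
                for j in range(dim):
                    w[j] += t * x[j]
                w[dim] += t
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w


def train_one_vs_rest(points, labels):
    """Train one binary model per class: 'is it this class or not?'."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if y == cls else -1 for y in labels]
        models[cls] = train_perceptron(points, targets)
    return models


def predict_one_vs_rest(models, x):
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda cls: score(models[cls], x))


# Hypothetical 2-D data with three well-separated classes.
points = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 1), (5, 1), (0, 5), (1, 6), (1, 5)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

models = train_one_vs_rest(points, labels)
print([predict_one_vs_rest(models, p) for p in points])
```

With 10 digit classes the same loop would simply produce 10 binary models instead of 3.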
  • Multi-Label Classification

Multi-label classification refers to classification tasks with two or more class labels, where each example can be assigned one or more class labels.
Multi-class can be called single-label multi-class, which is a one-to-one relationship, while multi-label classification is a one-to-many relationship.
In simpler terms, if a photo contains both a cat and a dog, using multi-class would classify the photo into one class, either cat or dog (one-to-one), but for multi-label, it would output both cat and dog (one-to-many).
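The cat-and-dog contrast can be sketched in a few lines. The raw scores below are invented stand-ins for what a trained model might output for one photo; the assumption is that multi-class prediction takes the single most probable class from a softmax, while multi-label prediction thresholds an independent sigmoid per label.

```python
import math

LABELS = ["cat", "dog", "bird"]


def softmax(scores):
    """Turn scores into probabilities that sum to 1 across classes."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_multiclass(scores):
    """One-to-one: exactly one label per input."""
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))]


def predict_multilabel(scores, threshold=0.5):
    """One-to-many: every label whose own probability clears the threshold."""
    return [label for label, s in zip(LABELS, scores) if sigmoid(s) >= threshold]


# Hypothetical raw scores for a photo containing both a cat and a dog.
scores = [2.0, 1.5, -3.0]
print(predict_multiclass(scores))   # a single label
print(predict_multilabel(scores))   # possibly several labels
```

The same scores give one answer in the multi-class setting and two in the multi-label setting, because the labels no longer compete for a single probability budget.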

Unsupervised Learning

Unsupervised learning refers to the use of artificial intelligence (AI) algorithms to identify patterns in datasets containing unlabeled data points. During training, the algorithm classifies, labels, and/or groups the data points contained in the dataset without any external guidance. In other words, unsupervised learning allows the system to identify patterns in the dataset on its own. In unsupervised learning, even without any expected output provided, the model groups information based on similarities and differences.
Because they need no labels, unsupervised learning algorithms can be applied to problems where supervised learning cannot, although their results are usually harder to evaluate.
Types of unsupervised learning:
A. Clustering
Clustering refers to the process of automatically grouping data points with similar characteristics together and assigning them to “clusters.”
Common algorithms:
  • K-Means

  • DBSCAN

  • Gaussian Mixture Model (GMM)
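As an illustration of grouping similar points without labels, here is a minimal sketch of K-Means (Lloyd's algorithm). The six 2-D points and the choice of initial centroids are assumptions made for the example; in practice the initial centroids are usually chosen at random or with a scheme such as k-means++.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iters=100):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:  # nothing moved: converged
            break
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Hypothetical data: two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
# Both starting centroids sit in the first group on purpose,
# so the algorithm needs more than one iteration to separate the groups.
centroids, assignments = kmeans(points, initial_centroids=[points[0], points[1]])
print(assignments, centroids)
```

Note that no labels are supplied anywhere; the cluster indices come purely from distances between points.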

B. Association
Association rule learning is an unsupervised learning technique that examines how one data item depends on another within large datasets, attempting to find interesting relationships or associations between variables.
Common algorithms:
  • Apriori Algorithm

  • PCY Algorithm

  • FP-Tree Algorithm

  • XFP-Tree Algorithm

  • GPApriori Algorithm

Applications:
Market analysis: A popular example and application of association rule mining. Large retailers often use this technique to determine associations between products. (Beer and diapers)
Medical diagnosis: Association rules help identify the probability of specific diseases.
Protein sequences: Association rules help determine the synthesis of artificial proteins.
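Association rules are usually judged with measures such as support, confidence, and lift, which algorithms like Apriori compute over many candidate itemsets. The sketch below computes these measures for a single rule, "beer -> diapers", over a handful of invented shopping baskets; it is not an implementation of Apriori itself.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to how common the consequent is overall (> 1 suggests association)."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


# Hypothetical shopping baskets.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

rule_support = support({"beer", "diapers"}, transactions)
rule_confidence = confidence({"beer"}, {"diapers"}, transactions)
rule_lift = lift({"beer"}, {"diapers"}, transactions)
print(rule_support, rule_confidence, rule_lift)
```

Here two of the five baskets contain both items, and two of the three baskets with beer also contain diapers, which is what the support and confidence values express.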

Reinforcement Learning

Reinforcement learning (RL) is a type of machine learning technique that enables agents to learn through feedback from their own actions and experiences in an interactive environment.
Both supervised learning and reinforcement learning learn a mapping between inputs and outputs. In supervised learning, however, the feedback given to the model is the correct set of actions for performing the task, whereas reinforcement learning uses rewards and penalties as signals for positive and negative behavior.
Compared to unsupervised learning, reinforcement learning has different objectives. While the goal of unsupervised learning is to find similarities and differences between data points, in the case of reinforcement learning, the goal is to find a suitable action model that maximizes the agent’s total cumulative reward.
Some key terms that describe the basic elements of RL problems are:
  1. Environment – The world, physical or simulated, in which the agent operates
  2. Agent – Also known as the intelligent entity, it is the algorithm we write
  3. Action – A move the agent makes in the environment
  4. State – The current situation of the agent in the environment
  5. Reward – Feedback from the environment, whether good or bad
  6. Policy – The mapping from the agent’s state to actions, deciding what action to take based on the state
  7. Value – The expected long-term reward the agent will receive for taking an action in a specific state
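The terms above can be tied together with a small sketch of tabular Q-learning, one common RL algorithm (chosen here only as an example). The environment is an invented five-cell corridor with a reward at the right end; the agent explores with random actions, the table `q` stores the value of each state-action pair, and the policy is read off at the end by taking the best-valued action in each state. All names and hyperparameters are assumptions for illustration.

```python
import random

N_STATES = 5        # states 0..4 in a row; state 4 is the goal
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Environment: apply an action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from interaction, using random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore; Q-learning is off-policy
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
# Policy: in each non-goal state, take the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told that moving right is correct; it only receives a reward on reaching the goal, and the value estimates propagate that signal back to earlier states.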

DeepHub Translator’s Note

Machine learning can be categorized in different ways. This article categorizes it by task type. Under this view, for instance, image segmentation is essentially a pixel-level classification task, while object detection is largely a bounding-box regression task.
Regarding implementation methods, we can also categorize them based on different models, such as:
  • Traditional Machine Learning: Various regressions

  • Kernel Methods: SVM, etc.

  • Bayesian Models: Probabilistic correlations

  • Tree Models: Decision Trees, Random Forests, various boosting methods

  • Neural Networks: Multi-Layer Perceptron, various neural networks

The above classifications do not conflict; they cut across one another. The simplest example is using neural networks for classification and regression: the last layer is generally a linear layer (also called a dense or fully connected layer), which computes the same kind of weighted sum as linear regression. Neural networks can also be used for clustering, as in deep clustering.
Editor: Wang Jing
Proofreader: Gong Li