
- I finally got accepted into an artificial intelligence research program! But I don't know what the differences are between machine learning and deep learning; it feels like everything is deep learning.
- Wow, I heard that a senior has been tuning parameters for 10 months to prepare a T9 model with 200 billion parameters. I want to tune parameters for T10 and aim for Best Paper.
Today, research papers on traditional machine learning indeed make up only a small proportion of the field. Some people even complain that deep learning is just a systems engineering exercise with no mathematical depth.
However, it is undeniable that deep learning is incredibly useful! It greatly simplifies the overall algorithm analysis and learning pipeline of traditional machine learning, and, more importantly, it has pushed accuracy and precision on some common domain tasks to levels that traditional machine learning algorithms could never reach.
Deep learning has become particularly popular in recent years, much like big data did five years ago. However, deep learning primarily falls within the field of machine learning. In this article, we will discuss the differences in algorithm processes between machine learning and deep learning.

In fact, machine learning research is all about data science (which sounds a bit dull). Here are the main processes of machine learning algorithms:
(1) Dataset preparation
(2) Exploratory analysis of the data
(3) Data preprocessing
(4) Data splitting
(5) Building machine learning algorithm models
(6) Selecting machine learning tasks
(7) Finally, evaluating how well the machine learning algorithms perform on actual data

1.1 Dataset
First, we need to look at the data itself. The dataset is the starting point for building a machine learning model. In simple terms, a dataset is essentially an N×M matrix, where N is the number of rows (samples) and M is the number of columns (features).
Columns can be broken down into X and Y: X refers to the features, independent variables, or input variables, while Y refers to the class labels, dependent variables, or output variables.
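As a minimal sketch, assuming a tabular dataset stored in a CSV file with a label column named "label" (both the file name and the column name are hypothetical), the X/Y split might look like this in Python with pandas:

```python
import pandas as pd

# Hypothetical file name; any tabular dataset with a label column works the same way
df = pd.read_csv("dataset.csv")

# X: the feature columns (independent variables / inputs)
# y: the class label column (dependent variable / output), assumed here to be named "label"
X = df.drop(columns=["label"])
y = df["label"]

print(X.shape)  # (number of samples N, number of features M)
print(y.shape)  # (number of samples N,)
```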

1.2 Exploratory Data Analysis
Exploratory data analysis (EDA) is about getting a first feel for the data before modeling. Common visualizations include heatmaps (to identify internal correlations between features), box plots (to visualize group differences), scatter plots (to visualize correlations between features), and principal component analysis (to visualize the clustering structure of the dataset). Typical data manipulations at this stage include pivoting, grouping, and filtering the data.
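A rough sketch of these exploratory plots, assuming the df/X/y from the loading example above, all-numeric features, and placeholder column names ("feature_1", "feature_2", "label"), using matplotlib and scikit-learn:

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# df, X, y are assumed to come from the data-loading sketch above

# Heatmap of pairwise feature correlations
plt.imshow(X.corr(), cmap="coolwarm")
plt.colorbar()
plt.title("Feature correlation heatmap")
plt.show()

# Box plot: distribution of one feature, grouped by class label
df.boxplot(column="feature_1", by="label")
plt.show()

# Scatter plot of two features, colored by (numeric) class label
plt.scatter(df["feature_1"], df["feature_2"], c=y)
plt.show()

# PCA projection to two dimensions to visualize clustering structure
pcs = PCA(n_components=2).fit_transform(X)
plt.scatter(pcs[:, 0], pcs[:, 1], c=y)
plt.title("PCA projection")
plt.show()
```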
1.3 Data Preprocessing
Data preprocessing is essentially about cleaning, organizing, and transforming the data. It refers to the various checks and corrections applied to the data to address issues such as missing values and spelling errors, to normalize/standardize values so that they are comparable, to transform data (e.g., logarithmic transformation), and so on.
For example, resizing images to a uniform size or resolution.
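A hedged illustration of a few typical cleaning steps with pandas, NumPy, and scikit-learn; the column name "income" and the chosen strategies are only examples, not a fixed recipe:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fill missing numeric values with each column's median (one of several possible strategies)
X = X.fillna(X.median(numeric_only=True))

# Log-transform a heavily skewed feature ("income" is a placeholder column name)
X["income_log"] = np.log1p(X["income"])

# Standardize all features to zero mean and unit variance so they become comparable
X_scaled = StandardScaler().fit_transform(X)
```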
The quality of the data significantly impacts the quality of the machine learning algorithm model. Therefore, to achieve the best quality of machine learning models, a large portion of the work in traditional machine learning algorithm processes is actually focused on analyzing and processing the data.
Generally speaking, data preprocessing can easily take up 80% of the time in a machine learning project process, while the actual model building phase and subsequent model analysis only account for the remaining 20%.
1.4 Data Splitting
In the development process of machine learning models, we hope that the trained model performs well on new, unseen data. To simulate new, unseen data, we split the available data into two parts: the training set and the test set.
The first part is a larger subset of data used as the training set (e.g., 80% of the original data); the second part is usually a smaller subset used as the test set (the remaining 20% of the data).
Next, we use the training set to build a predictive model and then apply this trained model to the test set (i.e., as new, unseen data) to make predictions. The model's performance on the test set is used to compare candidate models, and hyperparameter optimization can also be performed at this stage to obtain the best model.
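For example, an 80/20 split with scikit-learn might look like the sketch below; the ratio and the random forest model are illustrative choices, not requirements:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# 80% training data, 20% test data; random_state is fixed only for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier()
model.fit(X_train, y_train)          # build the model on the training set only
print(model.score(X_test, y_test))   # evaluate on the unseen test set
```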

Another common method of data splitting is to divide the data into three parts:
(1) Training Set
(2) Validation Set
(3) Test Set
The training set is used to build the predictive model, while the validation set is used to evaluate it: predictions are made on the validation set, the model is tuned (e.g., hyperparameter optimization), and the best-performing model is selected based on the validation results.
The validation set is handled much like the training data in that it is used during model development: it is a sample set held out from training that serves to adjust the model's hyperparameters and to make a preliminary assessment of the model's capability, and validation is typically performed alongside training. The test set, by contrast, does not participate in building or tuning the model at all; it is kept aside for the final evaluation.
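One simple way to obtain such a three-way split with scikit-learn is to apply train_test_split twice; the 60/20/20 proportions below are only an example:

```python
from sklearn.model_selection import train_test_split

# First hold out the test set (20% of the data), untouched until the final evaluation
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Then split the remainder into training (60% of all data) and validation (20% of all data)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)
```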

In fact, data is the most valuable asset in the machine learning process. To make more economical use of the existing data, N-fold cross-validation is commonly used: the dataset is divided into N parts, and in each round one part is held out as test data while the remaining parts are used as training data to build the model. Repeating this over all N folds validates the machine learning process on every part of the data.
This cross-validation method is widely used in the machine learning process but is less common in deep learning.
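A minimal sketch of N-fold cross-validation with scikit-learn, assuming N = 5 and using a random forest as a stand-in classifier:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation: each fold takes a turn as the held-out test data
scores = cross_val_score(RandomForestClassifier(), X, y, cv=5)
print(scores.mean(), scores.std())
```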
Machine learning algorithms can generally be divided into one of the following three types:
(1) Supervised Learning
This is a machine learning task that establishes a mathematical (mapping) relationship between input X and output Y variables. Such (X, Y) pairs constitute the labeled data used to build the model to learn how to predict output from input.
(2) Unsupervised Learning
This is a machine learning task that only utilizes input X variables. The X variables are unlabeled data, and the learning algorithm uses the inherent structure of the data during modeling.
(3) Reinforcement Learning
This is a machine learning task that determines the next course of action through trial and error learning, striving to maximize reward returns.
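As a rough illustration of the first two types (reinforcement learning is omitted for brevity), supervised learning fits on (X, y) pairs while unsupervised learning sees only X; the specific models and the number of clusters below are arbitrary choices:

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Supervised learning: fit a mapping from inputs X_train to labels y_train
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Unsupervised learning: only the inputs are used; structure is found without labels
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X_train)
```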
Parameter Tuning
This is the main job of the legendary "parameter tuning expert". Hyperparameters are essentially parameters of the machine learning algorithm itself that directly affect the learning process and predictive performance. Since there is no universal hyperparameter setting that works well on all datasets, hyperparameter optimization is necessary.
For example, in random forests, two hyperparameters are commonly optimized: mtry and ntree. mtry (max_features in scikit-learn) is the number of features randomly sampled as split candidates at each node, while ntree (n_estimators) is the number of trees to grow.
Another machine learning algorithm that was still very mainstream ten years ago is the support vector machine (SVM). For an SVM with a radial basis function (RBF) kernel, the hyperparameters that need to be optimized are the C parameter and the gamma parameter. C is the penalty for misclassification and controls the trade-off between fitting the training data and keeping the decision boundary simple (and thus helps limit overfitting), while gamma controls the width of the RBF kernel.
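A hedged grid-search sketch for both examples with scikit-learn; the parameter grids are purely illustrative, and in scikit-learn mtry and ntree correspond to max_features and n_estimators:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Random forest: search over ntree (n_estimators) and mtry (max_features)
rf_search = GridSearchCV(
    RandomForestClassifier(),
    {"n_estimators": [100, 300, 500], "max_features": ["sqrt", 0.5]},
    cv=5,
)
rf_search.fit(X_train, y_train)
print(rf_search.best_params_)

# RBF-kernel SVM: search over C (misclassification penalty) and gamma (kernel width)
svm_search = GridSearchCV(
    SVC(kernel="rbf"),
    {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=5,
)
svm_search.fit(X_train, y_train)
print(svm_search.best_params_)
```

Because GridSearchCV uses cross-validation internally on the training data, the test set can still remain untouched for the final evaluation.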
Tuning usually aims at finding the set of hyperparameter values that works best for the problem at hand. It is rarely about chasing a single "optimal" value; the "parameter tuning expert" is mostly a running joke. What is truly needed is to understand the algorithm's principles and to find parameters that suit the data and the model.
Feature Selection
Feature selection is the process of choosing a subset of features from a large number of initial features. Besides achieving high-accuracy models, one of the most important aspects of building machine learning models is obtaining actionable insights. To achieve this goal, it is crucial to select important subsets of features from a large pool.
The task of feature selection can itself constitute a new research area, where substantial efforts are made to design novel algorithms and methods. Among the many available feature selection algorithms, some classic methods are based on simulated annealing and genetic algorithms. In addition, there are many methods based on evolutionary algorithms (such as particle swarm optimization, ant colony optimization, etc.) and random methods (such as Monte Carlo).
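The methods mentioned above go well beyond a short snippet, but as a much simpler stand-in, recursive feature elimination (RFE) in scikit-learn gives a feel for what selecting a feature subset looks like in practice; the number of features to keep is an arbitrary choice here:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Recursively drop the least important features until 10 remain (10 is arbitrary)
selector = RFE(RandomForestClassifier(), n_features_to_select=10)
selector.fit(X_train, y_train)

print(selector.support_)   # boolean mask of the selected features
print(selector.ranking_)   # ranking of all features (1 = selected)
```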
Deep learning is essentially a paradigm within machine learning, so the two share largely the same overall process. However, deep learning streamlines the data analysis stage and shortens the modeling process, unifying the previously diverse machine learning algorithms under neural networks.
Before deep learning was widely adopted, the machine learning workflow required a great deal of time for collecting data, filtering data, and trying out various feature-extraction algorithms or combinations of features for classification and regression.
Here are the main processes of deep learning algorithms:

