
Source: Machine Learning Academy
This article is about 10,500 words long; the recommended reading time is 20+ minutes.
In this article, we will briefly discuss the latest advances in deep learning in recent years.
“A survey is always one of the quickest ways to get started in a new field!”

1 Introduction
2 Related Research
3 Recent Advances
3.1 Evolution of Deep Architectures
4 Deep Learning Methods
4.1 Deep Supervised Learning
4.2 Deep Unsupervised Learning
4.3 Deep Reinforcement Learning
5 Deep Neural Networks
5.1 Deep Autoencoders
5.1.1 Variational Autoencoders
5.1.2 Multi-layer Denoising Autoencoders
5.1.3 Transformable Autoencoders
5.2 Deep Convolutional Neural Networks
5.2.1 Deep Max-Pooling Convolutional Neural Networks
5.2.2 Very Deep Convolutional Neural Networks
5.3 Network In Network
5.4 Region-based Convolutional Neural Networks
5.4.1 Fast R-CNN
5.4.2 Faster R-CNN
5.4.3 Mask R-CNN
5.4.4 Multi-Expert R-CNN
5.5 Deep Residual Networks
5.5.1 ResNet in ResNet
5.5.2 ResNeXt
5.6 Capsule Networks
5.7 Recurrent Neural Networks
5.7.1 RNN-EM
5.7.2 GF-RNN
5.7.3 CRF-RNN
5.7.4 Quasi-RNN
5.8 Memory Networks
5.8.1 Dynamic Memory Networks
5.9 Enhanced Neural Networks
5.9.1 Neural Turing Machines
5.9.2 Neural GPUs
5.9.3 Neural Random Access Machines
5.9.4 Neural Programmers
5.9.5 Neural Programmer-Interpreter
5.10 Long Short-Term Memory Networks
5.10.1 Batch-Normalized LSTM
5.10.2 Pixel RNN
5.10.3 Bidirectional LSTM
5.10.4 Variational Bi-LSTM
5.11 Google Neural Machine Translation
5.12 Fader Networks
5.13 Hyper Networks
5.14 Highway Networks
5.14.1 Recurrent Highway Networks
5.15 Highway LSTM RNN
5.16 Long-Term Recurrent CNN
5.17 Deep Neural SVM
5.18 Convolutional Residual Memory Networks
Moniz and Pal (2016) proposed Convolutional Residual Memory Networks, which incorporate a memory mechanism into convolutional neural networks (CNN): they augment convolutional residual networks with a long short-term memory mechanism.
5.19 Fractal Networks
5.20 WaveNet
5.21 Pointer Networks
6 Deep Generative Models
6.1 Boltzmann Machines
6.2 Restricted Boltzmann Machines
Restricted Boltzmann Machines (RBM) are a special type of Markov random field with one layer of stochastic hidden units (i.e., latent variables) and one layer of observable variables.
Hinton and Salakhutdinov (2011) proposed a deep generative model for document processing using restricted Boltzmann machines (RBM).
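To make the idea concrete, below is a minimal RBM sketch in Python (PyTorch) trained with one step of contrastive divergence (CD-1); the layer sizes and learning rate are illustrative assumptions, not values from the text.

```python
import torch

class RBM:
    """Minimal binary RBM trained with one step of contrastive
    divergence (CD-1). Sizes and learning rate are illustrative."""
    def __init__(self, n_visible=784, n_hidden=128, lr=0.01):
        self.W = torch.randn(n_visible, n_hidden) * 0.01
        self.b_v = torch.zeros(n_visible)  # visible-layer bias
        self.b_h = torch.zeros(n_hidden)   # hidden-layer bias
        self.lr = lr

    def sample_h(self, v):
        p = torch.sigmoid(v @ self.W + self.b_h)
        return p, torch.bernoulli(p)

    def sample_v(self, h):
        p = torch.sigmoid(h @ self.W.t() + self.b_v)
        return p, torch.bernoulli(p)

    def cd1_step(self, v0):
        # v0: (batch, n_visible) binary data.
        ph0, h0 = self.sample_h(v0)   # positive phase
        _, v1 = self.sample_v(h0)     # one Gibbs step: reconstruction
        ph1, _ = self.sample_h(v1)    # negative phase
        # Approximate gradient: <v h>_data - <v h>_model.
        n = v0.shape[0]
        self.W += self.lr * (v0.t() @ ph0 - v1.t() @ ph1) / n
        self.b_v += self.lr * (v0 - v1).mean(0)
        self.b_h += self.lr * (ph0 - ph1).mean(0)
```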
6.3 Deep Belief Networks
Deep Belief Networks (DBN) are generative models with multiple layers of latent binary or real-valued variables.
Ranzato et al. (2011) established a deep generative model for image recognition using Deep Belief Networks (DBN).
6.4 Deep Lambertian Networks
Tang et al. (2012) proposed Deep Lambertian Networks (DLN), a multi-layer generative model where the latent variables are reflectance, surface normals, and light sources. DLN is a combination of Lambertian reflectance with Gaussian restricted Boltzmann machines and deep belief networks.
6.5 Generative Adversarial Networks
Goodfellow et al. (2014) proposed Generative Adversarial Networks (GAN), which estimate generative models through an adversarial process. The GAN architecture pits a generative model against an adversary: a discriminative model that learns to distinguish samples from the data distribution from generated ones. Mao et al. (2016) and Kim et al. (2017) proposed further improvements to GAN.
Salimans et al. (2016) proposed several methods for training GANs.
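As a concrete illustration of the adversarial game, here is a minimal GAN training-loop sketch in PyTorch on a toy 1-D Gaussian; the network sizes, learning rates, and data distribution are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy generator and discriminator; all sizes are illustrative.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0  # "real" data from N(2, 0.5^2)
    fake = G(torch.randn(64, 8))           # generator maps noise to samples

    # Discriminator update: real -> 1, fake -> 0 (fake is detached
    # so only D's parameters receive gradients here).
    d_loss = bce(D(real), torch.ones(64, 1)) \
           + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: try to make D output 1 on generated samples.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```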
6.5.1 Laplacian Generative Adversarial Networks
Denton et al. (2015) proposed a deep generative model (DGM) called Laplacian Generative Adversarial Networks (LAPGAN), which applies the generative adversarial network (GAN) approach within a Laplacian pyramid framework, using a cascade of convolutional networks to generate images in a coarse-to-fine fashion.
6.6 Recurrent Support Vector Machines
Shi et al. (2016a) proposed Recurrent Support Vector Machines (RSVM), which use a recurrent neural network (RNN) to extract features from the input sequence and a standard support vector machine (SVM) for the sequence-level discrimination objective.
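A rough two-stage stand-in for this idea is sketched below: an RNN encodes each sequence into a fixed-length feature vector, which a standard SVM then classifies. This is a simplification under stated assumptions (the encoder is an untrained GRU and all dimensions are arbitrary); the original RSVM couples the RNN features with a sequence-level max-margin objective rather than training the two stages separately.

```python
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC

# GRU encoder: maps a (N, T, 10) batch of sequences to (N, 32) vectors.
rnn = nn.GRU(input_size=10, hidden_size=32, batch_first=True)

def encode(batch):
    _, h = rnn(batch)                   # h: (1, N, 32), final hidden state
    return h.squeeze(0).detach().numpy()

X = torch.randn(100, 20, 10)            # 100 toy sequences of length 20
y = torch.randint(0, 2, (100,)).numpy() # binary sequence-level labels
svm = LinearSVC().fit(encode(X), y)     # standard linear SVM on RNN features
pred = svm.predict(encode(torch.randn(5, 20, 10)))
```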
7 Training and Optimization Techniques
In this section, we will briefly outline some key techniques used for regularization and optimization of deep neural networks (DNN).
7.1 Dropout
(There are many extended methods, such as DropConnect, etc.)
Srivastava et al. (2014) proposed Dropout to prevent overfitting in neural networks. Dropout is a model-averaging regularization method for neural networks that adds noise to the hidden units. During training, it randomly drops units and their connections from the network. Dropout can be applied to graphical models such as RBMs (Srivastava et al., 2014) and to any type of neural network. A recent improvement on Dropout is Fraternal Dropout, used for recurrent neural networks (RNN).
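A minimal sketch of the usual "inverted dropout" formulation, in which surviving activations are rescaled by 1/(1-p) during training so that inference requires no change:

```python
import torch

def dropout(x, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during
    training and rescale the survivors; identity at test time."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()
    return x * mask / (1.0 - p)

h = torch.randn(4, 8)
print(dropout(h, p=0.5))           # training: roughly half the units zeroed
print(dropout(h, training=False))  # inference: unchanged
```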
7.2 Maxout
Goodfellow et al. (2013) proposed Maxout, a new activation function designed to be used with Dropout. A maxout unit outputs the maximum of a set of inputs, which benefits Dropout's model averaging.
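A minimal maxout-layer sketch: each output unit computes k linear functions of its input and emits their maximum (the sizes and k below are illustrative):

```python
import torch
import torch.nn as nn

class Maxout(nn.Module):
    """Maxout layer: k linear pieces per output unit, take the max."""
    def __init__(self, d_in, d_out, k=4):
        super().__init__()
        self.k, self.d_out = k, d_out
        self.linear = nn.Linear(d_in, d_out * k)

    def forward(self, x):
        z = self.linear(x)                            # (N, d_out * k)
        z = z.view(*z.shape[:-1], self.d_out, self.k)
        return z.max(dim=-1).values                   # max over the k pieces

out = Maxout(16, 8)(torch.randn(2, 16))               # -> shape (2, 8)
```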
7.3 Zoneout
Krueger et al. (2016) proposed Zoneout, a regularization method for recurrent neural networks (RNN). Like Dropout, Zoneout injects random noise during training, but instead of dropping hidden units it preserves their previous values.
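A sketch of the zoneout update for one time step, assuming the common formulation in which each unit keeps its previous value with probability p during training, while the expected (interpolated) update is used at test time:

```python
import torch

def zoneout(h_prev, h_new, p=0.15, training=True):
    """Zoneout: each hidden unit keeps its previous value with
    probability p (rather than being zeroed, as in dropout)."""
    if not training:
        # Test time: deterministic expected update.
        return p * h_prev + (1 - p) * h_new
    keep = torch.bernoulli(torch.full_like(h_prev, p))
    return keep * h_prev + (1 - keep) * h_new
```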
7.4 Deep Residual Learning
He et al. (2015) proposed a deep residual learning framework, known as ResNet, which makes very deep networks easier to train and achieves low training error.
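The core idea is the identity shortcut: a block learns a residual function F(x) and outputs F(x) + x, which eases optimization of very deep stacks. A minimal basic-block sketch in PyTorch (the channel count is illustrative):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = F(x) + x, where F is a small
    stack of conv layers with batch normalization."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut carries x through

y = ResidualBlock()(torch.randn(1, 64, 32, 32))
```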
7.5 Batch Normalization
(including various variants of BN, etc.)
Ioffe and Szegedy (2015) proposed Batch Normalization, a method to accelerate deep neural network training by reducing internal covariate shift. Ioffe (2017) proposed Batch Renormalization, extending the earlier method.
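A sketch of training-mode batch normalization for a fully connected layer: each feature is normalized by its mini-batch mean and variance, then scaled and shifted by learned parameters; at inference, running averages replace the batch statistics (omitted here for brevity):

```python
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over a (N, D) mini-batch:
    normalize each feature by the batch statistics, then apply the
    learned scale (gamma) and shift (beta)."""
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta

x = torch.randn(32, 10)
out = batch_norm(x, gamma=torch.ones(10), beta=torch.zeros(10))
```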
7.6 Distillation
Hinton et al. (2015) proposed a method, distillation, for transferring knowledge from a large, highly regularized model or ensemble of models (i.e., neural networks) into a smaller, compressed model.
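This is commonly implemented as a temperature-softened loss that blends the teacher's soft targets with the ground-truth labels; the sketch below assumes that standard formulation, with illustrative values for the temperature T and mixing weight alpha:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T=4.0, alpha=0.7):
    """Knowledge-distillation loss: a weighted sum of (a) KL divergence
    between temperature-softened teacher and student distributions and
    (b) ordinary cross-entropy on the true labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients, as in Hinton et al. (2015)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```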
7.7 Layer Normalization
Ba et al. (2016) proposed Layer Normalization to accelerate the training of deep neural networks, particularly RNNs, addressing the limitations of batch normalization.
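A minimal sketch: unlike batch normalization, the statistics are computed per example across the feature dimension, so the operation is independent of batch size and can be applied at each RNN time step:

```python
import torch

def layer_norm(x, gamma, beta, eps=1e-5):
    """Layer normalization: per-example statistics over the feature
    dimension, so the result does not depend on the batch."""
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, unbiased=False, keepdim=True)
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta

h = torch.randn(3, 16)  # e.g., one RNN hidden state per row
out = layer_norm(h, torch.ones(16), torch.zeros(16))
```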
8 Deep Learning Frameworks
There are numerous open-source libraries and frameworks available for deep learning. Most of them are built for the Python programming language, such as Theano, TensorFlow, PyTorch, PyBrain, Caffe, Blocks and Fuel, Honk, ChainerCV, PyLearn2, and Chainer; Torch is Lua-based, and cuDNN is a lower-level CUDA library that many of these frameworks build on.
9 Applications of Deep Learning
In this section, we will briefly discuss some outstanding recent applications in deep learning. Since the inception of deep learning (DL), DL methods have been widely applied in various fields in the form of supervised, unsupervised, semi-supervised, or reinforcement learning. Starting from classification and detection tasks, DL applications are rapidly expanding into every domain.
For example:
- Image classification and recognition
- Video classification
- Sequence generation
- Defect classification
- Text, speech, image, and video processing
- Text classification
- Speech processing
- Speech recognition and understanding
- Text-to-speech generation
- Query classification
- Sentence classification
- Sentence modeling
- Vocabulary processing
- Pre-selection
- Document and sentence processing
- Generating image captions
- Photo style transfer
- Natural image manifolds
- Image colorization
- Image question answering
- Generating textures and stylized images
- Visual and text question answering
- Visual recognition and description
- Object recognition
- Document processing
- Character action synthesis and editing
- Song synthesis
- Identity recognition
- Face recognition and verification
- Video action recognition
- Human action recognition
- Action recognition
- Classification and visualization of motion capture sequences
- Handwriting generation and prediction
- Automatic machine translation
- Named entity recognition
- Mobile vision
- Conversational agents
- Genetic variant calling
- Cancer detection
- X-ray CT reconstruction
- Seizure prediction
- Hardware acceleration
- Robotics

and so on.
Deng and Yu (2014) provided a detailed list of DL applications in speech processing, information retrieval, object recognition, computer vision, multimodal, and multi-task learning.
Using deep reinforcement learning (DRL) to master games has become a hot topic. AI agents built with DNNs and DRL have defeated human world champions and grandmasters in strategy games and other games, in some cases after only a few hours of training. Well-known examples include AlphaGo and AlphaGo Zero in Go.
10 Discussion
Despite the tremendous success of deep learning in many fields, there is still a long way to go, and many areas remain to be improved. As for limitations, examples are not hard to find. For instance, Nguyen et al. showed that deep neural networks (DNN) are easily fooled when recognizing images. There are other issues as well, such as the transferability of learned features studied by Yosinski et al. Huang et al. proposed an architecture for defending against adversarial attacks on neural networks and argued that future work is needed on such defenses. Zhang et al. presented an experimental framework for understanding deep learning models, arguing that understanding deep learning requires rethinking generalization.
Marcus (2018) provided an important review of the role, limitations, and nature of deep learning (DL). He pointed out sharp limitations of DL methods: they are data-hungry, have limited capacity for transfer, cannot inherently handle hierarchical structure, struggle with open-ended inference, are insufficiently transparent, are poorly integrated with prior knowledge, and cannot distinguish causation from correlation. He also noted that DL presumes a largely stable world, works by approximation, is difficult to engineer with, and carries a risk of being overhyped. Marcus argues that DL should be reconceptualized, and that possibilities lie in unsupervised learning, symbol manipulation, and hybrid models, drawing insights from cognitive science and psychology, and taking on bolder challenges.
11 Conclusion
Although deep learning (DL) has advanced rapidly and is propelling the world forward, many aspects still deserve research. We still do not fully understand deep learning, nor how to make machines smarter, closer to or beyond human intelligence, or able to learn like humans. DL has been solving many problems and bringing the technology into every corner of life. Yet humanity still faces many grand challenges, such as famine and food crises, cancer, and other fatal diseases. We hope deep learning and artificial intelligence will be devoted even more to improving human quality of life by taking on the most difficult scientific challenges. Last but not least, may our world become a better place.
There are some omissions, but overall it is a good summary. Nice.
Source:
https://zhuanlan.zhihu.com/p/85625555

