For More Content, Please Follow:
Andrew Ng: Deep Learning Knowledge Explained in 28 Images (Part 1)
Andrew Ng: Deep Learning Knowledge Explained in 28 Images (Part 2)
23-24 Basics of Recurrent Neural Networks
As shown above, sequence problems such as named entity recognition make up a significant share of real-world applications, yet traditional machine learning models like Hidden Markov Models can handle only some of these problems, and only under strong assumptions.
Recurrent Neural Networks (RNNs), however, have recently achieved significant breakthroughs in these areas. The recurrent connection on the hidden state acts as a memory: the hidden state at each time step depends on the state from previous steps. This structure allows RNNs to store, remember, and process complex signals from far in the past.
RNNs can learn features and long-term dependencies from sequential and time-series data. They consist of stacked nonlinear units, with at least one connection between units forming a directed cycle. A well-trained RNN can model any dynamical system; in practice, however, training is hampered mainly by the difficulty of learning long-term dependencies.
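To make that recurrence concrete, here is a minimal NumPy sketch of a vanilla RNN forward pass; the function name, shapes, and random weights below are illustrative rather than taken from the course.

```python
import numpy as np

def rnn_forward(x_seq, h0, Wx, Wh, b):
    """Vanilla RNN: h_t = tanh(Wx @ x_t + Wh @ h_{t-1} + b).

    x_seq has shape (T, input_dim); h0 has shape (hidden_dim,).
    The dependence of h_t on h_{t-1} is the directed cycle that
    gives the network its memory of earlier inputs.
    """
    h, states = h0, []
    for x_t in x_seq:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)

# Tiny illustrative run: 5 time steps, 3-dim inputs, 4-dim hidden state.
rng = np.random.default_rng(0)
T, D, H = 5, 3, 4
hs = rnn_forward(rng.normal(size=(T, D)), np.zeros(H),
                 0.1 * rng.normal(size=(H, D)),
                 0.1 * rng.normal(size=(H, H)),
                 np.zeros(H))
print(hs.shape)  # (5, 4)
```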
The following illustrates the applications, issues, and variants of RNNs:
RNNs are powerful on sequence problems such as language modeling, but they also suffer from severe vanishing-gradient problems. Gated RNNs such as LSTM and GRU therefore show great promise: their gating mechanisms decide how much information from previous time steps to retain or forget, forming a memory that feeds the current computation.
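As a rough illustration of the gating idea (a simplified sketch with bias terms omitted, not the exact formulation in the course slides), a single GRU step can be written as follows: the update gate z decides how much of the old state to keep, and the reset gate r decides how much of it to use when proposing a candidate state.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step (biases omitted for brevity)."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)               # update gate: keep vs. overwrite
    r = sigmoid(Wr @ x_t + Ur @ h_prev)               # reset gate: how much past to use
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate state
    return (1.0 - z) * h_prev + z * h_tilde           # interpolate old state and candidate
```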
25-26 Word Representation in NLP
Word embeddings are central to natural language processing, since every task needs some way to represent words. The image above illustrates embedding methods that map each word in the vocabulary to a 200- or 300-dimensional vector, greatly reducing the space needed to represent words compared with vocabulary-sized one-hot vectors. This representation also captures word semantics: words with similar meanings lie close together in the embedding space.
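A small sketch of what "close together in the embedding space" means in practice, measured with cosine similarity. The vocabulary, the 300-dimensional size, and the random matrix below are placeholders; with trained embeddings such as word2vec or GloVe, related words would score noticeably higher than unrelated ones.

```python
import numpy as np

# Placeholder vocabulary and a random 300-dimensional embedding matrix
# (one row per word); real embeddings would be learned or downloaded.
vocab = {"king": 0, "queen": 1, "apple": 2}
E = np.random.default_rng(0).normal(size=(len(vocab), 300))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(E[vocab["king"]], E[vocab["queen"]]))  # high for trained embeddings
print(cosine(E[vocab["king"]], E[vocab["apple"]]))  # lower for unrelated words
```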
Besides the skip-gram model mentioned earlier, other common methods for learning word embeddings are shown below:
GloVe is another common method for learning word vectors, and the learned representations can then be used for downstream tasks such as sentence classification.
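One simple way to use such word vectors for sentence classification is to average them into a fixed-size sentence feature. This is only a minimal baseline sketch: `embeddings` is assumed to be a word-to-vector dictionary loaded from, for example, a GloVe text file.

```python
import numpy as np

def sentence_vector(tokens, embeddings, dim=300):
    """Average pre-trained word vectors into a fixed-size sentence feature."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# The resulting vectors can be fed to any off-the-shelf classifier
# (logistic regression, a small feed-forward network, ...).
```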
27-28 Sequence to Sequence
The most widely used approach to sequence-to-sequence learning is the encoder-decoder framework; the images also introduce related components such as beam search decoding.
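Beam search itself is independent of the network: at each decoding step it keeps only the `beam_width` highest-scoring partial outputs instead of committing greedily to one. A hedged sketch, where `step_fn` is a stand-in for the decoder that returns (token, log-probability) candidates for a given prefix:

```python
def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Keep the `beam_width` best partial sequences by cumulative log-probability."""
    beams = [([start_token], 0.0)]                 # (sequence, score)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in step_fn(seq):         # decoder proposes next tokens
                candidates.append((seq + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == end_token else beams).append((seq, score))
        if not beams:                              # every surviving beam has ended
            break
    return max(finished + beams, key=lambda c: c[1])
```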
An encoder-decoder architecture combined with an attention mechanism can solve many natural language processing problems. The images also introduce the BLEU score and the attention mechanism, essential components of machine translation architectures and their evaluation.
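To show roughly how attention fits into such an architecture, here is a sketch of simple dot-product attention, one of several common scoring functions (Bahdanau-style attention scores alignments with a small feed-forward network instead): the decoder state is compared against every encoder state, the scores are normalized with a softmax, and the weighted sum becomes the context vector for the current output step.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Dot-product attention over the source sequence.

    encoder_states: (T, hidden), decoder_state: (hidden,).
    Returns the context vector and the attention weights.
    """
    scores = encoder_states @ decoder_state   # alignment score per source position
    weights = softmax(scores)                 # attention distribution (sums to 1)
    return weights @ encoder_states, weights
```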
That concludes the infographics for Andrew Ng's deep learning specialization. Because they pack in so much information, we could introduce only part of it here, and many topics were covered only briefly. Readers are therefore encouraged to download the infographics and work through them gradually, deepening and refining their understanding as they continue learning.
Source: Machine Heart. This article is shared for academic purposes only; copyright belongs to the original author. For any infringement concerns, please contact WeChat: 1306859767 (Eternalhui) for deletion or modification.