The Origin of Attention
- Seq2seq compresses the entire input sequence into a single fixed-size hidden vector, much like a compressed file. This process is lossy and inevitably discards a great deal of information from the input sequence.
- There is also an alignment problem. For example, when translating the Chinese sentence “我爱你” into “I love you”, the input token “我” should align with “I” (making the largest contribution), yet in the seq2seq model “我” contributes equally to every output word.
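Attention addresses both problems above by replacing the single fixed context vector with a per-output-step weighted sum of encoder states. The sketch below is a minimal, self-contained illustration of dot-product attention; the encoder states and decoder state are hypothetical toy vectors invented for this example, not values from any real model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical 4-dim encoder hidden states for the tokens "我", "爱", "你"
encoder_states = np.array([
    [0.9, 0.1, 0.0, 0.0],  # "我"
    [0.0, 0.8, 0.2, 0.0],  # "爱"
    [0.0, 0.1, 0.9, 0.0],  # "你"
])

# Hypothetical decoder state at the step that generates "I"
decoder_state = np.array([1.0, 0.0, 0.1, 0.0])

# Dot-product alignment scores, normalized into attention weights
scores = encoder_states @ decoder_state
weights = softmax(scores)

# The context vector is a weighted sum of encoder states,
# recomputed at every decoding step -- not one fixed compression
context = weights @ encoder_states

print(weights)  # "我" receives the largest weight when generating "I"
```

Because the weights are recomputed for each output token, “我” can dominate while generating “I” and recede while generating “love”, which is exactly the alignment that plain seq2seq cannot express.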