Editor: Yi Zhen
https://www.zhihu.com/question/349474623
This article is for academic sharing only; if there is any infringement, it will be deleted.
How to Incorporate Attention Mechanism in NLP?
Author: Yi Zhen
Link: https://www.zhihu.com/question/349474623/answer/852084282
Hello, reading code is the best way to understand concepts. Here is a simple implementation of dot-product attention (dot_attention) in PyTorch.
import torch
import torch.nn.functional as F

def dot_attention(seq, cond, lens):
    """Dot-product attention over a padded batch of sequences.

    :param seq:  (b_s, m_s, h_s) sequence representations
    :param cond: (b_s, h_s) condition (query) vector
    :param lens: [len_1, len_2, ...] true lengths of each sequence, used to mask padded positions
    :return: context (b_s, h_s), scores (b_s, m_s)
    """
    # Dot product between cond and every time step of seq -> (b_s, m_s)
    scores = cond.unsqueeze(1).expand_as(seq).mul(seq).sum(2)
    # Optional: apply dropout to seq here
    # Mask positions beyond each true length so padding gets zero attention weight
    max_len = max(lens)
    for i, l in enumerate(lens):
        if l < max_len:
            scores[i, l:] = float('-inf')
    scores = F.softmax(scores, dim=1)
    # Attention-weighted sum over the time dimension -> (b_s, h_s)
    context = scores.unsqueeze(2).expand_as(seq).mul(seq).sum(1)
    return context, scores  # context (b_s, h_s), scores (b_s, m_s)
The returned context is the attention-weighted context vector and scores contains the attention weights; the lens argument is used to mask the padded positions so that padding does not affect the result.
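For reference, here is a minimal usage sketch; the tensor shapes and lengths are made up for illustration (batch of 2 sequences, max length 4, hidden size 8, where the second sequence has only 2 real tokens).

import torch

# Hypothetical toy inputs: batch size 2, max sequence length 4, hidden size 8.
seq = torch.randn(2, 4, 8)    # (b_s, m_s, h_s)
cond = torch.randn(2, 8)      # (b_s, h_s)
lens = [4, 2]                 # the second sequence has two padded positions

context, scores = dot_attention(seq, cond, lens)
print(context.shape)          # torch.Size([2, 8])
print(scores.shape)           # torch.Size([2, 4])
print(scores[1])              # the last two weights are 0 because of the mask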
I hope this helps you understand better!