How to Incorporate Attention Mechanism in NLP?

Editor: Yi Zhen

Author: Yi Zhen

Hello! Reading code is one of the best ways to understand a concept. Here is a simple PyTorch implementation of dot-product attention (dot_attention).

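import numpy as np
import torch.nn.functional as F

def dot_attention(self, seq, cond, lens):
    """
    :param seq:  (b_s, m_s, h_s) padded hidden states of the sequence
    :param cond: (b_s, h_s) condition (query) vector
    :param lens: [len_1, len_2, ...] the real length of each sequence, used to mask the padding
    :return: context, scores
    """
    # Dot product between cond and every time step of seq: (b_s, m_s)
    scores = cond.unsqueeze(1).expand_as(seq).mul(seq).sum(2)

    # seq = self.dropout(seq)
    max_len = max(lens)

    # Set the scores of padded positions to -inf so that softmax gives them zero weight
    for i, l in enumerate(lens):
        if l < max_len:
            scores[i, l:] = -np.inf

    # Normalize the scores over the time dimension
    scores = F.softmax(scores, dim=1)

    # Weighted sum of the hidden states: (b_s, h_s)
    context = scores.unsqueeze(2).expand_as(seq).mul(seq).sum(1)

    return context, scores  # context (b_s, h_s)  scores (b_s, m_s)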

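The returned context is the context vector, that is, the attention-weighted sum of the hidden states in seq, and scores holds the attention weights after the softmax. The input lens gives the real length of each sequence, so that the scores at padded positions can be set to -inf before the softmax and end up with zero weight.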
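To try the idea outside a model class, here is a minimal standalone sketch. It is not the author's code: the name masked_dot_attention is made up for illustration, and the Python loop over lens is swapped for a vectorized mask built with torch.arange and masked_fill, with torch.bmm doing the dot products. It assumes every entry of lens is at least 1, since a fully masked row would produce NaN after the softmax.

```python
import torch
import torch.nn.functional as F


def masked_dot_attention(seq, cond, lens):
    """Standalone sketch of dot-product attention with a padding mask.

    seq:  (b_s, m_s, h_s) padded hidden states
    cond: (b_s, h_s) query vector
    lens: real (unpadded) length of each sequence, each assumed >= 1
    """
    # Dot product of the query with every time step: (b_s, m_s)
    scores = torch.bmm(seq, cond.unsqueeze(2)).squeeze(2)

    # Boolean mask that is True at padded positions (index >= real length)
    positions = torch.arange(seq.size(1), device=seq.device).unsqueeze(0)
    lens_t = torch.as_tensor(lens, device=seq.device).unsqueeze(1)
    pad_mask = positions >= lens_t

    # A score of -inf before the softmax becomes a weight of zero after it
    scores = scores.masked_fill(pad_mask, float("-inf"))
    weights = F.softmax(scores, dim=1)

    # Weighted sum of the hidden states: (b_s, h_s)
    context = torch.bmm(weights.unsqueeze(1), seq).squeeze(1)
    return context, weights


torch.manual_seed(0)
seq = torch.randn(2, 4, 3)  # batch of 2, padded to 4 steps, hidden size 3
cond = torch.randn(2, 3)
context, weights = masked_dot_attention(seq, cond, lens=[4, 2])

print(context.shape)  # (b_s, h_s)
print(weights[1])     # the two padded positions carry zero weight
```

The second sequence has a real length of 2, so its last two weights are zero and the remaining two sum to one. The vectorized mask avoids in-place writes to scores, which may be convenient when working with autograd, but the loop in the original does the same job.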

I hope this helps you understand better!
