How to Incorporate Attention Mechanism in NLP?

Editor: Yi Zhen

https://www.zhihu.com/question/349474623

Author: Yi Zhen

https://www.zhihu.com/question/349474623/answer/852084282

Hello! Reading code is the best way to understand a concept, so here is a simple implementation of dot_attention in PyTorch.

import torch
import torch.nn.functional as F


def dot_attention(seq, cond, lens):
    """
    Dot-product attention over a padded batch of sequences.

    :param seq:  (b_s, m_s, h_s) sequence states
    :param cond: (b_s, h_s) query (condition) vector
    :param lens: [len_1, len_2, ...] the true length of each sequence, used to mask padding
    :return: context, scores
    """
    # Unnormalized scores: dot product between the query and every position
    scores = cond.unsqueeze(1).expand_as(seq).mul(seq).sum(2)

    # seq = dropout(seq)  # optional: dropout on the sequence states

    # Mask padded positions so they get zero weight after the softmax
    max_len = max(lens)
    for i, l in enumerate(lens):
        if l < max_len:
            scores[i, l:] = float("-inf")

    # Normalize over the sequence dimension
    scores = F.softmax(scores, dim=1)

    # Context vector: attention-weighted sum of the sequence states
    context = scores.unsqueeze(2).expand_as(seq).mul(seq).sum(1)

    return context, scores  # context (b_s, h_s)  scores (b_s, m_s)

The returned context is the context vector produced by the attention, and scores are the attention weights; the lens argument is used to mask out padded positions so that padding does not affect the result.
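As a quick sanity check, here is a minimal usage sketch; the batch size, lengths, and hidden size below are made up for illustration:

# Hypothetical shapes: batch of 2, max length 5, hidden size 8
seq = torch.randn(2, 5, 8)   # padded sequence states (b_s, m_s, h_s)
cond = torch.randn(2, 8)     # one query vector per example (b_s, h_s)
lens = [5, 3]                # the second sequence has two padded positions

context, scores = dot_attention(seq, cond, lens)
print(context.shape)   # torch.Size([2, 8])
print(scores.shape)    # torch.Size([2, 5])
print(scores[1, 3:])   # tensor([0., 0.]) -- padded positions get zero weight

Note that the Python masking loop could equivalently be written with a boolean mask and Tensor.masked_fill, which avoids the loop; the version above is just easier to read.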

I hope this helps you understand better!

Recommended Reading:

Learning notes on sentence representation

Common normalization methods: BN, LN, IN, GN

A detailed, practical explanation of the classic NLP model BiGRU + CRF
