A Simple Explanation of Transformer to BERT Models

In the past two years, the BERT model has become extremely popular. Most people have heard of BERT but do not understand what it actually is. In short, the emergence of BERT completely changed the relationship between pre-training word vectors and downstream NLP tasks, proposing the idea of training word vectors at … Read more
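Where the excerpt contrasts static pre-trained word vectors with BERT's approach, a minimal sketch may help. It assumes the Hugging Face transformers package, which the excerpt itself does not mention; the model name and sentence are illustrative only:

```python
# Illustrative sketch (assumes Hugging Face `transformers`; not from the article).
# Instead of looking up one fixed vector per word, BERT produces
# context-dependent vectors for every token in the sentence.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One vector per token, conditioned on the whole sentence: "bank" here
# gets a different vector than it would in "the river bank".
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```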

Hardcore Introduction to NLP – Seq2Seq and Attention Mechanism

From: Number Theory Legacy. Prerequisite knowledge for this article: recurrent neural networks (RNN), word embeddings, and gated units (vanilla RNN/GRU/LSTM). 1 Seq2Seq: Seq2Seq is short for sequence-to-sequence. The first component is called the encoder, which is used to receive the source … Read more
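To make the encoder half of Seq2Seq concrete, here is a rough PyTorch sketch; the article's own code is not shown here, and the vocabulary size and dimensions are made up for illustration:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Minimal Seq2Seq encoder sketch: embed source tokens, run a GRU.

    Hypothetical illustration; names and sizes are not from the article.
    """
    def __init__(self, vocab_size: int, emb_dim: int = 128, hid_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src_ids: torch.Tensor):
        # src_ids: (batch, src_len) integer token ids of the source sequence
        outputs, hidden = self.rnn(self.embed(src_ids))
        # outputs: per-step hidden states (what attention later attends over)
        # hidden:  final state, handed to the decoder as the context
        return outputs, hidden

enc = Encoder(vocab_size=1000)
outs, h = enc(torch.randint(0, 1000, (2, 7)))  # batch of 2, length-7 sources
print(outs.shape, h.shape)  # torch.Size([2, 7, 256]) torch.Size([1, 2, 256])
```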

Understanding Attention Mechanism in Language Translation

Author: Tianyu Su · Zhihu column: Machines Don’t Learn · Address: https://zhuanlan.zhihu.com/p/27769286. In the previous column, we implemented a basic version of the Seq2Seq model. The model sorts letters: given an input sequence of letters, it returns the sorted sequence. Through that implementation, we gained an understanding of the Seq2Seq model, which mainly … Read more
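To make the letter-sorting task concrete, here is a toy illustration of the input/output format (my own sketch, not the column's actual code):

```python
import random
import string

def make_pair(max_len: int = 8):
    """Build one (source, target) example for the letter-sorting task:
    the target is simply the source sequence in sorted order."""
    n = random.randint(2, max_len)
    src = random.choices(string.ascii_lowercase, k=n)
    return src, sorted(src)

src, tgt = make_pair()
print(src, "->", tgt)  # e.g. ['d', 'b', 'c', 'a'] -> ['a', 'b', 'c', 'd']
```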

Understanding Attention: Principles, Advantages, and Types

From: Zhihu · Author: Zhao Qiang · Editor: Machine Learning Algorithms and Natural Language Processing · Address: https://zhuanlan.zhihu.com/p/91839581. Attention is being … Read more

Overview of Attention Mechanisms: Principles, Variants, and Recent Research

Source: Zhihu · Author: Li Xinchun · Address: https://zhuanlan.zhihu.com/p/106662375. The Attention mechanism is a very important and effective technique in deep learning. This article will briefly … Read more
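As background for the overview, the most common modern variant is scaled dot-product attention. A minimal, self-contained sketch follows (my illustration, not code from the article; tensor sizes are arbitrary):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """q: (batch, tgt_len, d); k, v: (batch, src_len, d).
    Returns the attended values and the attention weights."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # (batch, tgt_len, src_len)
    weights = F.softmax(scores, dim=-1)           # each query's distribution over keys
    return weights @ v, weights

q = torch.randn(1, 3, 16)   # 3 queries
kv = torch.randn(1, 5, 16)  # 5 source positions serve as both keys and values
out, w = scaled_dot_product_attention(q, kv, kv)
print(out.shape, w.shape)   # torch.Size([1, 3, 16]) torch.Size([1, 3, 5])
```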

Understanding Attention Mechanisms in AI

Author: Electric Light Phantom Alchemy @ Zhihu · Source: https://zhuanlan.zhihu.com/p/362366192 · Editor: Machine Learning Algorithms and Natural Language Processing. Attention has become a hot topic across the entire AI field: whether in machine vision or natural language processing, it is hard to do without Attention, Transformer, or BERT. Below, … Read more

Understanding Transformer and Its Variants

Author: Jiang Runyu, Harbin Institute of Technology SCIR. Introduction: In recent years, one of the most impressive achievements in NLP has undoubtedly been the pre-trained models represented by Google's BERT, which continuously set new records (both … Read more

Overlooked Details of BERT and Transformers

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, covering NLP master’s and doctoral students, university professors, and industry researchers. Its vision is to promote exchange and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from | … Read more

Transformer Advances Towards Dynamic Routing: TRAR for VQA and REC SOTA

1 Introduction: Owing to its superior ability to model global dependencies, the Transformer and its variants have become the primary architecture for many vision-and-language tasks. However, tasks such as Visual Question Answering (VQA) and Referring Expression Comprehension (REC) often require multi-modal predictions that … Read more

Understanding Attention, Transformer, and BERT Principles

Original · Author: TheHonestBob · School: Hebei University of Science and Technology · Research direction: Natural Language Processing. 1. Introduction: There are countless good articles online about this topic, all of them very detailed. The reason I am writing this blog … Read more