The New Version You Haven’t Seen: Unveiling the Mathematical Principles of Transformers

The MLNLP community is a well-known machine learning and natural language processing community both in China and abroad, covering NLP graduate students, university professors, and corporate researchers. Its vision is to promote communication and progress between academia and industry in natural language processing and machine learning, and especially to help beginners improve. … Read more

Comprehensive Tutorial: Visualizing Transformer

1. Introduction: This article is the second part of the visual AI algorithm tutorial series, and today's protagonist is the Transformer. The Transformer can do many interesting and meaningful things. For example, I previously wrote about "What is … Read more

Who Will Replace Transformer?

The MLNLP community is a well-known machine learning and natural language processing community both in China and abroad, covering graduate students, faculty, and researchers in NLP. Its vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from | AI Technology Review … Read more

Nine Optimizations for Enhancing Transformer Efficiency

The Transformer has become a mainstream model in artificial intelligence, with a wide range of applications. However, the attention mechanism in Transformers is computationally expensive, and its cost grows quadratically with sequence length. To address this issue, numerous modifications to the Transformer have emerged … Read more
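To make the quadratic cost concrete, here is a minimal NumPy sketch (hypothetical illustration, not code from the article above): naive scaled dot-product attention materializes an n × n score matrix with one entry per token pair, so doubling the sequence length quadruples that matrix.

```python
import numpy as np

def attention_with_cost(n, d, seed=0):
    """Naive scaled dot-product attention over n tokens of dimension d.
    Returns the output and the number of entries in the (n, n) score
    matrix, which is what makes the cost quadratic in sequence length."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((n, d))
    K = rng.standard_normal((n, d))
    V = rng.standard_normal((n, d))
    scores = Q @ K.T / np.sqrt(d)                    # shape (n, n)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, scores.size                  # score entries = n**2

_, cost_512 = attention_with_cost(512, 64)
_, cost_1024 = attention_with_cost(1024, 64)
print(cost_512, cost_1024)  # 262144 1048576: doubling n quadruples the cost
```

Many of the efficiency-oriented variants the article surveys attack exactly this score matrix, e.g. by sparsifying it or approximating the softmax.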

Detailed Explanation of Transformer Structure and Applications

Source | Zhihu; Address | https://zhuanlan.zhihu.com/p/69290203; Author | Ph0en1x; Editor | the Machine Learning Algorithms and Natural Language Processing WeChat public account. This article is shared for academic purposes only; if there is any infringement, please contact us to delete it. This … Read more

Unlocking CNN and Transformer Integration

Reprinted from: Machine Heart. Due to their complex attention mechanisms and model designs, most existing vision Transformers … Read more

Understanding Transformer Principles and Implementation in 10 Minutes

This article is adapted from | Deep Learning This Little Thing. Models based on the Transformer from the paper "Attention Is All You Need" (such as BERT) have achieved revolutionary results on various natural language processing tasks … Read more

In-Depth Analysis of the Connections Between Transformer, RNN, and Mamba!

Source: Algorithm Advancement. This article is about 4,000 words long and is recommended as an 8-minute read. It explores in depth the potential connections between the Transformer, Recurrent Neural Networks (RNNs), and State Space Models (SSMs). By uncovering links between these seemingly unrelated Large Language Model (LLM) architectures, we may open up new avenues for … Read more

Understanding Transformers and Federated Learning

The Transformer, an attention-based encoder-decoder architecture, has not only revolutionized the field of Natural Language Processing (NLP) but has also made groundbreaking contributions in Computer Vision (CV). Compared with Convolutional Neural Networks (CNNs), Vision Transformers (ViT) leverage strong modeling capability to achieve outstanding performance on benchmarks such as ImageNet, COCO, … Read more

Complete Interpretation of Transformer Code

Author: An Sheng & Yan Yongqiang, Datawhale members. This article is approximately 10,000 words long and interprets and practices the Transformer module by module; it is recommended to save it for later reading. In 2017, Google proposed a model called the Transformer in a paper titled "Attention Is All You Need," which is built on the attention (self-attention) mechanism to … Read more
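The self-attention structure that teaser refers to can be sketched in a few lines of NumPy. This is a hedged illustration of a single attention head in the spirit of "Attention Is All You Need" (the projection matrices and shapes here are illustrative, not taken from the article's code):

```python
import numpy as np

def self_attention_head(x, Wq, Wk, Wv):
    """One self-attention head: every position in the sequence attends
    to every other position in the same sequence.
    x: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_k) projections."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])        # scaled dot products
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax over keys
    return w @ v                                   # weighted sum of values

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))                   # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.standard_normal((16, 8)) for _ in range(3))
out = self_attention_head(x, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

The full Transformer stacks several such heads per layer (multi-head attention), followed by feed-forward sublayers with residual connections and layer normalization, which is what the article then walks through module by module.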