Understanding Attention Mechanism and Transformer in NLP

Reprinted from High Energy AI. This article summarizes the Attention mechanism in natural language processing (NLP) in a Q&A format and provides an in-depth analysis of the Transformer. Table of contents: 1. Analysis of the Attention mechanism; 1. Why introduce the … Read more

Understanding Attention: Principles, Advantages, and Types

From Zhihu. Address: https://zhuanlan.zhihu.com/p/91839581. Author: Zhao Qiang. Editor: Machine Learning Algorithms and Natural Language Processing public account. This article is shared for academic purposes only; if there is any infringement, please contact us for deletion. Attention is being … Read more

Understanding Attention Mechanisms in AI

Author: Electric Light Phantom Alchemy @ Zhihu. Source: https://zhuanlan.zhihu.com/p/362366192. Editor: Machine Learning Algorithms and Natural Language Processing. Attention has become a hot topic across the entire AI field: whether in machine vision or natural language processing, it is hard to get away from Attention, the Transformer, or BERT. Below, … Read more
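The scaled dot-product attention at the heart of the articles above can be sketched in a few lines of NumPy. This is a minimal illustration of the formula softmax(QK^T / sqrt(d_k)) V, not code taken from any of the linked posts; the toy shapes and random inputs are arbitrary:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax over keys
    return weights @ V, weights                      # weighted sum of values

# Toy example: 3 queries attend over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each row of `w` sums to 1: every query distributes its attention over all keys, which is exactly the "focus on the relevant parts of the input" behavior the articles describe.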

Implementing OCR Character Recognition with Transformer

Authors: An Sheng, Yuan Mingkun, Datawhale members. In the field of CV, what else can Transformers do besides classification? This article uses a word-recognition task dataset to explain how to use Transformers to implement a … Read more

Introduction and Usage of TrOCR: Transformer-Based OCR

Author: Sovit Rath. Translator: ronghuaiyang. This article introduces the structure and usage of TrOCR, walking through the code line by line. Optical character recognition (OCR) has seen several innovations in recent years, with a tremendous impact on retail, healthcare, banking, and many other industries. Despite its long history and some state-of-the-art models, … Read more

Essential Knowledge! 5 Major Deep Generative Models!

About 5,200 words; recommended reading time: 10 minutes. This article surveys commonly used deep generative models, introducing their principles and applications in depth. With the rise of models such as Sora, diffusion models, and GPT, deep generative models have once again become the focus of attention. Deep generative models are a class of powerful machine … Read more

What Are Diffusion Models and Their Advances in Image Generation?

Perhaps the defining breakthrough in computer vision and machine learning over the past decade was the invention of GANs (Generative Adversarial Networks), a method that introduced the possibility of generating content beyond what already exists in the data, serving as … Read more
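Diffusion models, the subject of the entry above, learn to invert a fixed forward noising process. That forward process has a closed form: given a clean sample x_0, the noised sample at step t is sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. A minimal NumPy sketch, assuming the standard DDPM-style linear beta schedule (the schedule values and toy "image" here are illustrative, not from the linked article):

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t from q(x_t | x_0) in closed form."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]        # product of alphas up to step t
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

# Linear beta schedule over 1000 steps (a common DDPM choice).
betas = np.linspace(1e-4, 0.02, 1000)
rng = np.random.default_rng(0)
x0 = np.ones((4, 4))                          # a toy "image" of all ones
xt = forward_diffuse(x0, 999, betas, rng)     # near pure noise at the last step
```

At the final step the cumulative alpha_bar is tiny, so `xt` is essentially Gaussian noise; the generative model is trained to run this corruption in reverse, step by step.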

Distilling Llama3 into Hybrid Linear RNN with Mamba

This article is reprinted from Machine Heart. The key to the Transformer's tremendous success in deep learning is the attention mechanism, which allows Transformer-based models to focus on the relevant parts of the input sequence, achieving better contextual understanding. … Read more

Distilling Llama3 into Hybrid Linear RNN with Mamba

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, covering NLP master's and doctoral students, university teachers, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning at home and abroad, especially … Read more

Llama Model Utility Toolkit

Project overview: Llama is an easily accessible, open large language model (LLM) designed for developers, researchers, and businesses to build, experiment with, and responsibly scale their generative AI ideas. As part of a foundational system, it serves as a cornerstone for global innovation. Several key aspects: Open access: easy access to cutting-edge large language models, promoting … Read more