Understanding Transformers Through Llama Model Architecture

Llama Nuts and Bolts is an open-source project on GitHub that rewrites the inference process of the Llama 3.1 8B-Instruct model (8 billion parameters) from scratch in the Go language. The author is Adil Alper DALKIRAN from Turkey. If you are interested in how LLMs (Large Language Models) and … Read more
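
What the project re-implements is essentially the token-by-token decoding loop of a decoder-only transformer. As a rough, framework-based point of reference (not the project's Go code), here is a minimal greedy-decoding sketch in Python with Hugging Face transformers; the model id and prompt are illustrative, and access to the Llama weights on the Hub is gated.

```python
# Minimal greedy-decoding sketch (illustrative only; the article's project
# re-implements this loop in Go without any ML framework).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative; gated on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

ids = tokenizer("Explain rotary position embeddings briefly.", return_tensors="pt").input_ids
for _ in range(64):                                   # generate up to 64 new tokens
    logits = model(ids).logits                        # (1, seq_len, vocab_size)
    next_id = logits[:, -1].argmax(-1, keepdim=True)  # greedy: most likely next token
    ids = torch.cat([ids, next_id], dim=-1)
    if next_id.item() == tokenizer.eos_token_id:
        break
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```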

Google Proposes New Titans Architecture Beyond Transformers

Titans: Learning to Memorize at Test Time. Ali Behrouz†, Peilin Zhong†, and Vahab Mirrokni†, Google Research. Abstract: For more than a decade, extensive research has been conducted on how to effectively utilize recurrent models and attention mechanisms. While recurrent models aim to compress data into fixed-size memories (known as hidden states), attention allows for … Read more
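
To make the contrast in the abstract concrete, here is a toy Python sketch (an illustration only, not the Titans architecture): a recurrent update keeps a fixed-size state no matter how long the sequence grows, while attention keeps every token around and mixes them per query.

```python
# Toy contrast between the two memory styles mentioned in the abstract.
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.standard_normal((10, 16))   # 10 token embeddings of width 16

# Recurrent view: compress everything seen so far into one fixed-size state.
W = rng.standard_normal((16, 16)) * 0.1
state = np.zeros(16)
for x in tokens:
    state = np.tanh(W @ state + x)       # state stays size 16 regardless of length

# Attention view: keep all tokens and mix them with similarity weights.
q = tokens[-1]                           # query from the latest token
scores = tokens @ q / np.sqrt(16)
weights = np.exp(scores - scores.max()); weights /= weights.sum()
context = weights @ tokens               # cost grows with sequence length
print(state.shape, context.shape)
```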

BERT Paper Notes

Author: Prince Changqin (NLP Algorithm Engineer). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Paper: https://arxiv.org/pdf/1810.04805.pdf Code: https://github.com/google-research/bert The core idea of BERT: masked language modeling (MaskLM) exploits bidirectional context, combined with multi-task pre-training. Abstract: BERT obtains a deep bidirectional representation of text by jointly conditioning on left and right context across all layers. Introduction: Two methods to apply pre-trained models … Read more
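
A quick way to see the MaskLM idea in action is the fill-mask pipeline with a pretrained BERT checkpoint; a minimal sketch (the sentence and top_k are illustrative):

```python
# Masked-LM demo: BERT predicts the hidden token from both-side context.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The core idea of BERT is to predict the [MASK] words.", top_k=3):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```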

BERT Implementation in PyTorch: A Comprehensive Guide

Selected from GitHub. Author: Junseong Kim. Translated by Machine Heart; contributors: Lu Xue, Zhang Qian. Recently, Google AI published an NLP paper introducing a new language representation model, BERT, which is considered the strongest pre-trained NLP model, setting new state-of-the-art records on 11 NLP tasks. Today, Machine Heart discovered a PyTorch implementation of BERT … Read more
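
For orientation, here is a simplified sketch of the masked-LM objective such an implementation trains on; it is an illustration with a toy encoder and random data, not the code from the repository discussed above.

```python
# Sketch of BERT's masked-LM training objective (toy encoder, random batch).
import torch
import torch.nn.functional as F

vocab_size, hidden, seq_len, mask_id = 30522, 256, 32, 103  # BERT-style vocab and [MASK] id
embed = torch.nn.Embedding(vocab_size, hidden)
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True), num_layers=2)
lm_head = torch.nn.Linear(hidden, vocab_size)

tokens = torch.randint(0, vocab_size, (8, seq_len))   # toy batch of token ids
mask = torch.rand(tokens.shape) < 0.15                # mask ~15% of positions
inputs = tokens.masked_fill(mask, mask_id)            # replace them with [MASK]

logits = lm_head(encoder(embed(inputs)))              # predict every position
loss = F.cross_entropy(logits[mask], tokens[mask])    # score only the masked positions
loss.backward()
print(float(loss))
```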

Qwen 1.5 Open Source! Best Practices for Magic Adaptation!

In recent months, the Tongyi Qianwen team has been working hard to explore how to build a ‘good’ model while optimizing the developer experience. Just before the Chinese New Year, the team shared the next version of the Qwen open-source series, Qwen 1.5, which open-sources six sizes of foundational and chat … Read more
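
As a minimal sketch of how one of the released chat checkpoints can be used with transformers (the 0.5B size and the prompt are chosen only to keep the example small; this is not the Tongyi Qianwen team's own example):

```python
# Minimal chat example with a small Qwen 1.5 chat checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-0.5B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Give me a one-sentence introduction to Qwen 1.5."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```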

A Powerful Python Library: Call GPT-4 with One Line of Code!

Hello everyone! Today I want to reveal an AI gem in the Python world: Hugging Face’s transformers library! This library is like having a legion of AI assistants, purpose-built for calling all kinds of top AI models. transformers is simply the Swiss Army knife of AI development! Come on, let’s explore the magical charm of the … Read more
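
A minimal sketch of the kind of one-liner the article celebrates, using a default open checkpoint behind a pipeline (the task and text are illustrative):

```python
# A single pipeline call picks a default model, downloads it, and runs inference.
from transformers import pipeline

result = pipeline("sentiment-analysis")("The transformers library makes this almost too easy!")
print(result)  # a list like [{'label': ..., 'score': ...}]
```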

Cohere’s Open Source 35B Model Surpasses Mixtral in RAG and Tool Capabilities

https://txt.cohere.com/command-r/ https://huggingface.co/CohereForAI/c4ai-command-r-v01 1. RAG performance: on multiple datasets it far exceeds the Mixtral MoE model, and when paired with Cohere’s own embeddings and reranking it significantly outperforms other open-source models. 2. Tool capabilities: slightly better than Mixtral and significantly better than GPT-3.5. 3. Multilingual capabilities: supports English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, … Read more

Introduction to Transformers in NLP

Recently, Hugging Face has had a very popular book, “Natural Language Processing with Transformers” (nlp-with-transformers), and we will be posting practical tutorials on transformers based on it, so let’s learn hands-on! Original text: https://www.oreilly.com/library/view/natural-language-processing/9781098103231/ch01.html Fair warning: this reads strongly like a translation, so please enjoy it while it’s fresh. Hello Transformers: in 2017, researchers at Google published a paper proposing … Read more

High-Speed Download of HuggingFace Models in China

Author: Apathy. Link: https://zhuanlan.zhihu.com/p/669120427 Note: this article has been tested and works; highly recommended. Users in China can use the official HuggingFace download tools huggingface-cli and hf_transfer to download models and datasets at high speed from the HuggingFace mirror site. HuggingFace-Download-Accelerator: github.com/LetheSec/HuggingFace-Download-Accelerator Quick Start: 1. Clone the project to your local machine: git clone https://github.com/LetheSec/HuggingFace-Download-Accelerator.git … Read more
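
For comparison, the same idea can be expressed with plain huggingface_hub: point downloads at a mirror endpoint and enable hf_transfer. This is a sketch under assumptions (the mirror URL and repo id are examples), not the accelerator script from the repository above:

```python
# Download via a mirror endpoint with hf_transfer enabled (assumed mirror URL).
import os

os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"   # assumed mirror endpoint; set before import
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"         # requires `pip install hf_transfer`

from huggingface_hub import snapshot_download

snapshot_download(repo_id="Qwen/Qwen1.5-0.5B-Chat", local_dir="./Qwen1.5-0.5B-Chat")
```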

Unlocking the Magic of Natural Language Processing with HuggingFace Transformers

Embark on a Journey of Natural Language Magic with Python and HuggingFace Transformers, Unlocking Infinite Text Possibilities. Hey there, Python newbies and enthusiasts! Today, we are going to explore a super powerful Python library in the field of natural language processing: HuggingFace Transformers. It’s like a treasure chest full of magical tools that helps … Read more
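
As a small taste of that treasure chest, here is a minimal sketch of another ready-made pipeline, named-entity recognition with a default model (the sentence is illustrative):

```python
# Named-entity recognition with a default pipeline model.
from transformers import pipeline

ner = pipeline("ner", aggregation_strategy="simple")
for entity in ner("Hugging Face is based in New York City."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```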