How BERT Tokenizes Text

Source | Zhihu · Link | https://zhuanlan.zhihu.com/p/132361501 · Author | Alan Lee · Editor | Machine Learning Algorithms and Natural Language Processing public account. Reposted with the author's authorization; further reposting is prohibited. This article was first published on my personal blog on 2019/10/16 and cannot be … Read more
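
The excerpt stops before the actual mechanics, but the topic lends itself to a quick illustration. Below is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is named in the excerpt), of how BERT's WordPiece tokenizer splits text into subword units:

```python
# A minimal sketch of BERT's WordPiece tokenization, assuming the Hugging Face
# "transformers" library is installed; this is not code taken from the article.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits rare words into subword pieces."
tokens = tokenizer.tokenize(text)
print(tokens)
# Out-of-vocabulary words are broken into WordPiece units, e.g.
# "tokenization" -> ["token", "##ization"], where "##" marks a continuation piece.

# encode() additionally adds the special [CLS] and [SEP] tokens and maps pieces to ids.
ids = tokenizer.encode(text)
print(tokenizer.convert_ids_to_tokens(ids))
```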

Summary of BERT-Related Models

©PaperWeekly Original · Author | Xiong Zhiwei · School | Tsinghua University · Research Direction | Natural Language Processing. Since BERT was proposed in 2018, it has achieved significant success and attention, and academia has proposed many related models that build on and improve it. This article attempts to summarize and organize these models. MT-DNN (Multi-Task DNN) was proposed by Microsoft … Read more

Hands-On Series with Hugging Face Transformers – 03: Analysis of the Transformers Model

In Chapter 2, we saw what is needed to fine-tune and evaluate a Transformer. Now let's take a look at how it works under the hood. In this chapter, we will explore the main components of the Transformer model and how to implement them using PyTorch. We will also provide guidance on how to do … Read more
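
As a rough preview of the kind of component the chapter builds (a minimal sketch in PyTorch, not the book's own code), here is scaled dot-product attention, the building block of the self-attention layer:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """Minimal scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    dim_k = query.size(-1)
    scores = torch.bmm(query, key.transpose(1, 2)) / math.sqrt(dim_k)
    weights = F.softmax(scores, dim=-1)
    return torch.bmm(weights, value)

# Toy usage: a batch of 1 sequence with 5 tokens and 64-dimensional projections.
q = k = v = torch.randn(1, 5, 64)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 5, 64])
```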

Building Language Applications with Hugging Face Transformers

Hugging Face is an NLP-focused chatbot startup based in New York with a large open-source community. In particular, its open-source natural language processing and pre-trained model library, Transformers, has been downloaded over a million times and has more than 24,000 stars on GitHub. Transformers provides a large number of state-of-the-art pre-trained language model … Read more
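
As a taste of how little code a language application needs with the library (a minimal sketch assuming the transformers package and its default English sentiment model, neither of which is specified in the excerpt):

```python
from transformers import pipeline

# pipeline() downloads a default pre-trained model for the task on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face Transformers makes building NLP applications much easier."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```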

Understanding Huggingface BERT Source Code: Application Models and Training Optimization

Reprinted from | PaperWeekly · ©PaperWeekly Original · Author | Li Luoqiu · School | Zhejiang University master's student · Research Direction | Natural Language Processing, Knowledge Graphs. Continuing from the previous article, I record my understanding of the code of the Hugging Face open-source Transformers project. This article is based on the … Read more
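
The "application models" the article refers to are the task-specific wrappers around the base encoder, such as BertForSequenceClassification. A minimal sketch of loading one and running a training-style forward pass, assuming the transformers library (the article's own examples are not reproduced here):

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# The classification head is a linear layer on top of the pooled [CLS] representation;
# passing labels makes the model also return a cross-entropy loss for training.
batch = tokenizer(["a great movie", "a terrible movie"], padding=True, return_tensors="pt")
outputs = model(**batch, labels=torch.tensor([1, 0]))
print(outputs.loss, outputs.logits.shape)  # scalar loss, torch.Size([2, 2])
```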

Hugging Face Official Course Launched: Free NLP Training

Machine Heart reports · Editor: Du Wei. The Hugging Face NLP course is now live, and all of it is completely free. Those in the NLP field should be very familiar with the renowned Hugging Face, a startup focused on solving a wide range of NLP problems that has brought many valuable technical contributions to the community. Last year, the … Read more

Understanding Attention Mechanisms in Depth

I have recently been planning to survey how attention is applied in deep recommendation systems, so I wrote this introductory article about attention. Since it was proposed in the 2015 ICLR paper "Neural Machine Translation by Jointly Learning to Align and Translate", attention has flourished in NLP and computer vision. What is so special … Read more

Understanding Transformers Through Llama Model Architecture

Llama Nuts and Bolts is an open-source project on GitHub that rewrites the inference process of the Llama 3.1 8B-Instruct model (8 billion parameters) from scratch in the Go language. The author is Adil Alper DALKIRAN from Turkey. If you are interested in how LLMs (Large Language Models) and … Read more

Google Proposes New Titans Architecture Beyond Transformers

Titans: Learning to Memorize at Test Time. Ali Behrouz, Peilin Zhong, and Vahab Mirrokni (Google Research). Abstract: For more than a decade, extensive research has been conducted on how to effectively utilize recurrent models and attention mechanisms. While recurrent models aim to compress data into fixed-size memories (known as hidden states), attention allows for … Read more

BERT Paper Notes

Author: Prince Changqin (NLP Algorithm Engineer). Notes on BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Paper: https://arxiv.org/pdf/1810.04805.pdf Code: https://github.com/google-research/bert The core idea of BERT: masked language modeling (MaskLM) exploits bidirectional context, combined with multi-task training. Abstract: BERT obtains deep bidirectional representations of text by jointly conditioning on context across all layers. Introduction: two approaches to applying pre-trained models … Read more
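
To make the "MaskLM with bidirectional context" idea concrete, here is a minimal sketch using the fill-mask pipeline from the Hugging Face transformers library (an assumption of this listing, not code from the paper notes):

```python
from transformers import pipeline

# BERT is pre-trained to predict masked tokens from both left and right context,
# which is what gives it deep bidirectional representations.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("The man went to the [MASK] to buy milk."):
    print(prediction["token_str"], round(prediction["score"], 3))
```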