Understanding Huggingface BERT Source Code: Application Models and Training Optimization

Reprinted from PaperWeekly (©PaperWeekly original). Author: Li Luoqiu, master’s student at Zhejiang University; research interests: natural language processing and knowledge graphs. Continuing from the previous article, this post records my understanding of the code of the HuggingFace open-source Transformers project. This article is based on the … Read more
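As a companion to this teaser, here is a minimal sketch (not from the article) of loading a pretrained BERT application model with the HuggingFace transformers library and running a single forward pass; the checkpoint name bert-base-chinese and the two-label classification head are illustrative assumptions.

```python
# Minimal sketch (not from the article): load a pretrained BERT with a
# sequence-classification head and run one forward pass.
# "bert-base-chinese" and num_labels=2 are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
model.eval()

inputs = tokenizer("这是一个测试句子", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
print(logits)
```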

Has Prompt Tuning Surpassed Fine Tuning?

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP master’s and doctoral students, university faculty, and industry researchers. Its vision is to promote exchange and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from | … Read more

Training Word Vectors with Word2vec, Fasttext, Glove, Elmo, Bert, and Flair

All source code for this tutorial is available on GitHub: https://github.com/zlsdu/Word-Embedding. 1. Word2vec. (1) The gensim library: gensim provides implementations of the Word2vec CBOW and skip-gram models that can be called directly (see the full reference code). (2) TensorFlow implementation of the skip-gram model: skip-gram predicts the context words from a center word; there … Read more
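A minimal sketch of the gensim usage mentioned in the teaser, assuming gensim 4.x and a toy corpus; the hyperparameters are illustrative.

```python
# Minimal sketch (assumes gensim >= 4.0, where the argument is vector_size
# rather than the older size): training skip-gram / CBOW vectors with gensim.
from gensim.models import Word2Vec

sentences = [
    ["natural", "language", "processing"],
    ["word", "vectors", "capture", "semantics"],
    ["skipgram", "predicts", "context", "from", "a", "center", "word"],
]

# sg=1 -> skip-gram, sg=0 -> CBOW
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

print(model.wv["language"][:5])                  # first 5 dimensions of the vector
print(model.wv.most_similar("language", topn=3))  # nearest neighbors in the toy space
```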

From Word2Vec to GPT: Understanding the Family Tree of NLP Models

From the Machine Heart Analyst Network. Author: Wang Zijia; Editor: H4O. Starting from the ancestor-level word2vec, this article systematically traces the “genealogy” of GPT and the large NLP “family” that grew out of word2vec. GPT did not emerge out of nowhere; it is the result of the efforts of many people and a long … Read more

Chunk Segmentation Based on Semantics in RAG

In RAG, after the files are read, the main task is to split the data into smaller chunks and then embed those chunks to represent their semantics. The position of this step in the RAG pipeline is shown in the figure below. The most common chunking method is rule-based, using techniques such as fixed chunk sizes or overlapping … Read more
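For contrast with the semantics-based splitting the article discusses, here is a minimal sketch (not from the article) of the rule-based baseline it mentions: fixed-size chunks with overlap. The chunk and overlap sizes are illustrative.

```python
# Minimal sketch (not from the article): rule-based chunking with a fixed
# chunk size and a fixed overlap, the baseline contrasted with
# semantics-based splitting. Sizes are illustrative.
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

doc = "RAG splits documents into chunks before embedding them. " * 20
for c in chunk_text(doc, chunk_size=120, overlap=30)[:3]:
    print(repr(c))
```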

RAG Mastery Manual: Understanding the Technology Behind RAG

In a previous article, “RAG Mastery Manual: Is the Death Knell Sounding for RAG? Does Long Context in Large Models Mean Vector Retrieval Is No Longer Important?”, we explained why RAG remains indispensable for addressing the hallucination problem of large models and reviewed how vector databases can enhance RAG in practice. Today, … Read more

BERT Paper Notes

Author: Prince Changqin (NLP algorithm engineer). Notes on BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Paper: https://arxiv.org/pdf/1810.04805.pdf Code: https://github.com/google-research/bert The core idea of BERT: masked language modeling (MaskLM) exploits bidirectional context, combined with multi-task training. Abstract: BERT obtains deep bidirectional representations of text by jointly conditioning on left and right context across all layers. Introduction: two methods for applying pre-trained models … Read more
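As a companion to the MaskLM idea in these notes, here is a minimal sketch of the corruption scheme the BERT paper describes (15% of tokens are selected; of those, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged); the token ids below are illustrative.

```python
# Minimal sketch (not from the notes): BERT's masked-LM corruption scheme.
# 15% of tokens are selected; of those, 80% become [MASK], 10% a random
# token, 10% stay unchanged. Token ids are illustrative.
import random

def mask_tokens(token_ids, mask_id, vocab_size, mlm_prob=0.15):
    """Return corrupted input ids and labels (-100 marks unselected positions)."""
    input_ids, labels = list(token_ids), []
    for i, tok in enumerate(token_ids):
        if random.random() < mlm_prob:
            labels.append(tok)                               # predict the original token
            r = random.random()
            if r < 0.8:
                input_ids[i] = mask_id                       # 80%: replace with [MASK]
            elif r < 0.9:
                input_ids[i] = random.randrange(vocab_size)  # 10%: random token
            # else 10%: keep the original token
        else:
            labels.append(-100)                              # ignored by the loss
    return input_ids, labels

ids, lbls = mask_tokens([101, 2023, 2003, 1037, 3231, 102], mask_id=103, vocab_size=30522)
print(ids, lbls)
```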

Understanding Google’s Powerful NLP Model BERT

Written by AI Technology Review; report from Leiphone (leiphone-sz). Leiphone AI Technology Review’s note: this article is an interpretation of Google’s paper, written for AI Technology Review by Pan Shengfeng of Zhuiyi Technology. Recently, Google researchers achieved state-of-the-art results on 11 NLP tasks with the … Read more

Reviewing Progress and Insights on BERT Models

Reprinted with authorization from Microsoft Research AI Headlines. Since BERT was published on arXiv, it has achieved great success and attention, opening the Pandora’s box of the two-stage (pre-train, then fine-tune) paradigm in NLP. A large number of BERT-like pre-trained models have since emerged, including XLNet, a generalized autoregressive model that incorporates BERT’s bidirectional context information, as well … Read more

Training CT-BERT on COVID-19 Data from Twitter

Reposted with authorization by Big Data Digest from Data Party THU. Author: Chen Zhiyan. Twitter has always been an important source of news, and during the COVID-19 pandemic the public has used it to voice their anxieties. However, manually classifying, filtering, and summarizing the massive amount of COVID-19 information on Twitter is nearly impossible. This … Read more
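A minimal sketch of classifying a tweet with CT-BERT, assuming the checkpoint published on the HuggingFace hub as digitalepidemiologylab/covid-twitter-bert-v2; the three-label head is an illustrative assumption and would need fine-tuning on labeled tweets before its predictions are meaningful.

```python
# Minimal sketch (assumption: the CT-BERT checkpoint on the HuggingFace hub,
# "digitalepidemiologylab/covid-twitter-bert-v2"): scoring one COVID-19 tweet
# with a sequence-classification head. The head is randomly initialized here
# and would need fine-tuning on labeled tweets.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "digitalepidemiologylab/covid-twitter-bert-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=3)
model.eval()

inputs = tokenizer("Vaccines are now available at my local pharmacy!",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)  # shape: (1, num_labels)
print(probs)
```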