When Bert Meets Keras: The Simplest Way to Use Bert

Author: Su Jianlin | Research Direction: NLP, Neural Networks | Personal Homepage: kexue.fm. Bert is something that probably doesn't need much introduction. Although I'm not a big fan of Bert, I must say it has indeed caused quite a stir in the NLP community. Nowadays, whether in Chinese or English, there is a plethora of popular science … Read more
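
The excerpt stops before any code, so here is a minimal sketch of the general pattern of wiring a pre-trained BERT checkpoint into a Keras classifier. It assumes the keras-bert package and placeholder checkpoint paths, and is an illustration rather than the article's own code.

```python
import keras
from keras_bert import load_trained_model_from_checkpoint

# Placeholder paths to a downloaded Google BERT checkpoint (assumption, not from the article).
config_path = "chinese_L-12_H-768_A-12/bert_config.json"
checkpoint_path = "chinese_L-12_H-768_A-12/bert_model.ckpt"

# Load the pre-trained encoder as a Keras model.
bert = load_trained_model_from_checkpoint(config_path, checkpoint_path, seq_len=128)

# Take the hidden state at the [CLS] position and add a small classification head.
cls_vector = keras.layers.Lambda(lambda x: x[:, 0])(bert.output)
probs = keras.layers.Dense(2, activation="softmax")(cls_vector)

model = keras.Model(bert.inputs, probs)
model.compile(optimizer=keras.optimizers.Adam(2e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```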

K-BERT Model: Knowledge Empowerment with Knowledge Graphs

Author丨Zhou Peng · Affiliation丨Tencent · Research Direction丨Natural Language Processing, Knowledge Graph. Background: In the past two years, unsupervised pre-trained language representation models such as Google's BERT have achieved remarkable results in various NLP tasks. These models are pre-trained on large-scale open-domain corpora to obtain general language representations and then fine-tuned on specific downstream tasks to absorb domain-specific … Read more

NLP Pre-training Models in the Post-BERT Era

This article introduces several papers that improve the pretraining process of BERT, including Pre-Training with Whole Word Masking for Chinese BERT, ERNIE: Enhanced Representation through Knowledge Integration, and ERNIE 2.0: A Continual Pre-training Framework for Language Understanding. Note: These papers all implement different improvements to the masking of BERT’s pretraining phase, but do not modify … Read more
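
Since the excerpt only names the masking changes, a small illustration may help: the key idea of whole word masking is that when a word is selected for masking, all of its WordPiece sub-tokens are masked together rather than independently. A minimal sketch (mine, not from the papers):

```python
import random

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Mask whole words: a piece starting with "##" belongs to the previous word."""
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)          # continuation piece joins the current word
        else:
            words.append([i])            # start of a new word
    masked = list(tokens)
    for piece_indices in words:
        if random.random() < mask_prob:
            for i in piece_indices:      # mask every piece of the chosen word together
                masked[i] = mask_token
    return masked

print(whole_word_mask(["the", "phil", "##har", "##monic", "played"], mask_prob=0.5))
```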

Summary of BERT-Related Models

©PaperWeekly Original · Author | Xiong Zhiwei · School | Tsinghua University · Research Direction | Natural Language Processing. Since BERT was proposed in 2018, it has achieved significant success and attracted wide attention. Building on it, academia has proposed various related models that improve BERT, and this article attempts to summarize and organize them. MT-DNN: MT-DNN (Multi-Task DNN) was proposed by Microsoft … Read more

Understanding Huggingface BERT Source Code: Application Models and Training Optimization

Reprinted from | PaperWeekly. ©PaperWeekly Original · Author | Li Luoqiu · School | Zhejiang University Master's Student · Research Direction | Natural Language Processing, Knowledge Graph. Continuing from the previous article, this post records my understanding of the code of the HuggingFace open-source Transformers project. This article is based on the … Read more
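
As a pointer for readers who have not used the library, here is a minimal usage sketch (not from the article) of one of the application models the walkthrough covers, assuming the transformers package with a PyTorch backend:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)

# Passing labels makes the model also return a classification loss.
inputs = tokenizer("这部电影很好看", return_tensors="pt")
labels = torch.tensor([1])
outputs = model(**inputs, labels=labels)
print(outputs.loss, outputs.logits)
```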

Has Prompt Tuning Surpassed Fine Tuning?

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, covering NLP graduate and doctoral students, university teachers, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from | … Read more

Training Word Vectors with Word2vec, Fasttext, Glove, Elmo, Bert, and Flair

For all source code in this tutorial, please visit GitHub: https://github.com/zlsdu/Word-Embedding. 1. Word2vec. 1.1 Gensim Library: the gensim library provides implementations of the Word2vec CBOW and skip-gram models, which can be called directly (full reference code in the repository). 1.2 TensorFlow Implementation of the Skip-gram Model: the skip-gram model predicts context words based on a center word; there … Read more
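
For the gensim route the tutorial mentions, a minimal call looks like the sketch below (parameter names follow gensim 4.x; sg=1 selects skip-gram, sg=0 selects CBOW; the toy sentences are mine):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [["I", "like", "natural", "language", "processing"],
             ["word2vec", "learns", "word", "vectors", "from", "text"]]

# Train a skip-gram Word2vec model directly through gensim.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

print(model.wv["word2vec"][:5])                   # first 5 dimensions of a learned vector
print(model.wv.most_similar("word2vec", topn=3))  # nearest neighbours in the vector space
```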

From Word2Vec to GPT: Understanding the Family Tree of NLP Models

Machine Heart Analyst Network · Author: Wang Zijia · Editor: H4O. This article starts from the ancestor-level word2vec and systematically traces the "genealogy" of GPT and the large NLP "family" led by word2vec. GPT did not emerge out of nowhere; it is the result of the efforts of many people and a long … Read more

Chunk Segmentation Based on Semantics in RAG

In RAG, after the files are read in, the main task is to split the data into smaller chunks and then embed them so that their semantics can be represented. Where this step sits in the RAG pipeline is shown in the figure below. The most common chunking method is rule-based, using techniques such as fixed chunk sizes or overlapping … Read more
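
As a concrete reference for the rule-based baseline the excerpt mentions, here is a minimal sketch of fixed-size chunking with overlap (the sizes are arbitrary illustrative values, not from the article):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size character chunks with overlapping boundaries."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "RAG splits documents into chunks before embedding them. " * 40
print(len(chunk_text(doc)), "chunks")
```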

RAG Mastery Manual: Understanding the Technology Behind RAG

In a previous article, RAG Mastery Manual: Is the Death Knell Sounding for RAG? Does Long Context in Large Models Mean Vector Retrieval Is No Longer Important?, we discussed why RAG remains indispensable for addressing the hallucination problem of large models, and reviewed how vector databases can enhance RAG's practical effectiveness. Today, … Read more