Understanding the Working Principle of GPT’s Transformer Technology
Introduction

The Transformer was proposed in the paper "Attention is All You Need" and is now the reference model recommended for Google Cloud TPUs. By introducing self-attention mechanisms and a positional encoding layer, it effectively captures long-distance dependencies in input sequences and performs well on long inputs. Additionally, because the Transformer processes all positions of a sequence in parallel rather than step by step, it can be trained substantially faster than recurrent models.
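To make these two ideas concrete, here is a minimal single-head sketch of sinusoidal positional encoding and scaled dot-product self-attention, the core operations from the paper. The toy dimensions, variable names, and the simplification to one head without learned projections are illustrative assumptions, not the full multi-head architecture.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding as defined in "Attention is All You Need"."""
    positions = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                      # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                        # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                   # even dims: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])                   # odd dims: cosine
    return pe

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over keys
    return weights @ V                                      # weighted sum of values

# Toy example: 5 tokens, model dimension 8.
seq_len, d_model = 5, 8
x = np.random.randn(seq_len, d_model)                       # token embeddings
x = x + sinusoidal_positional_encoding(seq_len, d_model)    # inject order information
out = scaled_dot_product_attention(x, x, x)                 # self-attention: Q = K = V
print(out.shape)                                            # (5, 8)
```

Note how self-attention lets every token attend directly to every other token in one step, which is why long-distance dependencies do not have to survive many recurrent updates; the positional encoding restores the order information that this position-agnostic operation would otherwise discard.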