Why Is Your Saved BERT Model So Large?

Produced by the Machine Learning Algorithms and Natural Language Processing original column. Author: Liu Cong, NLP algorithm engineer. A while ago, a friend asked me this question: the ckpt file size of the bert-base model provided by Google is … Read more
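
One quick way to investigate the question yourself (a minimal sketch, not code from the article) is to list the variables stored in a checkpoint and see how much of the total comes from optimizer slot variables; checkpoints saved during training with Adam typically also contain the adam_m/adam_v moment estimates, which can roughly triple the file size. The checkpoint path below is a hypothetical placeholder.

```python
# Minimal sketch: inspect which variables dominate a TensorFlow BERT checkpoint.
# The path is a hypothetical placeholder for a checkpoint saved after fine-tuning.
import numpy as np
import tensorflow as tf

ckpt_path = "output/model.ckpt-10000"  # hypothetical fine-tuned checkpoint

total_values, adam_values = 0, 0
for name, shape in tf.train.list_variables(ckpt_path):
    n = int(np.prod(shape))
    total_values += n
    # Checkpoints saved during training usually also hold Adam's first/second
    # moment estimates (variables named like "...adam_m" / "...adam_v").
    if "adam" in name.lower():
        adam_values += n

print(f"total values stored : {total_values:,}")
print(f"Adam slot variables : {adam_values:,}")
```

If the Adam slots account for most of the stored values, exporting the weights alone for inference shrinks the saved file accordingly.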

Stabilizing BERT Fine-tuning on Small Datasets

Author: Qiu Zhenyu (Algorithm Engineer, Huatai Securities Co., Ltd.). Zhihu column: My AI Journey. Recently, I came across a paper titled "Revisiting Few-sample BERT Fine-tuning". The paper has just been released on arXiv, and although it hasn't attracted much attention yet, I found it very … Read more

Understanding BERT and HuggingFace Transformers Fine-Tuning

This article is also published on my personal website, where the formula images display better: https://lulaoshi.info/machine-learning/attention/bert. Since the emergence of BERT (Bidirectional Encoder Representations from Transformers) [1], a new paradigm has opened up in the field of NLP. This article mainly introduces the principles of BERT and how to use the transformers … Read more
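
As a taste of that workflow, here is a minimal, hedged sketch (not the article's own code) of sequence-classification fine-tuning with the transformers library; the two toy sentences and labels are placeholders for a real dataset.

```python
# Minimal sketch of BERT fine-tuning for sequence classification with transformers.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy batch standing in for a real dataset.
texts = ["a great movie", "a boring movie"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # the model computes the classification loss
outputs.loss.backward()
optimizer.step()
print(float(outputs.loss))
```

In practice this single update step would sit inside a loop over a DataLoader for several epochs, or be handled by the Trainer API.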

Integrating Knowledge into Text Classification with KPT

Source: TsinghuaNLP, Deep Learning Natural Language Processing. This article is about 2400 words; a 5-minute read is recommended. It uses a knowledge base to expand and refine the label words, achieving better text classification results. Background: Using prompt learning for text classification is an emerging method that leverages pre-trained … Read more
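
For readers new to the idea of label words, the sketch below (a hedged illustration, not KPT itself) shows the plain prompt-learning setup that KPT builds on: a verbalizer maps each class to label words whose masked-LM scores decide the prediction, and KPT's contribution is expanding those label-word sets with a knowledge base. The template and verbalizer here are invented placeholders.

```python
# Minimal sketch of prompt-based classification with a hand-written verbalizer.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical verbalizer: each class is tied to a few label words.
verbalizer = {"sports": ["sports", "game"], "politics": ["politics", "government"]}

text = "The team won the championship last night."
prompt = f"{text} This topic is about [MASK]."
inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each class by its best-scoring label word at the [MASK] position.
scores = {
    label: max(logits[tokenizer.convert_tokens_to_ids(w)].item() for w in words)
    for label, words in verbalizer.items()
}
print(max(scores, key=scores.get))
```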

CMU Liu Pengfei: The Fourth Paradigm of NLP

Written by: Liu Pengfei. Edited by: Jia Wei. Source: AI Technology Review. In the past two years, the research paradigm of pre-training + fine-tuning has rapidly swept the entire field of NLP. It is widely regarded as a revolutionary paradigm in NLP research, with earlier paradigms including "expert systems," … Read more

Prompt, RAG, Fine-Tuning, or Training From Scratch? Choosing the Right Generative AI Approach

Source: DeepHub IMBA. This article is approximately 2600 words; a 5-minute read is recommended. It offers recommendations for choosing the right generative AI approach based on common, quantifiable criteria. Generative AI is evolving rapidly, and many people are trying to use it to solve their business problems. Generally, … Read more

In-Depth Guide to Prompt Learning and Tuning

The MLNLP community is a well-known machine learning and natural language processing community at home and abroad, serving NLP graduate students, university faculty, and industry researchers. Its vision is to promote exchange and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted … Read more

Step-by-Step Guide to Fine-Tuning QWEN2.5

Introduction: This practical guide fine-tunes the 0.5B QWEN2.5 model on the Ruozhi Bar dataset. As we all know, the Ruozhi Bar is full of absurd questions. Although these nonsensical questions may look, from a human perspective, like deliberately forced ambiguities of Chinese semantics, they actually provide high-quality training data for the model to … Read more
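
As a rough orientation before the step-by-step walkthrough, here is a hedged sketch of the kind of setup such a guide typically uses: loading the model from HuggingFace (the checkpoint name "Qwen/Qwen2.5-0.5B-Instruct" is an assumption), attaching LoRA adapters via peft, and formatting one chat example. The sample dialogue is an invented placeholder, not real Ruozhi Bar data.

```python
# Minimal sketch: prepare Qwen2.5-0.5B for parameter-efficient fine-tuning.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # assumed HuggingFace checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Attach low-rank adapters so only a small fraction of parameters is trained.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# One "absurd question" formatted with the chat template (placeholder data).
messages = [
    {"role": "user", "content": "If I eat myself, do I get bigger or disappear?"},
    {"role": "assistant", "content": "Neither; the question mixes incompatible premises."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False)
batch = tokenizer(text, return_tensors="pt")
# From here, a Trainer/SFTTrainer loop would optimize on batches like this one.
```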

The Importance of Refocusing Attention in Fine-Tuning Large Models

Author: Baifeng@Zhihu (authorized). Source: https://zhuanlan.zhihu.com/p/632301499. Editor: Jishi Platform. Jishi Guide: surpasses full fine-tuning, LoRA, VPT, and other methods while tuning only a small number of parameters! Paper link: https://arxiv.org/pdf/2305.15542 GitHub link: https://github.com/bfshi/TOAST We found that when fine-tuning large models on a downstream task, … Read more

Using GPT-4 to Generate Training Data for Fine-tuning GPT-3.5 RAG Pipeline

Source: DeepHub IMBA. This article is about 3200 words; a 6-minute read is recommended. It explores LlamaIndex's new integration for fine-tuning OpenAI's GPT-3.5 Turbo. OpenAI announced on August 22, 2023 that GPT-3.5 Turbo can now be fine-tuned, which means we can customize our own models. Subsequently, … Read more
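
For context, the LlamaIndex integration discussed in the article ultimately drives OpenAI's fine-tuning endpoint; below is a hedged sketch of that underlying call using the openai Python client, assuming a JSONL file of GPT-4-generated chat examples already exists (the filename is a placeholder).

```python
# Minimal sketch: launch a GPT-3.5 Turbo fine-tuning job on a JSONL file of
# chat-formatted examples (here assumed to be generated beforehand with GPT-4).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each line of the file is a JSON object like:
# {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("gpt4_generated_train.jsonl", "rb"),  # placeholder filename
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```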