How NLP Beginners Should Start in the Era of Large Models

The entry point is simple and straightforward: build the essential foundations, then sprint into Transformers.

In the era of large models, traditional algorithms such as word segmentation and part-of-speech tagging have largely been replaced, so there is no need to spend much energy on them at the beginning.

Mathematics and Programming Basics

Mathematics:

Calculus, linear algebra, and probability and statistics. University-level knowledge is sufficient; if your foundation is weak, you can fill in the gaps later as needed.

Python:

The recommended language is Python, which is almost unavoidable in NLP. You don’t need to go deep at first; mastering basic syntax, data types, control structures (loops and conditional statements), and functions is enough.
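For a sense of scale, this is roughly the level of Python you need before moving on, shown in a minimal illustrative snippet (the function and sentence are made up for the example):

```python
# Basics in one place: a function, a loop, a conditional, and simple data types.
def count_long_words(sentence, min_length=4):
    """Return how many words in the sentence have at least min_length characters."""
    count = 0
    for word in sentence.split():       # loop over a list of strings
        if len(word) >= min_length:     # conditional
            count += 1
    return count

print(count_long_words("natural language processing is fun"))  # prints 3
```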

Recommended resource: Bilibili Xiaojiaoyu.

PyTorch:

One of the mainstream frameworks for deep learning.

Recommended: Bilibili Liu Er Da Ren’s “Practical Deep Learning with PyTorch”, and Wo Shi Tu Dui’s “Quick Start Guide to Deep Learning with PyTorch”.
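What you are aiming for at this stage is fluency with the standard tensor / module / optimizer workflow. Below is a minimal sketch of a PyTorch training loop on random toy data; the layer sizes and data here are arbitrary and only meant to show the shape of the loop:

```python
import torch
import torch.nn as nn

# Toy data: 100 samples with 10 features each, plus binary labels.
x = torch.randn(100, 10)
y = torch.randint(0, 2, (100,)).float()

# A small feed-forward network built from standard modules.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# The standard loop: forward pass, loss, backward pass, parameter update.
for epoch in range(10):
    logits = model(x).squeeze(1)
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```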

Sprinting into Transformers

Learn the basic architecture and principles of the Transformer model, including self-attention mechanisms, positional encoding, multi-head attention, etc.
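As a reference point while studying, the core of self-attention really is only a few lines of tensor code. Here is a minimal single-head scaled dot-product attention sketch (batching, masking, and learned projections omitted for clarity):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """q, k, v: (seq_len, d_k) tensors for a single head, no batching."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (seq_len, seq_len) similarities
    weights = torch.softmax(scores, dim=-1)            # attention weights sum to 1 per query
    return weights @ v                                  # weighted sum of the values

# Example: a sequence of 5 tokens with dimension 8.
x = torch.randn(5, 8)
out = scaled_dot_product_attention(x, x, x)  # self-attention: q, k, v all come from x
print(out.shape)  # torch.Size([5, 8])
```

Multi-head attention simply runs several such heads in parallel on learned projections of the input and concatenates the results.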

Recommended materials:

Andrew Ng’s deep learning series courses.

Stanford CS224n: Natural Language Processing with Deep Learning.

Li Mu’s “Dive into Deep Learning”.

All of these are classics; pick the ones you can follow, and complete the assignments to build a complete knowledge system.

Hugging Face Transformers: Use the Hugging Face Transformers library to load, train, and evaluate models and to complete downstream NLP tasks.
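As a taste of how little code a downstream task needs, here is a sketch using the library’s pipeline API for sentiment analysis (the default checkpoint is downloaded from the Hugging Face Hub on first use; the input sentence is just an example):

```python
from transformers import pipeline

# Load a pre-trained sentiment-analysis model and its tokenizer in one call.
classifier = pipeline("sentiment-analysis")

result = classifier("Transformers make NLP much easier to get started with.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```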

Pre-trained Large Language Models

In recent years, with the rise of GPT-4, Llama, and other models, the research, application, and development of pre-trained large models have received widespread attention. Now that companies are starting to put these models into production, there is a clear gap in the job market, yet very few candidates have comprehensive project experience.

Understanding the full knowledge system of pre-trained large models, including common pre-trained models, model structures, and the major pre-training tasks, is crucial for both research and employment. PEFT (Parameter-Efficient Fine-Tuning) is worth learning: pre-training a large language model from scratch is rarely feasible, but fine-tuning is something everyone can practice. Additionally, look into LangChain for building downstream applications.
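As a concrete example of PEFT, LoRA-style fine-tuning with the Hugging Face peft library only requires wrapping a loaded model with a config. This is a minimal sketch; GPT-2 is used here purely as a small placeholder base model, and the target module names are model-specific, so adapt them to whatever model you actually fine-tune:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint; swap in whichever open model you are fine-tuning.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA injects small trainable low-rank matrices; the base weights stay frozen.
config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # which linear layers to adapt (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```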

Project Practice

Besides school laboratory projects, open-source projects and internships are good ways to gain project experience. Participating in competitions is another: competitions usually provide datasets, a well-defined problem, and baseline code as a reference, which is very helpful for beginners.

1) Kaggle: A well-known competition community with many interesting datasets and tasks. You can download relevant datasets by participating in Kaggle machine learning competitions.

2) Tianchi: Competitions hosted by Alibaba Cloud, drawn entirely from real business scenarios. The problems and datasets from past competitions remain available on the Tianchi platform.

There are many other competitions in China, such as Hejing, Huawei Cloud, DataFountain, etc.

In the era of large models, it is quite common in practice to deploy private large models for cost and security reasons. Project practice should therefore develop not only coding skills but also engineering capabilities.

Reading Classic Papers and Accumulating Coding Experience

Reading papers is an important way to acquire knowledge and keep up with the latest developments. There are two directions: read the classic papers in a specific area, including its baselines, and explore cutting-edge solutions. For unfamiliar concepts mentioned in a paper, make a conscious effort to learn them; you can also expand your reading by following the paper’s citations and references.

Going back to fill in the basics: knowledge of traditional algorithms still matters for model interpretability and debugging. After mastering modern models like Transformers, go back and learn these fundamentals to gain a more complete understanding of the essence and application of NLP techniques.

Preparing for interviews: In addition to theoretical knowledge, project experience, and internship experience, set aside time specifically for interview preparation, especially given the current competitive environment. Work through as many LeetCode problems as you can and read write-ups of others’ interview experiences.

For AIGC algorithm engineer roles, it is worth preparing a dedicated resume; it can pay off considerably.


Wishing you success soon!
