In the past three to four years, large language models (LLMs) have fundamentally transformed the field of natural language processing (NLP). They form the basis of state-of-the-art systems and are ubiquitous in solving a wide range of natural language understanding and generation tasks. Alongside their unprecedented potential and capabilities, these models also bring new ethical and scalability challenges. This course aims to cover cutting-edge research topics surrounding pre-trained language models. We will discuss their technical foundations (BERT, GPT, T5, mixture-of-experts models, retrieval-based models), emerging capabilities (knowledge, reasoning, few-shot learning, in-context learning), fine-tuning and adaptation, system design, and safety and ethics. We will cover each topic and delve into important papers. Students will be expected to regularly read and present research papers and to complete a research project at the end of the course.
This is an advanced graduate course, and all students should have taken machine learning and NLP courses and be familiar with deep learning models such as transformers.
https://www.cs.princeton.edu/courses/archive/fall22/cos597G/
1 Learning Objectives
- Help you conduct cutting-edge research in natural language processing, particularly on topics related to pre-trained language models. We will discuss state-of-the-art technologies, their capabilities, and their limitations.
- Practice your research skills, including reading research papers, conducting literature reviews, delivering oral presentations, and providing constructive feedback.
- Gain practical experience through the final project, from brainstorming to implementation, empirical evaluation, and writing the final paper.
2 Course Content
- Introduction
- BERT
- T5 (encoder-decoder models)
- GPT-3 (decoder-only models)
- Prompting for few-shot learning
- Prompting as parameter-efficient fine-tuning
- In-context learning (see the sketch after this list)
- Calibration of prompting LLMs
- Reasoning
- Knowledge
- Data
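To make the prompting-related topics above concrete, here is a minimal, hypothetical sketch (not taken from the course materials) of how a few-shot in-context learning prompt is typically assembled: a handful of labeled demonstrations are concatenated in front of an unlabeled query, and a decoder-only language model's continuation is read off as the prediction, with no parameter updates. The task, demonstrations, and prompt template below are illustrative assumptions.

```python
# Illustrative sketch only (not course code): building a few-shot
# in-context learning prompt for a toy sentiment-classification task.
# The demonstrations, labels, and template are assumptions.

demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I walked out halfway through; it was that dull.", "negative"),
]

def build_few_shot_prompt(demos, query):
    """Concatenate labeled demonstrations followed by the unlabeled query.

    A decoder-only LM (e.g., a GPT-style model) would then be asked to
    continue this text, and its completion is read off as the predicted
    label; no gradient updates or fine-tuning are involved.
    """
    parts = []
    for text, label in demos:
        parts.append(f"Review: {text}\nSentiment: {label}\n")
    parts.append(f"Review: {query}\nSentiment:")
    return "\n".join(parts)

if __name__ == "__main__":
    print(build_few_shot_prompt(demonstrations, "A tedious, overlong mess."))
```

Which demonstrations are chosen and how they are formatted can substantially affect accuracy, which is one motivation for the calibration topic listed above.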
Reference Paper: On the Opportunities and Risks of Foundation Models
- Authors: Percy Liang, Fei-Fei Li, et al.
- Paper Link: https://arxiv.org/pdf/2108.07258.pdf
Abstract: Recently, more than 100 researchers, including Stanford's Percy Liang, Rishi Bommasani (a student of Percy Liang), and Fei-Fei Li, jointly published this paper. In it, they name large pre-trained models "foundation models" and systematically explore their opportunities and risks. "Foundation" signifies that these models are essential yet incomplete.
The body of the paper is divided into four parts, detailing the capabilities, applications, related technologies, and social impacts of foundation models, as follows:
- Capabilities: language, vision, robotics, reasoning, interaction, understanding, etc.;
- Applications: healthcare, law, education, etc.;
- Technologies: modeling, training, adaptation, evaluation, systems, data, safety and privacy, robustness, theory, interpretability, etc.;
- Social impacts: inequality, misuse, environment, regulation, economics, ethics, etc.
The paper provides a useful reference for the responsible development and deployment of foundation models.