Unit 3: Pretraining
In this unit, you will get an overview of the issues involved in developing large language models, with a focus on the pretraining stage. In particular, the unit covers pretraining data, scaling laws, the systems perspective on training, and the environmental impact of LLMs.
Lectures
The lectures begin by introducing the key stages in LLM development. Next, you will learn how LLMs are pretrained, how large-scale datasets and scaling laws shape their performance, and which abilities emerge at scale. Finally, the lectures take a systems perspective on training and examine the environmental cost of chatbot technology.
| Section | Title | Video | Slides | Quiz |
|---|---|---|---|---|
| 3.1 | Introduction to LLM development | video | slides | quiz |
| 3.2 | Training LLMs | video | slides | quiz |
| 3.3 | Data for LLM pretraining | video | slides | quiz |
| 3.4 | Scaling laws | video | slides | quiz |
| 3.5 | Emergent abilities of LLMs | video | slides | quiz |
| 3.6 | Environmental cost of chatbot technology | video | slides | quiz |
To earn a wildcard for this unit, you must complete the quizzes before the teaching session on Unit 3.
Additional materials
Lab
Lab 3 is about pretraining large language models. You will work through the full pretraining process for a GPT model, explore different settings, and implement optimisations that make training more efficient. You will also reflect on the impact of data curation on the quality of the pretrained model.
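At the core of the pretraining process you will work through in the lab is the next-token-prediction objective: the model is trained to minimise the cross-entropy of the next token given the preceding context. As a minimal sketch of that objective (not the lab code itself), the example below fits a toy bigram count model to a tiny corpus and computes its average next-token cross-entropy; the corpus and all names are hypothetical, and a real GPT replaces the count table with a neural network trained on vastly more data.

```python
import math
from collections import Counter, defaultdict

# Toy corpus standing in for pretraining data (hypothetical example);
# real pretraining uses corpora with trillions of tokens.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Maximum-likelihood bigram model: count how often each token follows another.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_prob(prev, nxt):
    """Probability of `nxt` given `prev` under the bigram count model."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# Average next-token cross-entropy over the corpus: the same quantity
# (up to the choice of model) that LLM pretraining minimises.
pairs = list(zip(corpus, corpus[1:]))
loss = -sum(math.log(next_token_prob(p, n)) for p, n in pairs) / len(pairs)
print(f"average next-token cross-entropy: {loss:.3f} nats")
```

Lowering this loss, by better data, bigger models, or more compute, is exactly what the scaling-law and data-curation discussions in this unit are about.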