Unit 3: Developing LLMs

Published

October 20, 2025

This unit gives an overview of key issues in the development of large language models. In particular, it covers training strategies, pretraining data, emergent abilities of LLMs, and LLM alignment.

Lectures

The lectures begin by introducing the key stages in LLM development. Next, you will learn how LLMs are trained and how large-scale datasets and scaling laws shape their performance. Finally, the lectures explore the emergent abilities of these models and the techniques used to align them with human goals and values.

Section | Title | Video | Slides | Quiz
3.1 | Introduction to LLM development | video | slides | quiz
3.2 | Training LLMs | video | slides | quiz
3.3 | Data for LLM pretraining | video | slides | quiz
3.4 | Scaling laws | video | slides | quiz
3.5 | Emergent abilities of LLMs | video | slides | quiz
3.6 | LLM alignment | video | slides | quiz
Important: Quiz deadline

To earn a wildcard for this unit, you must complete the quizzes no later than 2025-11-03.

Online meeting

During the online meeting, we will examine the ethical and environmental implications of LLMs. In particular, we will discuss the working conditions of the human annotators who create alignment data, especially in countries of the Global South, as well as the environmental costs of training and deploying these models.

Tip: Meeting details

The meeting will take place on 2025-11-04 from 18:00 to 20:00. A Zoom link will be sent out via the course mailing list.

Warning

Note that the date of the online meeting was changed at short notice!

Additional materials

Lab

Lab 3 is about pretraining large language models. You will work through the full pretraining process for a GPT model, explore different settings, and implement optimisations that make training more efficient. You will also reflect on the impact of data curation on the quality of the pretrained model.

View the lab on GitLab

Important: Review deadline

If you want a written review of this lab, you must submit the lab (via Lisam) no later than 2025-12-19.