Week 3

newsletter
Newsletter for the week 2026-02-02/2026-02-06
Author

Marco Kuhlmann

Published

February 6, 2026

Dear all,

We have reached the end of Week 3, and I hope the course is going well for you. Here is my weekly newsletter!

As always, please do not hesitate to ask if anything is unclear. You can contact me via email or book an appointment.

Best, Marco

This week: LLM architecture

In this unit, you explored the Transformer, which underpins today’s large language models. You also learned about the two main types of language models built on this architecture: decoder-based models (such as GPT) and encoder-based models (such as BERT). The lab guided you through a from-scratch implementation of the GPT architecture.
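One way to see the difference between the two model types is through the attention mask. What follows is a minimal sketch in plain Python (the function names are my own, not from the lab code): a decoder such as GPT uses a causal mask so that each position can only attend to earlier positions, while an encoder such as BERT lets every position attend to every other.

```python
import math

def causal_mask(n):
    """Decoder-style (GPT) mask: position i may attend only to
    positions j <= i, so the model cannot peek at future tokens.
    Masked-out scores get -inf, which softmax turns into weight 0."""
    return [[0.0 if j <= i else -math.inf for j in range(n)]
            for i in range(n)]

def bidirectional_mask(n):
    """Encoder-style (BERT) mask: every position may attend to
    every other position, so nothing is masked out."""
    return [[0.0] * n for _ in range(n)]
```

The mask is simply added to the attention scores before the softmax, which is why the masked-out entries are minus infinity rather than zero.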

Blog post about attention

In Monday’s lecture, I walked you through a detailed example of how to compute attention. I also discussed the general characterisation of attention in terms of queries, keys, and values, and how this characterisation is related to Python’s dictionary data structure. If you would like to revisit this at your own pace, I summarised everything in a blog post:

Blog post on attention
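To make the dictionary analogy from the lecture concrete, here is a small sketch in plain Python (my own illustration, not code from the lecture or the blog post). A regular dictionary lookup `d[query]` returns the one value whose key matches the query exactly; attention instead returns a weighted mix of all values, weighted by how similar each key is to the query.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Soft dictionary lookup: compute a similarity score between the
    query and each key, turn the scores into weights with softmax, and
    return the weighted average of the value vectors."""
    d = len(query)
    # Scaled dot-product similarity, as in the Transformer.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Mix the value vectors according to the attention weights.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]
```

If the query matches one key much better than the others, the output is close to that key's value, which is exactly the hard dictionary lookup as a limiting case.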

Seminar materials available

Yesterday (Tuesday), we held the third of four project-related seminars. In that seminar, we reviewed the instructions and grading criteria for the post-project paper, along with example reports. If you were unable to attend the seminar, the slides of my presentation and the example reports are available through the course website:

Project page

Survey: First two weeks

Thank you to everyone who has already completed our survey about the first two weeks of the course. I plan to present the results in Monday’s teaching session, and the form will remain open until then:

Survey: First two weeks

General review of the lab portfolio

If you need help with the labs or want feedback on your work, you can always ask your tutor during a lab session or contact them via email. Additionally, on two occasions during the course, you can ask for a general review of your portfolio. The first of these opportunities is now, after Unit 2, so feel free to make use of it.

What is a “general review” of the portfolio, you ask? During the general review, your tutor checks whether your portfolio, in its current form, meets the grading criteria for the lab assignments; meeting these criteria is required to take the oral exam. In particular, your tutor will check whether you have completed all tasks.

You can also use the general review to ask for feedback on specific tasks that you have not been able to get feedback on in one of the lab sessions. For this, please clearly highlight in your portfolio (a) which tasks you want feedback on and (b) what feedback you are looking for – for example, “Is our solution correct?” or “Is the presentation clear enough?”.

Next week: Pretraining

Unit 3 will provide an overview of key issues in the development of large language models, with a focus on the pretraining stage. In particular, the unit covers data, scaling laws, the systems perspective, and the impact of LLMs on the environment. In the lab, you will work through the full pretraining process for your GPT model, explore different settings, and implement optimisations that make training more efficient.