Assignment 3: Fine-tuning language models

Published

April 27, 2026

In this assignment, you will perform supervised fine-tuning (SFT) of a small open LLM on an instruction tuning dataset. You will convert this dataset into instruction-response pairs, fine-tune a causal language model using LoRA (Low-Rank Adaptation), and evaluate the resulting models.
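To make the first step concrete, here is a minimal sketch of how one dataset record could be rendered as a single instruction-response training string. The field names (`instruction`, `response`) and the prompt template are illustrative assumptions, not a format prescribed by the assignment; adapt them to whatever your dataset and model expect.

```python
# Hypothetical sketch: turning one dataset record into a training string.
# The template below is one common convention, not the required format.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"


def format_example(record: dict) -> str:
    """Render one dataset record as an instruction-response training string."""
    return PROMPT_TEMPLATE.format(
        instruction=record["instruction"].strip(),
        response=record["response"].strip(),
    )


example = {"instruction": "Translate 'hello' to French.", "response": "Bonjour."}
print(format_example(example))
```

The resulting strings can then be tokenized and fed to the model as ordinary causal-LM training data.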

Pedagogical purposes of this assignment

  • You will learn what instruction tuning is and how the model handles instruction-response data during training.
  • You will apply LoRA for parameter-efficient tuning of causal LMs.
  • You will gain additional practical experience with the HuggingFace libraries, which provide useful utilities for preprocessing and training.
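Since the assignment centers on LoRA, it may help to see the core idea in isolation before using a library. The sketch below illustrates the low-rank update in plain NumPy; the dimensions are chosen for illustration, and this is a toy demonstration of the math, not the PEFT library's implementation.

```python
import numpy as np

# Toy illustration of the LoRA update (not a library API). Instead of
# updating a full weight matrix W (d_out x d_in), LoRA trains two small
# matrices B (d_out x r) and A (r x d_in) with rank r much smaller than
# the layer dimensions; the effective weight is W + (alpha / r) * B @ A.

d_out, d_in, r, alpha = 768, 768, 8, 16  # illustrative dimensions

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(r, d_in))      # trainable, random initialization
B = np.zeros((d_out, r))            # trainable, zero initialization

# Because B starts at zero, the adapter contributes nothing initially,
# so the adapted layer behaves exactly like the pretrained one.
W_adapted = W + (alpha / r) * (B @ A)

# Only A and B are trained: far fewer parameters than the full matrix.
n_lora = r * (d_in + d_out)  # 12,288 trainable parameters
n_full = d_out * d_in        # 589,824 parameters in the full matrix
print(f"LoRA trains {n_lora} parameters instead of {n_full}")
```

In practice you would not implement this by hand: the HuggingFace `peft` library wraps a pretrained model with such adapters (via `LoraConfig` and `get_peft_model`), which is presumably the route the notebook skeleton takes.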

Requirements

Submission of this assignment for feedback is optional. If you want feedback, please submit your solution in Canvas.

Submission deadline: May 28.

You can submit a link to a Colab notebook, a link to a GitHub repository, or alternatively a set of Python files or notebooks containing your solution to the programming tasks described below. In addition, include a document indicating which of the assignment tasks you would like to receive feedback on.

This is purely a programming assignment: you do not have to write a technical report or explain the details of your solution. At the end of the course, there will be a separate individual oral exam where you will discuss a subset of the assignment tasks.

Practical note

Unlike the two previous assignments, the skeleton for this assignment has been prepared as a Colab notebook. If you prefer to work in another environment, you are free to copy the code into Python files (or a modified notebook for your environment).