Assignment 3: Fine-tuning language models
In this assignment, you will perform supervised fine-tuning (SFT) of a small open LLM on an instruction-tuning dataset. You will convert this dataset into instruction-response pairs, fine-tune a causal language model using LoRA (Low-Rank Adaptation), and evaluate the resulting models.
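As a starting point, the conversion step can be sketched as a small formatting function. This is a minimal sketch only: the field names (`instruction`, `response`) and the prompt template are assumptions, and you should adapt them to whatever dataset the assignment actually uses.

```python
# Sketch: turning raw dataset examples into instruction-response training
# strings. The field names and the template below are illustrative
# assumptions, not the required format for the assignment.

PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

def format_example(example: dict) -> str:
    """Render one raw example as a single training string."""
    return PROMPT_TEMPLATE.format(
        instruction=example["instruction"].strip(),
        response=example["response"].strip(),
    )

raw = {"instruction": "Translate 'hello' to French.", "response": "bonjour"}
print(format_example(raw))
```

A function like this can be mapped over every example in the dataset before tokenization; many chat models instead ship their own prompt template, in which case the tokenizer's built-in template should be preferred over a hand-written one.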
Pedagogical purposes of this assignment
- You will learn more about what instruction tuning is and how instruction-response pairs are presented to the model during training.
- You will apply LoRA for parameter-efficient tuning of causal LMs.
- You will gain additional practical experience working with HuggingFace libraries, which provide useful utilities for preprocessing and training.
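Why LoRA is parameter-efficient can be seen from a quick back-of-the-envelope calculation: for a weight matrix of shape (d_out, d_in), LoRA freezes the original weights and learns only two low-rank factors B (d_out × r) and A (r × d_in), whose product is added to the frozen matrix. The dimensions below are illustrative, not taken from any specific model.

```python
# Sketch: counting trainable parameters for one weight matrix under
# full fine-tuning vs. LoRA. Dimensions are illustrative assumptions.

d_out, d_in = 4096, 4096  # shape of one attention projection, for example
r = 8                     # LoRA rank (a typical small value)

full_params = d_out * d_in          # updated by full fine-tuning
lora_params = d_out * r + r * d_in  # updated by LoRA (factors B and A)

print(full_params)                # → 16777216
print(lora_params)                # → 65536
print(lora_params / full_params)  # → 0.00390625 (~0.4% of the parameters)
```

In practice you would not implement this by hand; the `peft` library's `LoraConfig` and `get_peft_model` apply these low-rank adapters to the chosen modules of a HuggingFace model for you.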
Requirements
Submission of this assignment for feedback is optional. If you want feedback, please submit your solution in Canvas.
Submission deadline: May 28.
You can submit a link to a Colab notebook, a link to a GitHub repository, or alternatively a set of Python files or notebooks containing your solution to the programming tasks described below. In addition, include a document indicating which of the assignment tasks you would like to receive feedback on.
This is a pure programming assignment, and you do not have to write a technical report or explain the details of your solution: at the end of the course, there will be a separate individual oral exam where you will discuss a subset of the assignment tasks.
Practical note
Unlike the two previous assignments, the skeleton for this assignment has been prepared as a Colab notebook. If you prefer to work in another environment, you are free to copy the code into Python files (or a modified notebook for your environment).
Link to the assignment notebook
Please make a copy of the template notebook and solve the tasks described in it.