Language modelling

Published: September 4, 2024

Language modelling is about predicting which word comes next in a sequence of words – a seemingly simple task that nevertheless serves as a cornerstone for generating and understanding human language with computers. In this unit, you will learn about two types of language models: \(n\)-gram models and neural models, focusing on models based on recurrent neural networks.
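To make this concrete: a language model assigns a probability to a whole sequence \(w_1, \dots, w_N\) by factoring it with the chain rule, and an \(n\)-gram model approximates each factor by conditioning on only the \(n-1\) preceding words:

\[
P(w_1, \dots, w_N) = \prod_{t=1}^{N} P(w_t \mid w_1, \dots, w_{t-1}),
\qquad
P(w_t \mid w_1, \dots, w_{t-1}) \approx P(w_t \mid w_{t-n+1}, \dots, w_{t-1})
\]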

Lectures

This unit begins with an overview of language modelling. It highlights the historical significance of \(n\)-gram models in NLP, which laid the foundation for the transition to neural language models. We continue with an exploration of pre-Transformer neural architectures for language modelling, specifically focusing on recurrent neural networks (RNNs) and the pivotal Long Short-Term Memory (LSTM) architecture.
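As a taste of the count-based approach covered in section 1.2, here is a minimal sketch of a bigram model with add-one (Laplace) smoothing. It is an illustration only, not the course's reference implementation:

```python
from collections import Counter

def train_bigram(tokens):
    """Collect unigram and bigram counts from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, prev, word):
    """Estimate P(word | prev) with add-one smoothing over the vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

tokens = "the cat sat on the mat".split()
unigrams, bigrams = train_bigram(tokens)
print(bigram_prob(unigrams, bigrams, "the", "cat"))  # (1+1)/(2+5) ≈ 0.286
```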

Each section below comes with a video, slides, and a quiz:

1.1 Introduction to language modelling
1.2 N-gram language models
1.3 Neural language models
1.4 Recurrent neural networks (RNNs)
1.5 The LSTM architecture
1.6 RNN language models
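Sections 1.4–1.6 build up to recurrent language models, which replace the fixed \(n\)-gram context with a hidden state that is updated at every time step. In the standard formulation (shown here for orientation only), a simple RNN language model computes

\[
h_t = \tanh(W_h h_{t-1} + W_x x_t + b),
\qquad
P(w_{t+1} \mid w_1, \dots, w_t) = \operatorname{softmax}(W_o h_t + c)
\]

where \(x_t\) is the embedding of word \(w_t\).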

Lab

In the lab for this unit, you will implement and train two neural language models presented in the lectures: the fixed-window model and the recurrent neural network model. You will evaluate these models by computing their perplexity on a standard benchmark for language modelling: the WikiText-2 dataset.
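Perplexity is the exponential of the average negative log-probability that the model assigns to the held-out tokens. The following sketch shows the computation, assuming you can obtain per-token log-probabilities from a trained model; the interface is made up for illustration and is not the lab's actual API:

```python
import math

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    log_probs[t] is assumed to hold log P(w_t | w_1, ..., w_{t-1})
    as assigned by a trained model (a hypothetical interface).
    """
    avg_nll = -sum(log_probs) / len(log_probs)  # average negative log-likelihood
    return math.exp(avg_nll)                    # perplexity = exp(cross-entropy)

# A model that assigns probability 1/4 to every token has perplexity 4:
print(perplexity([math.log(0.25)] * 100))  # 4.0
```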

Link to the lab (course repo)