Language modelling

Published: January 15, 2024

Neural language models

Level: Basic (22 points)

In this lab, you will implement and train two neural language models presented in the video lectures: the fixed-window model and the recurrent neural network model. You will evaluate these models by computing their perplexity on a standard benchmark for language modelling: the WikiText-2 dataset.
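As a rough illustration of the evaluation measure, perplexity is the exponential of the average negative log-likelihood that a model assigns to the tokens of a text. The sketch below assumes the per-token probabilities are already available as a plain list; the function name `perplexity` and the toy input are illustrative assumptions and are not taken from the lab code.

```python
import math


def perplexity(token_probs):
    """Perplexity of a text, given the probability the model assigned to each token.

    `token_probs` is a hypothetical input format: one probability in (0, 1]
    per token of the evaluation text.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural logarithm).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Perplexity is the exponential of that average.
    return math.exp(avg_nll)


# A model that spreads its probability uniformly over 4 choices at every
# step has a perplexity of about 4.
print(perplexity([0.25] * 8))
```

Lower perplexity means the model assigns higher probability to the evaluation text. A neural model would typically obtain the same quantity by exponentiating its mean cross-entropy loss over the test tokens.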

Link to the basic lab

Interpolated n-gram model

Level: Advanced (33 points)

Neural language models require substantial computational resources. Where these are not available, the older generation of probabilistic language models can serve as a strong baseline. Your task in this lab is to evaluate one of these models on the WikiText-2 dataset. Specifically, you will implement an interpolated \(n\)-gram model.
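To make the idea of interpolation concrete, here is a minimal sketch for the bigram case (\(n = 2\)). It mixes the bigram, unigram, and uniform estimates with fixed weights. The class name `InterpolatedBigramModel`, the weights `(0.6, 0.3, 0.1)`, and the toy corpus are assumptions made for illustration; the lab may use a different order, different weights, or a different way of setting them.

```python
from collections import Counter


class InterpolatedBigramModel:
    """Linear interpolation of bigram, unigram, and uniform estimates (a sketch)."""

    def __init__(self, tokens, lambdas=(0.6, 0.3, 0.1)):
        # Hypothetical fixed weights; they must sum to 1 for a valid distribution.
        assert abs(sum(lambdas) - 1.0) < 1e-9
        self.l2, self.l1, self.l0 = lambdas
        self.unigrams = Counter(tokens)
        self.bigrams = Counter(zip(tokens, tokens[1:]))
        # How often each word occurs as the left-hand context of a bigram.
        self.contexts = Counter(tokens[:-1])
        self.total = len(tokens)
        self.vocab_size = len(self.unigrams)

    def prob(self, prev, word):
        """P(word | prev), assuming `word` is in the training vocabulary."""
        p_uniform = 1.0 / self.vocab_size
        p_unigram = self.unigrams[word] / self.total
        context_count = self.contexts[prev]
        if context_count:
            p_bigram = self.bigrams[(prev, word)] / context_count
        else:
            # Unseen context: fall back to the unigram estimate so that
            # the probabilities still sum to 1 over the vocabulary.
            p_bigram = p_unigram
        return self.l2 * p_bigram + self.l1 * p_unigram + self.l0 * p_uniform


corpus = "the cat sat on the mat the cat ran".split()
model = InterpolatedBigramModel(corpus)
print(model.prob("the", "cat"))  # seen bigram: relatively high probability
print(model.prob("cat", "mat"))  # unseen bigram: small but non-zero
```

Because the lower-order estimates always contribute, an unseen bigram still receives non-zero probability, which keeps perplexity finite on held-out text. The sketch assumes every evaluated word appears in the training vocabulary, for example because rare words have been mapped to an unknown-word token.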

Link to the advanced lab