Information Extraction

Published

November 13, 2024

Information extraction (IE) is the task of extracting structured data from text, for example named entities and the relations between them. We also look at sequence labelling as a tool for IE applications.
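To make the link between sequence labelling and IE concrete, here is a minimal sketch that turns per-token BIO tags into entity spans. The function name `bio_to_spans`, the tag set, and the example sentence are assumptions for illustration and are not taken from the lecture or the lab.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_text, label) pairs from BIO-tagged tokens."""
    spans = []
    current_tokens, current_label = [], None

    def close_span():
        # Store the span that is currently open, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            close_span()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with a matching label continues the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span
            # (this sketch simply drops such stray I- tokens).
            close_span()
            current_tokens, current_label = [], None

    close_span()
    return spans


# Hypothetical example sentence with hand-assigned tags.
tokens = ["Angela", "Merkel", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

In a real system the tags would come from a trained sequence labeller; the decoding step that recovers the structured entities stays the same.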

Lecture Slides

Lab

In this lab, you will implement a pipeline for linking named entities in news articles to the titles of their respective Wikipedia pages.
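As a rough illustration of what the linking step can look like, the sketch below picks the most frequent Wikipedia title for a mention string. This is a common baseline, not necessarily the approach the lab asks for; the table `MENTION_COUNTS`, its counts, and the function `link_mention` are all hypothetical.

```python
# Hypothetical toy table: mention string -> candidate Wikipedia titles,
# each with a made-up count of how often the mention links to that title.
MENTION_COUNTS = {
    "Paris": {"Paris": 90, "Paris Hilton": 10},
    "Merkel": {"Angela Merkel": 100},
}


def link_mention(mention, mention_counts=MENTION_COUNTS):
    """Return the most frequent title for a mention, or None if unknown."""
    candidates = mention_counts.get(mention)
    if not candidates:
        return None
    # Choose the candidate title with the highest count.
    return max(candidates, key=candidates.get)


print(link_mention("Paris"))
print(link_mention("Atlantis"))
```

A baseline like this ignores the context in which the mention occurs, so it always resolves an ambiguous mention to the same page.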

Download Lab 3

Reading Material

Chapter 8 of Eisenstein (2018) on applications of sequence labelling
Chapter 17 of Eisenstein (2018) on information extraction

Both topics are also covered in the book by Jurafsky & Martin (2024), in Chapters 17 and 20, respectively.