Information Extraction
Information extraction (IE) is the task of extracting structured data from text, such as entities and the relations between them. We also look at sequence labelling as a tool for IE applications.
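To make the connection between sequence labelling and IE concrete, here is a minimal sketch of how per-token labels can be turned into entity spans. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity); the function name `bio_to_spans` and the example sentence are illustrative assumptions, not part of the course material.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) pairs from a BIO-tagged token sequence."""
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag continues the current entity of the same type.
            current_tokens.append(token)
        else:
            # O tags, and I- tags without a matching open entity,
            # are treated as outside any entity in this sketch.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


tokens = ["Ada", "Lovelace", "was", "born", "in", "London", "."]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

In a real system the tags would come from a trained sequence labeller rather than being written by hand; the decoding step stays the same.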
Lecture Slides
to be added
Lab
In this lab, you will implement a pipeline for linking named entities in news articles to the titles of their respective Wikipedia pages.
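As a rough illustration of what one linking step might look like, the sketch below picks, for each mention, the page title it most often points to. This is a simple most-frequent-candidate baseline and not necessarily the method used in the lab. The table `ANCHOR_COUNTS`, the function `link_mention`, and all counts are made-up toy values for illustration only.

```python
from collections import Counter

# Hypothetical toy data: how often a mention string links to each page title.
# The numbers are invented for this example.
ANCHOR_COUNTS = {
    "paris": Counter({"Paris": 950, "Paris, Texas": 30, "Paris Hilton": 20}),
    "washington": Counter({
        "Washington, D.C.": 500,
        "George Washington": 300,
        "Washington (state)": 200,
    }),
}


def link_mention(mention, anchor_counts=ANCHOR_COUNTS):
    """Return the most frequent page title for a mention, or None if unknown."""
    candidates = anchor_counts.get(mention.lower())
    if not candidates:
        return None
    title, _count = candidates.most_common(1)[0]
    return title


for mention in ["Paris", "Washington", "Atlantis"]:
    print(mention, "->", link_mention(mention))
```

A baseline like this ignores the context of the mention, so it always links "Washington" to the same page; using the surrounding text to choose among candidates is the natural next step.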
Reading Material
- Chapter 8 of Eisenstein (2019) on applications of sequence labelling
- Chapter 17 of Eisenstein (2019) on information extraction
Both topics are also covered in the book by Jurafsky & Martin (2024), in Chapters 17 and 20.