Information Retrieval
Information retrieval, often abbreviated IR, is the task of finding (or retrieving) text documents that contain some desired information (e.g., the answer to a user’s search query) in a large collection of documents.
Lecture
Demo Notebook
Lab
In this lab, you will apply basic techniques from information retrieval to implement the core of a minimalistic search engine.
Reading Material
We recommend the following book, which has a free online edition:
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
The material in this lecture draws on the following parts of the book:
- Sections 2.1–2.2 (tokenization and preprocessing steps)
- Chapter 6 (tf–idf and the vector space model)
- Sections 8.1–8.4 (evaluation metrics)
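To make the three topics above concrete, here is a minimal sketch of how they could fit together in a toy search engine: tokenization, tf–idf vectors ranked by cosine similarity, and precision/recall for evaluation. It is an illustration under stated assumptions, not the lab's reference solution. The function names (`tokenize`, `build_index`, `search`, `precision_recall`) and the example documents are hypothetical, the tokenizer is a simple regular expression without stemming or stop-word removal, and the idf uses the natural logarithm of N/df.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters and digits (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return one sparse tf-idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Assumption: idf = ln(N / df); other variants (base 10, smoothing) exist.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, vectors, idf):
    """Return indices of documents with a positive score, best match first."""
    counts = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    q = {term: tf * idf[term] for term, tf in counts.items() if term in idf}
    scored = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0.0]


def precision_recall(retrieved, relevant):
    """Precision and recall of a list of retrieved ids against a set of relevant ids."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical three-document collection, used only for illustration.
docs = [
    "The cat sat on the mat.",
    "Dogs chase cats in the park.",
    "Search engines rank documents by relevance.",
]
vectors, idf = build_index(docs)
results = search("cat mat", vectors, idf)
print(results)                            # [0]
print(precision_recall(results, {0, 1}))  # (1.0, 0.5)
```

Without stemming, the query term "cat" does not match "cats" in the second document, so only the first document is returned. This is the kind of effect the preprocessing steps in Sections 2.1–2.2 are meant to address.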