Project ideas
The main purpose of the project is to give you the opportunity to identify, assess, and use NLP research literature (learning outcome 4). You will also deepen the knowledge you have acquired in the other parts of the course.
Here is a list of project ideas. For each idea, I also identify some of the challenges I would expect for the project in question. You can modify any project idea to your liking or propose an entirely new project.
Fine-tuning a pretrained language model
- Goal and Ideas
- Fine-tune an open-source LLM like BERT or GPT (e.g., from Hugging Face) for a specific downstream task, such as sentiment analysis on a specialised domain, text summarisation for a niche dataset (e.g., scientific papers), or named entity recognition for a custom dataset.
- Challenges
- Finding, curating and cleaning the dataset. Handling class imbalance. Avoiding overfitting during fine-tuning.
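One of the challenges above, class imbalance, is often handled by weighting the loss by inverse class frequency. A minimal sketch of computing such weights (the labels here are made up for illustration); the resulting weights could be passed, for instance, to the `weight` argument of PyTorch's `CrossEntropyLoss`:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights to counteract class imbalance.

    Weight for class c = total / (num_classes * count_c), so rare
    classes get weights > 1 and frequent classes get weights < 1.
    """
    counts = Counter(labels)
    total = len(labels)
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}

# Toy example: 8 positive vs. 2 negative training examples.
weights = class_weights(["pos"] * 8 + ["neg"] * 2)
```

Here the minority class ends up with weight 2.5 and the majority class with 0.625, so errors on rare examples count four times as much.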
Prompt engineering and few-shot learning
- Goal and Ideas
- Investigate the performance of prompt-based learning techniques with a few-shot setting. For example, you could compare prompt-based approaches to fine-tuned models on the same task.
- Challenges
- Finding a good task and dataset. Crafting effective prompts. Analysing errors. Ensuring reproducibility.
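The core of a few-shot setup is prompt assembly: an instruction, a handful of labelled demonstrations, and the unlabelled query. A minimal sketch (the sentiment task and example texts are hypothetical):

```python
def few_shot_prompt(examples, query, instruction):
    """Assemble a k-shot prompt: instruction, labelled demonstrations,
    then the query with its label left blank for the model to fill in."""
    lines = [instruction]
    for text, label in examples:
        lines.append(f"Text: {text}\nSentiment: {label}")
    lines.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [("Great plot and acting.", "positive"),
         ("A dull, lifeless film.", "negative")]
prompt = few_shot_prompt(demos, "I loved every minute.",
                         "Classify the sentiment of each text.")
```

Keeping prompt construction in a single deterministic function like this also helps with the reproducibility challenge: the exact prompt for every query can be logged and re-created.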
Embedding-based search and similarity
- Goal and Ideas
- Build a semantic search system using embeddings from a pretrained model. For example, you could use sentence embeddings (e.g., from SBERT) to implement a document retrieval system for a dataset like ArXiv papers.
- Challenges
- Optimise the embedding storage/retrieval pipeline (e.g., using FAISS or Annoy).
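Before optimising with FAISS or Annoy, it helps to have a brute-force baseline: cosine similarity between a query embedding and all document embeddings. A sketch with random vectors standing in for real sentence embeddings (in practice these would come from a model such as SBERT):

```python
import numpy as np

def top_k(query_vec, doc_matrix, k=3):
    """Indices of the k documents most similar to the query,
    by cosine similarity (L2-normalise, then dot product)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 8))                 # 100 fake document embeddings
query = docs[42] + 0.01 * rng.normal(size=8)     # near-duplicate of doc 42
hits = top_k(query, docs, k=3)
```

The brute-force version is O(n) per query; approximate indexes trade a little recall for much faster lookups, and the baseline lets you measure exactly how much recall you lose.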
Bias and fairness in NLP models
- Goal and Ideas
- Analyse biases in pretrained LLMs and suggest ways to mitigate them. For example, you could investigate gender or racial bias in text classification or generation tasks. There are established metrics (e.g., WEAT), and you could evaluate interventions like counterfactual data augmentation.
- Challenges
- Finding datasets to test biases and ensuring fair comparisons of mitigation strategies.
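The WEAT metric mentioned above reduces to a simple computation over word embeddings: the effect size is the difference in mean association between two target sets, normalised by the pooled standard deviation. A sketch with tiny hand-made 2-d vectors in place of real embeddings:

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size: difference of mean associations of the two
    target sets, divided by the pooled sample standard deviation."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy vectors: X aligns with attribute A, Y with attribute B.
X = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]
Y = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]
A = [np.array([1.0, 0.1])]
B = [np.array([0.1, 1.0])]
d = weat_effect_size(X, Y, A, B)
```

With real embeddings, X and Y might be sets of names and A and B sets of attribute words; a large positive d indicates that X is closer to A than Y is.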
Tweaking the GPT implementation
- Goal and Ideas
- Experiment with architectural tweaks to our implementation of GPT. For example, you could try different activation functions, alternative attention mechanisms, or different positional embeddings, and analyse their impact on accuracy and efficiency.
- Challenges
- Debugging the implementation and keeping experiments small enough to run on the available compute.
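As a concrete example of such a tweak, the feed-forward block of a transformer layer can be written with a pluggable activation, making it easy to compare GPT-2's GELU against alternatives like ReLU. A numpy sketch (the dimensions are arbitrary):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def gelu(x):
    """Tanh approximation of GELU, as used in GPT-2."""
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def ffn(x, W1, b1, W2, b2, act=gelu):
    """Transformer feed-forward block: expand, activate, project back."""
    return act(x @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))                      # 4 tokens, model dim 16
W1, b1 = rng.normal(size=(16, 64)), np.zeros(64)  # 4x expansion, as in GPT
W2, b2 = rng.normal(size=(64, 16)), np.zeros(16)
y_gelu = ffn(x, W1, b1, W2, b2, act=gelu)
y_relu = ffn(x, W1, b1, W2, b2, act=relu)
```

Isolating the tweak behind a single argument like `act` makes ablation runs easy to script and keeps the rest of the model identical across conditions.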
Data augmentation for low-resource NLP
- Goal and Ideas
- Investigate the effectiveness of data augmentation techniques in low-resource settings. For example, you could implement and evaluate the impact of back-translation, paraphrasing, or synonym replacement to augment training data for a classification task.
- Challenges
- Implementing augmentation pipelines and ensuring they produce meaningful improvements.
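Of the techniques above, synonym replacement is the simplest to sketch. The synonym table below is a hypothetical stand-in; in practice you might draw synonyms from WordNet (via NLTK). Seeding the random generator keeps augmented datasets reproducible:

```python
import random

# Hypothetical toy synonym table; in practice use a resource like WordNet.
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"], "bad": ["poor", "awful"]}

def synonym_replace(sentence, p=0.5, seed=0):
    """Replace each word that has synonyms with a random one, with probability p."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)

augmented = synonym_replace("a good movie with a bad ending", p=1.0)
```

A natural evaluation is to train the same classifier with and without augmented data and compare accuracy, which directly addresses the challenge of showing that the pipeline produces meaningful improvements.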
Explainability for LLMs
- Goal and Ideas
- Implement and evaluate explainability techniques for predictions from LLMs. For example, you could use attention visualisation or saliency maps to explain a model’s predictions for text classification. You could compare post-hoc explanation methods (e.g., LIME) with inherent interpretability methods (e.g., Transformer Lens).
- Challenges
- Making explanations clear and evaluating the quality of explanations quantitatively or qualitatively.
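For intuition, a saliency map can be computed analytically for a tiny model. The sketch below uses input-times-gradient saliency for a logistic bag-of-words classifier (weights and vocabulary are made up); for a real LLM the gradient would come from autograd rather than a closed form:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def saliency(x, w):
    """Input-times-gradient saliency for p = sigmoid(w . x).

    Since dp/dx_i = p * (1 - p) * w_i, each feature's score
    is |x_i * p * (1 - p) * w_i|.
    """
    p = sigmoid(w @ x)
    grad = p * (1 - p) * w
    return np.abs(x * grad)

# Toy vocabulary: ["terrible", "the", "brilliant"]
w = np.array([-2.0, 0.0, 3.0])   # hypothetical learned weights
x = np.array([1.0, 2.0, 1.0])    # word counts in a document
scores = saliency(x, w)
```

Here "brilliant" receives the highest score and the zero-weight word "the" receives none, which matches the intuition that saliency should highlight the words driving the prediction.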
Text-to-image or image-to-text tasks
- Goal and Ideas
- Combine NLP with computer vision for multi-modal tasks. For example, you could fine-tune CLIP for domain-specific image-text retrieval or investigate caption generation for domain-specific datasets (e.g., medical images).
- Challenges
- Handling multi-modal datasets and evaluating results effectively.
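Retrieval with a CLIP-style model boils down to a temperature-scaled cosine-similarity matrix between L2-normalised image and text embeddings. A sketch with synthetic embeddings standing in for real CLIP outputs (the temperature value is illustrative):

```python
import numpy as np

def retrieval_scores(img_emb, txt_emb, temperature=0.07):
    """CLIP-style scores: cosine-similarity matrix between L2-normalised
    image and text embeddings, scaled by a temperature."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return (img @ txt.T) / temperature

rng = np.random.default_rng(1)
txt = rng.normal(size=(5, 32))                # 5 fake caption embeddings
img = txt + 0.05 * rng.normal(size=(5, 32))   # matching image embeddings
scores = retrieval_scores(img, txt)
best = scores.argmax(axis=1)                  # best caption per image
```

Evaluating with recall@k on the score matrix (does the matching caption appear in an image's top k?) is a standard way to address the evaluation challenge above.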
Projects from previous years
For examples of topics that have been explored in the course, see the abstracts from previous years: