Student projects

This page is under construction.

If you are interested in doing a project or thesis with me, I recommend checking out my publication page to see whether there is a match between our research interests. I list a couple of potential project directions below. You are welcome to propose your own project.

Do I need NLP and ML experience? Yes. You should have experience with ML (and ideally also NLP) and a general interest in solving fundamental questions in Natural Language Processing research.

How do I contact you? Send me an email describing your background: study program, relevant courses you took, projects you liked, why you want to work on topic X, and your expected timeline.

Possible student project topics include:

  • Neural Low-resource PoS tagging: This project aims to build a part-of-speech tagger with minimal supervision.
  • Neural Named Entity Recognition on Heterogeneous Data: This project examines how to build a neural entity tagger that can process heterogeneous and noisy data (such as Twitter data: here).
  • Continual Learning for Visual Question Answering: This project aims to investigate continual learning for multimodal visual question answering, see the following paper for more details: pdf
  • Domain adaptation: Any NLP system struggles when the text it is tested on differs from the text it was trained on. A sentiment analysis system trained on book reviews will fail badly on electronics reviews. This project investigates neural approaches to domain adaptation, for example for sentiment classification, relation extraction, or identifying products in cybercrime marketplaces.
  • Opinion mining: The web is full of opinionated texts such as reviews. This project aims to extract opinions from large-scale web resources.
  • Error detection: As educational apps increase in popularity, vast amounts of student learning data become available, which can be used for personalized instruction. There are two projects here. Project A: The aim of this project is to develop a neural system that helps a language learner detect grammatical errors, and to improve it using, e.g., multi-task learning. See for example here. Project B: The goal of this project is to predict the future mistakes that learners of English, Spanish, and French will make, based on the history of mistakes they have made in the past. More info here.
  • Data Science for Science: This project is in meta-science: applying Data Science techniques to Science, such as using Natural Language Processing techniques to make sense of the large body of scientific literature. See for example my CiteTracked paper.
  • Fortuitous data: Often there are additional data sources out there that can be explored to build better NLP systems from weak or indirect supervision. See more details here.