Students who want to do an independent study or IR Lab with me can either i) propose their own topic, or ii) choose a topic from the list below. Students are typically expected to have completed Search Engines (11-442/11-642) before doing an IR independent study or lab.
Adaptive Filtering of Microblog Text: Create a system that performs adaptive filtering of a Twitter microblog stream. The system will incrementally learn filtering profiles ("queries", "classifiers") and dissemination thresholds. There are different ways to learn dissemination thresholds, but I am most interested in approaches based on score modeling and sampling. The system could be built on top of the Indri or Lucene open-source search engines, which requires learning the API of the search engine, but avoids implementing a document parser. Evaluation can be done with TREC 2011-2013 Microblog Track data.
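To make the threshold-learning idea concrete, here is a minimal sketch of one score-modeling approach. It is an illustration under stated assumptions, not a prescribed design: it assumes scores of relevant documents are roughly Gaussian and scores of non-relevant documents are roughly (shifted) exponential, and it picks the threshold that maximizes an expected linear utility. The names `fit_threshold` and `AdaptiveThreshold`, the gain and cost weights, and the grid search are all hypothetical choices made for this example.

```python
import math
from statistics import mean, pstdev


def normal_sf(x, mu, sigma):
    """P(S > x) for S ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def exponential_sf(x, origin, rate):
    """P(S > x) for S = origin + Exponential(rate)."""
    return 1.0 if x <= origin else math.exp(-rate * (x - origin))


def fit_threshold(rel, nonrel, gain=2.0, cost=1.0, steps=200):
    """Grid-search the threshold maximizing expected linear utility.

    Assumed model (not from the topic description): relevant scores are
    Gaussian, non-relevant scores are exponential above their minimum.
    """
    mu = mean(rel)
    sigma = max(pstdev(rel), 1e-6)            # guard against zero spread
    origin = min(nonrel)
    rate = 1.0 / max(mean(nonrel) - origin, 1e-6)

    lo = min(origin, mu - 3.0 * sigma)
    hi = max(max(nonrel), mu + 3.0 * sigma)
    best_theta, best_utility = lo, -math.inf
    for i in range(steps + 1):
        theta = lo + (hi - lo) * i / steps
        # Expected delivered relevant and non-relevant counts above theta.
        utility = (gain * len(rel) * normal_sf(theta, mu, sigma)
                   - cost * len(nonrel) * exponential_sf(theta, origin, rate))
        if utility > best_utility:
            best_theta, best_utility = theta, utility
    return best_theta


class AdaptiveThreshold:
    """Refits the dissemination threshold whenever feedback arrives."""

    def __init__(self, initial=0.0, min_feedback=2):
        self.threshold = initial
        self.min_feedback = min_feedback
        self.rel, self.nonrel = [], []

    def deliver(self, score):
        """Decide whether a document with this score is disseminated."""
        return score >= self.threshold

    def update(self, score, relevant):
        """Record one relevance judgment and refit once both classes are seen."""
        (self.rel if relevant else self.nonrel).append(score)
        if min(len(self.rel), len(self.nonrel)) >= self.min_feedback:
            self.threshold = fit_threshold(self.rel, self.nonrel)
```

Note that this sketch treats the judged scores as an unbiased sample. In a real filtering stream, feedback is usually available only for documents that were delivered, so the fitted distributions are skewed toward high scores; handling that bias is one place where sampling-based approaches come in.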