Research Interests: Information Retrieval, Natural Language Processing, Machine Learning
Current Research: As part of my PhD thesis research, I am investigating the problem of efficient and effective search of large-scale document collections.
Search engine indexes for large document collections are often divided into multiple disjoint partitions ('shards') that are distributed across multiple computers and searched in parallel to provide rapid interactive search.
Typically, all index shards are searched for each query (exhaustive search).
My research proposes an alternative, 'selective search', that partitions collections into topical shards and searches only a few relevant shards for each query.
As per the 'cluster hypothesis' ('similar documents tend to be relevant to the same request'), topical organization of the document collection concentrates the relevant documents for any given query into a few shards.
This organization enables selective search to ignore large portions of the collection without degrading search accuracy.
In summary, selective search is an efficient alternative to exhaustive search, the current de facto search paradigm.
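The contrast between exhaustive and selective search can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the thesis implementation: the topical shard assignments are supplied by hand rather than learned by clustering, the shard score (total query-term frequency in the shard) is a hypothetical stand-in for a real resource-selection algorithm, and all function names are made up for this example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this sketch)."""
    return text.lower().split()


def build_shards(docs, assignments):
    """Group documents into shards using a precomputed topical assignment.

    Returns {shard_name: {doc_id: Counter of term frequencies}}.
    """
    shards = {}
    for doc_id, text in docs.items():
        shards.setdefault(assignments[doc_id], {})[doc_id] = Counter(tokenize(text))
    return shards


def shard_score(shard, query_terms):
    """Hypothetical shard-ranking score: total query-term frequency in the shard."""
    return sum(tf[t] for tf in shard.values() for t in query_terms)


def search_shard(shard, query_terms):
    """Score every document in one shard; keep only documents that match."""
    hits = []
    for doc_id, tf in shard.items():
        score = sum(tf[t] for t in query_terms)
        if score > 0:
            hits.append((doc_id, score))
    return hits


def selective_search(shards, query, k=1):
    """Rank the shards for the query, search only the top k, merge the results."""
    terms = tokenize(query)
    ranked = sorted(shards, key=lambda name: (-shard_score(shards[name], terms), name))
    hits = []
    for name in ranked[:k]:
        hits.extend(search_shard(shards[name], terms))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


def exhaustive_search(shards, query):
    """Baseline: search every shard for every query."""
    return selective_search(shards, query, k=len(shards))


docs = {
    "d1": "python indexing tutorial for search engines",
    "d2": "search engine index shards and partitions",
    "d3": "chocolate cake recipe with butter",
    "d4": "bread recipe with butter and flour",
}
assignments = {"d1": "search", "d2": "search", "d3": "cooking", "d4": "cooking"}
shards = build_shards(docs, assignments)

print(selective_search(shards, "butter recipe", k=1))
print(exhaustive_search(shards, "butter recipe"))
```

In this toy collection both calls return the same ranking, because every matching document sits in the single 'cooking' shard; the selective run simply never touches the 'search' shard.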
The topical shard definitions for three datasets that I have used in my research are available here.
(Relatively) Recent Research Activities
* I successfully proposed my PhD thesis titled 'Efficient and Effective Large-scale Search using Selective Searching' in summer 2011.
* During the summer of 2010, I spent an exciting three months at MSR working with Krysta Svore, Jaime Teevan and Susan Dumais on problems related to the temporal dynamics of web search.
Here is a more comprehensive list of my publications.
Professional Activities
* On the planning and organizing committee of OurCS 2007
* On the program committee for NERSSEAL 2008
* On the posters committee for SIGIR 2011