Yubin Kim

Senior Data Scientist at UPMC Enterprises

UPMC Enterprises
6425 Penn Ave, Suite 200
Pittsburgh, PA 15206
kimy10 (at-domain) upmc dot edu

Selective Search

My dissertation topic was selective search. This project aims to reduce the computational burden on small research institutions and start-up companies that need to search web-scale indexes. By partitioning the collection into shards based on document similarity and performing judicious resource selection, the computational cost of search can be reduced greatly, because each query is run against only the top k shards rather than the entire collection.
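As a rough illustration of the idea, the following minimal Python sketch routes a query to the k most promising shards and searches only those. It is a toy under stated assumptions, not the system from the dissertation: the shard assignments are assumed to be precomputed (for example by clustering documents), and the summed term frequency used for both resource selection and document scoring is a simple stand-in for a real resource selection algorithm and retrieval model. All function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def build_shards(assignments, num_shards):
    """Group document ids into shards from precomputed topical assignments."""
    shards = [[] for _ in range(num_shards)]
    for doc_id, shard_id in sorted(assignments.items()):
        shards[shard_id].append(doc_id)
    return shards


def shard_statistics(shards, docs):
    """Summarize each shard by its term counts, used for resource selection."""
    return [
        Counter(term for doc_id in shard for term in tokenize(docs[doc_id]))
        for shard in shards
    ]


def select_shards(query, shard_stats, k):
    """Rank shards by summed query-term frequency and keep the top k."""
    terms = tokenize(query)
    scores = [sum(stats[t] for t in terms) for stats in shard_stats]
    ranked = sorted(range(len(shard_stats)), key=lambda s: (-scores[s], s))
    return ranked[:k]


def search(query, docs, shards, selected):
    """Score only the documents that live in the selected shards."""
    terms = tokenize(query)
    results = []
    for shard_id in selected:
        for doc_id in shards[shard_id]:
            tf = Counter(tokenize(docs[doc_id]))
            score = sum(tf[t] for t in terms)
            if score > 0:
                results.append((score, doc_id))
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in results]


if __name__ == "__main__":
    docs = {
        "d1": "python search engine index",
        "d2": "search index shards",
        "d3": "pasta recipe tomato",
        "d4": "tomato soup recipe",
    }
    # Hypothetical topical assignment: shard 0 is about search, shard 1 about food.
    assignments = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}
    shards = build_shards(assignments, num_shards=2)
    stats = shard_statistics(shards, docs)
    selected = select_shards("tomato recipe", stats, k=1)
    print(selected, search("tomato recipe", docs, shards, selected))
```

In this toy example the query touches only one of the two shards, so half of the collection is never scored. The savings in a real system depend on how well the shards align with topics and on how accurately the selection step ranks them.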

Related papers:

Twitter Search

Together with Reyyan Yeniterzi, I participated in the ad-hoc search task of the TREC 2012 Microblog Track. Our efforts focused on the vocabulary mismatch problem between queries and tweets. We proposed two solutions: query expansion through pseudo-relevance feedback and document expansion using the URLs present in tweets. The resulting system was competitive, and our best run placed in the top 10 of automatic runs.
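For readers unfamiliar with pseudo-relevance feedback, here is a minimal Python sketch of the general technique. It does not reproduce the TREC submission: the term-frequency scoring, the choice of raw counts for picking expansion terms, and every name and parameter (`expand_query`, `feedback_docs`, `expansion_terms`) are assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def score(query_terms, doc):
    """Toy relevance score: summed frequency of the query terms in the document."""
    tf = Counter(tokenize(doc))
    return sum(tf[t] for t in query_terms)


def expand_query(query, docs, feedback_docs=2, expansion_terms=2):
    """Expand a query with frequent terms from its top-ranked documents.

    The top-ranked documents are assumed to be relevant without any human
    judgment, which is what makes the feedback "pseudo".
    """
    query_terms = tokenize(query)
    ranked = sorted(
        range(len(docs)),
        key=lambda i: (-score(query_terms, docs[i]), i),
    )
    counts = Counter()
    for i in ranked[:feedback_docs]:
        counts.update(t for t in tokenize(docs[i]) if t not in query_terms)
    # Most frequent terms first; ties are broken alphabetically.
    new_terms = sorted(counts, key=lambda t: (-counts[t], t))[:expansion_terms]
    return query_terms + new_terms


if __name__ == "__main__":
    tweets = [
        "earthquake hits japan tsunami warning",
        "japan earthquake tsunami alert issued",
        "new pizza place downtown",
    ]
    print(expand_query("japan earthquake", tweets))
```

The expanded query can then match tweets that mention "tsunami" without using either of the original query words, which is how expansion helps with vocabulary mismatch in short texts.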

Related paper:

Slow Search

During my internship at Microsoft Research in the summer of 2013, I worked on slow search with Jaime Teevan and Kevyn Collins-Thompson. The project explored the benefits of taking more time with search retrieval. Specifically, we worked on integrating crowdsourcing into the search pipeline to improve search accuracy and to enable better search result summarization for entity queries.

Related papers:

Other

I have previously done research in event detection, databases, and search engine internals. For more information, please see my curriculum vitae.