Wednesday, April 5th, 2006 - 12:00, NSH 3002
Title: A Graphical Framework for Contextual Search and Name Disambiguation in Email
Speaker: Einat Minkov

Abstract:

Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are closely connected to other documents, as well as to other non-textual objects in structure-rich data. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. We evaluate the framework on two email-related problems: disambiguating names in email documents, and threading. We show that re-ranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.
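
As a rough illustration of the graph-walk idea in the abstract, the sketch below runs a generic lazy random walk over a tiny email-like graph and ranks person nodes by the probability mass that reaches them from a name term. It is a minimal sketch under stated assumptions, not the speaker's implementation: the node names, the toy edges, the uniform edge weights, the stay probability of 0.5, and the two-step walk length are all hypothetical choices made for the example.

```python
from collections import defaultdict

def lazy_walk(graph, start, steps, stay_prob=0.5):
    """Return {node: probability} after `steps` lazy-walk steps from `start`.

    At each step the walker stays put with probability `stay_prob`
    (an assumed value) and otherwise moves to a uniformly chosen neighbor.
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, p in dist.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                nxt[node] += p  # isolated node: all mass stays
                continue
            nxt[node] += stay_prob * p
            share = (1.0 - stay_prob) * p / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        dist = dict(nxt)
    return dist

# Hypothetical toy graph: messages linked to terms, people and a date.
edges = [
    ("msg1", "term:andrew"), ("msg1", "person:andrew_a"), ("msg1", "date:2006-04-05"),
    ("msg2", "term:andrew"), ("msg2", "person:andrew_b"),
    ("msg3", "term:andrew"), ("msg3", "person:andrew_a"),
]
graph = defaultdict(list)
for u, v in edges:  # treat every edge as undirected
    graph[u].append(v)
    graph[v].append(u)

# Walk from the ambiguous name term, then rank only the person nodes.
scores = lazy_walk(graph, "term:andrew", steps=2)
ranked = sorted(
    (n for n in scores if n.startswith("person:")),
    key=lambda n: scores[n],
    reverse=True,
)
print(ranked)
```

In this toy graph the candidate who appears in more messages containing the term receives more probability mass and so ranks first. A ranking of this kind could then serve as the input to a re-ranking step.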

This is joint work with William W. Cohen and Andrew Y. Ng.