Matthew W. Bilotti
Ph.D. Student
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213 USA
Office: 4533 Newell-Simon Hall
Telephone: +1 412 268 9515
Fax: +1 412 268 6298
Email: mbilotti ·at· cs · cmu · edu
About me: I am a fifth year Ph.D. student at the LTI, advised by Eric Nyberg.
My research interest centers on the application of Information Retrieval (IR) technologies as text search subcomponents embedded in larger, text-aware language technologies applications, such as Question Answering (QA) systems. Most IR systems are optimized to provide a quality ad hoc retrieval experience for a human user, but few directly support other applications as users. Simple text representations and queries (e.g., bag-of-words) are not sophisticated enough to retrieve text that matches the deeper representation used internally by the application. In practice, these applications derive a simple query from the more complex representation that they use internally, then exhaustively post-process the retrieved results, filtering out those that do not satisfy the constraints.
This is particularly true of QA systems, which often have a rich representation for describing what would constitute good answers to the questions they receive. Many QA systems that use bag-of-words IR are hampered by the poor quality of the retrieved results, which contain a large number of irrelevant documents that match the query but do not contain answers. The system must post-process all of these texts to obtain the deeper analyses necessary to determine which of them satisfy the linguistic and semantic constraints that indicate the presence of a good answer, and which do not.
My research goal is to bring QA and IR closer together by applying annotation-based retrieval techniques to the problem of retrieval for QA. Much of the linguistic and semantic information used internally by a QA system can be represented as annotations on text. Arbitrary annotations, including overlapping and hierarchical annotations, can be stored as fields in the index. At query time, constraints expressed in terms of these fields and the relationships among them can be used to retrieve text that not only matches certain key terms, but also satisfies higher-level constraints. This technique can markedly improve the quality of the results retrieved for the QA system.
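A toy sketch of the idea, under stated assumptions: the `Annotation` and `Sentence` classes, the labels `ARG0`, `PERSON` and `DATE`, and the two example sentences are all hypothetical and chosen only for illustration. They are not taken from any particular QA system or index implementation. The sketch contrasts plain keyword matching with a query that additionally requires one annotation to be nested inside another.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    label: str  # hypothetical label, e.g. "PERSON" or "ARG0"
    start: int  # token offset, inclusive
    end: int    # token offset, exclusive


@dataclass
class Sentence:
    tokens: list[str]
    annotations: list[Annotation] = field(default_factory=list)


def encloses(outer: Annotation, inner: Annotation) -> bool:
    """True if the inner span lies entirely within the outer span."""
    return outer.start <= inner.start and inner.end <= outer.end


def bag_of_words_match(sentence: Sentence, keywords: list[str]) -> bool:
    """Keyword-only matching: every keyword occurs somewhere in the sentence."""
    present = {t.lower() for t in sentence.tokens}
    return all(k.lower() in present for k in keywords)


def annotation_match(sentence: Sentence, keywords: list[str],
                     outer_label: str, inner_label: str) -> bool:
    """Keywords must match AND some inner_label span must be nested
    inside some outer_label span (a simple hierarchical constraint)."""
    if not bag_of_words_match(sentence, keywords):
        return False
    outers = [a for a in sentence.annotations if a.label == outer_label]
    inners = [a for a in sentence.annotations if a.label == inner_label]
    return any(encloses(o, i) for o in outers for i in inners)


# Hypothetical two-sentence corpus for a question like "Who invented the telephone?"
corpus = [
    Sentence(["Bell", "invented", "the", "telephone"],
             [Annotation("ARG0", 0, 1), Annotation("PERSON", 0, 1)]),
    Sentence(["The", "telephone", "was", "invented", "in", "1876"],
             [Annotation("DATE", 5, 6)]),
]
keywords = ["invented", "telephone"]

# Indices of sentences retrieved by each strategy.
bow_hits = [i for i, s in enumerate(corpus) if bag_of_words_match(s, keywords)]
annotated_hits = [i for i, s in enumerate(corpus)
                  if annotation_match(s, keywords, "ARG0", "PERSON")]
```

Both sentences contain the keywords, so keyword matching alone returns both. Only the first has a `PERSON` span inside an `ARG0` span, so the annotation-constrained query returns just that sentence, and the second never reaches post-processing. A real index would evaluate such constraints over stored fields rather than by scanning sentences in memory.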
I am formerly of MIT CSAIL, where I completed my undergraduate degree and my M.Eng., supervised by Dr. Boris Katz. My thesis, titled "Query Expansion Techniques for Question Answering", concerned query relaxation strategies designed to maximize recall of relevant documents in a pipelined QA system.
Click here for my publications.