Machine Learning Thesis Defense

Thesis Orals

Teaching Machines to Classify from Natural Language Interactions

Humans routinely learn new concepts through natural language communication, even in scenarios with few or no labeled examples. For example, a person can learn the concept of a phishing email from a natural language explanation such as “phishing emails often request your bank account number”. In contrast, purely inductive learning systems typically require a large collection of labeled data to learn such a concept. If we wish to make computer learning as efficient as human learning, we need to develop methods that can learn from natural language interactions. Our thesis is that advances in Natural Language Understanding, together with the growing ubiquity of computing devices, can enable people to teach computers classification tasks using natural language interactions.

Learning from language presents some key challenges. A preliminary challenge is learning to interpret language, which refers to an agent’s ability to map natural language explanations in pedagogical contexts to formal semantic representations that computers can process and reason over. A second challenge is learning from interpretations, which refers to the mechanisms through which interpretations of language statements can be used by computers to solve learning tasks in the environment. For learning from interpretations, we focus on concept learning (binary classification) tasks. We demonstrate that language can define rich and expressive features for learning tasks, and show that machine learning can benefit substantially from this ability. We also investigate the assimilation of linguistic cues in everyday language that implicitly constrain models for concept learning (e.g., “Most emails are not phishing emails”). In particular, we focus on quantifier expressions (such as “usually” and “never”) that reflect the generality of specific observations and can be incorporated into the training of classification models, alleviating the need for labeled data.

Apart from developing computational machinery that uses interpretations of language advice to guide concept learning, we develop complementary algorithms for learning to interpret language by incorporating different types of situational context, including conversational history and sensory observations. We show that environmental context can enrich models of semantic interpretation, not only by providing discriminative features but also by reducing the need for the expensive labeled data used to train them. Another valuable attribute of human language is that it is inherently conversational and interactive. We also briefly explore the possibility of agents that can learn to interact with a human teacher in a mixed-initiative setting and learn from a mix of both examples and explanations. Here the learner can also proactively engage the teacher by asking questions, rather than only passively listening to explanations. We show that learning to ask a teacher appropriate questions can reduce the sample complexity of concept learning.

Thesis Committee:
Tom Mitchell (Chair)
Taylor Berg-Kirkpatrick
William Cohen
Dan Roth (University of Pennsylvania)

Copy of Thesis Draft Document

For More Information, Please Contact: