Language Technologies Thesis Defense

  • Ph.D. Student
  • Language Technologies Institute
  • Carnegie Mellon University
Thesis Orals

Syntactic Inductive Biases for Natural Language Processing

With the rise in availability of data for language learning, the role of linguistic structure is under scrutiny. The underlying syntactic structure of language allows for composition of simple elements into more complex ones in innumerable ways; generalization to new examples hinges on this structure. We define a syntactic inductive bias as a signal that steers the learning algorithm towards a syntactically robust solution, over others. This thesis explores the need for incorporation of such biases into already powerful neural models of language.

We describe three general approaches for incorporating syntactic inductive biases into task-specific models, under different levels of supervision. The first method calls for joint learning of entire syntactic dependency trees with semantic dependency graphs through direct supervision, to facilitate better semantic dependency parsing. Second, we introduce the paradigm of scaffolded learning, which enables us to leverage inductive biases from syntactic sources to predict a related semantic structure, using only as much supervision as is necessary. The third approach yields general-purpose contextualized representations conditioned on large amounts of data along with their shallow syntactic structures, obtained automatically. The linguistic representations learned as a result of syntactic inductive biases are shown to be effective across a range of downstream tasks, but their usefulness is especially pronounced for semantic tasks.

Thesis Committee:
Noah A. Smith (Co-chair) (CMU/University of Washington)
Chris Dyer (Co-Chair)
Jaime Carbonell
Luke Zettlemoyer (University of Washington)

Copy of Thesis Document

For More Information, Please Contact: