Structured Prediction for Natural Language Processing


The slides are available, and a video of the tutorial is also available.



This tutorial will discuss the use of structured prediction methods from machine learning in natural language processing. The field of NLP has, in the past two decades, come to simultaneously rely on and challenge the field of machine learning. Statistical methods now dominate NLP, and have moved the field forward substantially, opening up new possibilities for the exploitation of data in developing NLP components and applications. However, formulations of NLP problems are often simplified for computational or practical convenience, at the expense of system performance. This tutorial aims to introduce several structured prediction problems from NLP, current solutions, and challenges that lie ahead. Applications in NLP are a mainstay at ICML conferences; many ML researchers view NLP as a primary or secondary application area of interest. This tutorial will help the broader ML community understand this important application area, how progress is measured, and the trade-offs that make it a challenge.


The tutorial will be broken into three parts. The outline below is ambitious; some topics may be referenced only in brief. We intend to give extensive references to important papers, so that participants can follow the leads that are most interesting.

Representations and data

We will discuss NLP tasks that can be seen as structured prediction problems. These include sequence segmentation and labeling, syntactic parsing, and translation discovery. We focus on the representation of these problems, with some discussion of the data that might be required for each.
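As one concrete illustration of representation, segmentation-and-labeling tasks such as named entity recognition are commonly encoded as per-token tagging with the BIO scheme. The function and the example names below are our own invention for illustration, not taken from the tutorial:

```python
def bio_encode(tokens, spans):
    """Encode labeled spans over a token sequence as per-token BIO tags.

    This is one standard way to represent a segmentation-and-labeling
    problem (e.g., named entity recognition) as sequence labeling.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:  # spans are (start, end, label), end exclusive
        tags[start] = "B-" + label          # B- marks the beginning of a segment
        for i in range(start + 1, end):
            tags[i] = "I-" + label          # I- marks its continuation
    return tags

# Hypothetical NER-style example: "Noah Smith" is a person, "Montreal" a location.
demo = bio_encode(["Noah", "Smith", "visited", "Montreal"],
                  [(0, 2, "PER"), (3, 4, "LOC")])
```

Once segments are flattened into tags this way, any sequence labeling model can be applied to the segmentation problem.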


We consider a key abstract inference problem that turns up frequently in NLP: decoding (also known as maximum a posteriori inference). We discuss common techniques for decoding.
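For a sequence model such as an HMM, the classic decoding technique is the Viterbi algorithm, which finds the MAP label sequence by dynamic programming. Here is a minimal sketch; the toy part-of-speech parameters at the bottom are invented for illustration:

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Decoding: find the MAP label sequence under a first-order HMM."""
    # V[t][s] = best log-score of any label sequence ending in state s at time t
    V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] + log_trans[p][s])
            V[t][s] = V[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
            back[t][s] = prev
    # Follow backpointers from the best final state
    best = max(states, key=lambda s: V[-1][s])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Toy POS-tagging HMM (parameters are illustrative only).
lg = math.log
states = ["N", "V"]
start = {"N": lg(0.6), "V": lg(0.4)}
trans = {"N": {"N": lg(0.3), "V": lg(0.7)},
         "V": {"N": lg(0.8), "V": lg(0.2)}}
emit = {"N": {"they": lg(0.6), "fish": lg(0.4)},
        "V": {"they": lg(0.1), "fish": lg(0.9)}}
tags = viterbi(["they", "fish"], states, start, trans, emit)
```

The running time is O(n·|states|²), linear in the sentence length: this is what makes exact MAP inference tractable for chain-structured models.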

Supervised structured NLP

We consider the case where training data are available for structured learning. We discuss the relationship of grammars and automata to structured prediction and the widespread use of dynamic programming, with some specific examples. We then discuss a variety of approaches to supervised learning of structured prediction models.
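To make the connection between grammars and dynamic programming concrete, here is a minimal CKY recognizer for a context-free grammar in Chomsky normal form; the toy grammar and sentence are our own illustration, not drawn from the tutorial:

```python
def cky_recognize(words, lexicon, binary_rules, start="S"):
    """CKY: decide whether a CNF grammar derives the word sequence."""
    n = len(words)
    # chart[i][j] = set of nonterminals that derive words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = {A for A, word in lexicon if word == w}
    for span in range(2, n + 1):            # widths, smallest first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split points
                for A, (B, C) in binary_rules:
                    if B in chart[i][k] and C in chart[k][j]:
                        chart[i][j].add(A)
    return start in chart[0][n]

# Toy grammar (illustrative only): S -> NP VP, VP -> V NP
rules = [("S", ("NP", "VP")), ("VP", ("V", "NP"))]
lexicon = [("NP", "they"), ("V", "saw"), ("NP", "stars")]
ok = cky_recognize(["they", "saw", "stars"], lexicon, rules)
```

The same O(n³) chart, with scores and argmax backpointers in place of sets, yields the Viterbi parse under a weighted grammar; replacing max with sum yields inside probabilities.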

Unsupervised structured NLP

We turn to some trends in unsupervised NLP, where we seek to learn to predict structure that is not visible in the available data. We consider the EM algorithm, some successful models, and variations on EM, including latent variables, contrastive estimation, and more Bayesian approaches.
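The EM idea can be sketched with the simplest possible hidden structure: a two-component mixture of unigram distributions, where each document's component is never observed. The corpus, vocabulary, and asymmetric initialization below are invented for illustration:

```python
import math
from collections import defaultdict

def em_mixture(docs, vocab, n_iter=20):
    """EM for a 2-component mixture of unigram distributions.

    The hidden variable is each document's component assignment."""
    K = 2
    prior = [0.5, 0.5]
    # Deterministic, slightly asymmetric initialization (EM needs symmetry breaking)
    emit = [{w: 1.0 + 0.1 * ((i + k) % 2) for i, w in enumerate(vocab)}
            for k in range(K)]
    for k in range(K):
        z = sum(emit[k].values())
        for w in vocab:
            emit[k][w] /= z
    for _ in range(n_iter):
        # E-step: posterior over components for each document
        counts = [defaultdict(float) for _ in range(K)]
        new_prior = [0.0] * K
        for doc in docs:
            joint = [prior[k] * math.prod(emit[k][w] for w in doc)
                     for k in range(K)]
            z = sum(joint)
            for k in range(K):
                p = joint[k] / z
                new_prior[k] += p
                for w in doc:
                    counts[k][w] += p
        # M-step: re-estimate parameters from expected counts
        prior = [c / len(docs) for c in new_prior]
        new_emit = []
        for k in range(K):
            total = sum(counts[k].values())
            new_emit.append({w: counts[k][w] / total for w in vocab})
        emit = new_emit
    return prior, emit

# Tiny separable corpus: the components should specialize to "a" and "b".
vocab = ["a", "b"]
docs = [["a", "a"], ["a", "a"], ["b", "b"], ["b", "b"]]
prior, emit = em_mixture(docs, vocab)
```

Each iteration provably does not decrease the data likelihood, but EM only finds a local optimum; this sensitivity to initialization is one motivation for the variations (contrastive estimation, Bayesian methods) discussed in this part of the tutorial.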

Presenter bio

Noah Smith is an assistant professor in the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University's School of Computer Science. His research interests include statistical parsing, particularly unsupervised methods for parsing, multilingual NLP, and applications like machine translation and question answering.

Smith has taught a semester-long graduate course on the topic (known at CMU as "Language and Statistics II") for three years running. He received his Ph.D. in Computer Science from Johns Hopkins University in 2006 as a Fannie and John Hertz Foundation Fellow. At CMU, he has been the recipient of research grants from the NSF and DARPA, as well as awards from IBM, Google, and the Q-Group. He currently serves on the editorial board of Computational Linguistics.