Structured Prediction for Natural Language Processing
The slides are available, and a recording of the tutorial can be watched online.
This tutorial will discuss the use of structured prediction
methods from machine learning in natural language processing. The
field of NLP has, in the past two decades, come to simultaneously rely on
and challenge the field of machine learning. Statistical methods now
dominate NLP, and have moved the field forward substantially, opening
up new possibilities for the exploitation of data in developing NLP
components and applications. However, formulations of NLP problems
are often simplified for computational or practical convenience, at
the expense of system performance. This tutorial aims to introduce
several structured prediction problems from NLP, current solutions,
and challenges that lie ahead. Applications in NLP are a mainstay at
ICML conferences; many ML researchers view NLP as a primary or
secondary application area of interest. This tutorial will help the
broader ML community understand this important application area, how
progress is measured, and the trade-offs that make it a challenge.
The tutorial will be divided into three parts. The outline below is ambitious; some topics may be covered only briefly. We intend to give extensive references to important papers, so that participants can follow up on the topics that interest them most.
Representations and Data
We will discuss NLP tasks that can be seen as structured prediction
problems. These include sequence segmentation and labeling, syntactic
parsing, and translation discovery. We focus on the representation of
these problems, with some discussion of the data that might be
required for each.
We consider a key abstract inference problem
that turns up frequently in NLP: decoding (also known as maximum a
posteriori, or MAP, inference). We discuss common techniques for decoding.
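As a concrete illustration of decoding, MAP inference in a first-order sequence-labeling model (such as a hidden Markov model) can be carried out exactly with the Viterbi dynamic program. The sketch below is illustrative, not from the tutorial; the score tables and labels in the usage example are made up.

```python
# A minimal sketch of Viterbi decoding for a first-order sequence model.
# Scores are assumed to be log-probabilities (or arbitrary additive scores).

def viterbi(obs, states, start, trans, emit):
    """Return the highest-scoring (MAP) label sequence for obs."""
    # best[t][s]: score of the best path ending in state s at position t
    best = [{s: start[s] + emit[s][obs[0]] for s in states}]
    back = []  # backpointers for recovering the argmax path
    for t in range(1, len(obs)):
        scores, ptrs = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + trans[p][s]) for p in states),
                key=lambda x: x[1],
            )
            scores[s] = score + emit[s][obs[t]]
            ptrs[s] = prev
        best.append(scores)
        back.append(ptrs)
    # Follow backpointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return list(reversed(path))

# Hypothetical two-state part-of-speech example (scores invented).
states = ("N", "V")
start = {"N": 0.0, "V": -1.0}
trans = {"N": {"N": -1.0, "V": 0.0}, "V": {"N": 0.0, "V": -1.0}}
emit = {"N": {"dogs": 0.0, "bark": -2.0}, "V": {"dogs": -2.0, "bark": 0.0}}
viterbi(["dogs", "bark"], states, start, trans, emit)  # → ['N', 'V']
```

The same max-and-backpointer recurrence generalizes to other structures (e.g., the CKY algorithm for parsing), which is why dynamic programming recurs throughout the tutorial.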
Supervised structured NLP
We consider the case where training data are available for structured
learning. We discuss the relationship of grammars and automata to
structured prediction and the widespread use of dynamic programming, with some
specific examples. We then survey a variety of approaches to supervised
learning of structured prediction models.
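One widely used supervised approach in this family is the structured perceptron (Collins, 2002): decode with the current weights, then reward the gold structure's features and penalize the prediction's. The sketch below is a toy, with an invented feature map and a brute-force decoder (real systems would decode with dynamic programming, as above).

```python
# A minimal sketch of structured perceptron training for sequence labeling.
# The feature function, data, and brute-force decoder are illustrative toys.
from itertools import product
from collections import defaultdict

def features(words, labels):
    """Hypothetical feature map: emission and transition indicators."""
    feats = defaultdict(int)
    for i, (w, y) in enumerate(zip(words, labels)):
        feats[("emit", w, y)] += 1
        if i > 0:
            feats[("trans", labels[i - 1], y)] += 1
    return feats

def decode(words, weights, label_set):
    """Brute-force argmax over all label sequences (toy-sized inputs only)."""
    def score(labels):
        return sum(weights.get(f, 0.0) * v
                   for f, v in features(words, labels).items())
    return max(product(label_set, repeat=len(words)), key=score)

def train(data, label_set, epochs=5):
    """Perceptron updates: add gold features, subtract predicted features."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = decode(words, weights, label_set)
            if pred != tuple(gold):
                for f, v in features(words, gold).items():
                    weights[f] += v
                for f, v in features(words, pred).items():
                    weights[f] -= v
    return weights
```

The update touches only features of the gold and predicted structures, so training cost is dominated by decoding, which is one reason efficient inference matters so much in structured NLP.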
Unsupervised structured NLP
We turn to some trends in unsupervised NLP, where we seek to learn
to predict structure that is not visible in the available data.
We consider the EM
algorithm, some successful models, and variations on EM, including
latent variables, contrastive estimation, and more fully Bayesian approaches.
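To make the EM pattern concrete, here is the classic two-coin example (not from the tutorial; the data and initialization are illustrative): each session of flips comes from one of two coins of unknown bias, and the coin identity is the hidden variable. The E-step computes posterior responsibilities; the M-step re-estimates each bias from expected counts.

```python
# A minimal sketch of EM for a mixture of two biased coins.
# Sessions and starting biases below are illustrative, not real data.
from math import comb

def em(sessions, n_flips, theta=(0.6, 0.5), iters=20):
    """sessions: heads count per session; returns estimated biases (a, b)."""
    a, b = theta
    for _ in range(iters):
        # E-step: posterior responsibility of coin A for each session.
        heads_a = tails_a = heads_b = tails_b = 0.0
        for h in sessions:
            t = n_flips - h
            like_a = comb(n_flips, h) * a**h * (1 - a)**t
            like_b = comb(n_flips, h) * b**h * (1 - b)**t
            r = like_a / (like_a + like_b)
            heads_a += r * h
            tails_a += r * t
            heads_b += (1 - r) * h
            tails_b += (1 - r) * t
        # M-step: re-estimate each coin's bias from expected counts.
        a = heads_a / (heads_a + tails_a)
        b = heads_b / (heads_b + tails_b)
    return a, b
```

In NLP the hidden variables are typically structures (alignments, trees, tags) rather than coin identities, so the E-step itself requires the dynamic-programming inference discussed earlier; the alternating expectation/re-estimation pattern is the same.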
Noah Smith is an assistant professor in the Language Technologies
Institute and Machine Learning Department at Carnegie Mellon
University's School of Computer Science. His research interests include
statistical parsing, particularly unsupervised methods for parsing,
multilingual NLP, and applications like machine translation and question answering.
Smith has taught a
semester-long graduate course on the topic (known at CMU as "Language
and Statistics II") for three years running. He received his Ph.D. in
Computer Science from Johns Hopkins University in 2006 as a Fannie and
John Hertz Foundation Fellow. At CMU, he has been the recipient of
research grants from the NSF and DARPA, as well as awards from IBM,
Google, and the Q-Group. He currently serves on the editorial board
of Computational Linguistics.