Geoffrey J. Gordon, Le Song, and Byron Boots


Slides, sample code, and videos are now available (see below on this page).

Topic Overview

Many problems in machine learning involve collecting high-dimensional multivariate observations or sequences of observations, and then hypothesizing a compact model which explains these observations. An appealing representation for such a model is a latent variable model that relates a set of observed variables to an additional set of unobserved or hidden variables.

Examples of popular latent variable models include latent tree graphical models and dynamical system models, both of which occupy a fundamental place in engineering, control theory, economics as well as the physical, biological, and social sciences. Unfortunately, to discover the right latent state representation and model parameters, we must solve difficult structural and temporal credit assignment problems. Work on learning latent variable structure has predominantly relied on likelihood maximization and local search heuristics such as expectation maximization (EM); these heuristics often lead to a search space with a host of bad local optima, and may therefore require impractically many restarts to reach a prescribed training precision.

This tutorial will focus on a recently-discovered class of spectral learning algorithms. These algorithms hold the promise of overcoming these problems and enabling learning of latent structure in tree and dynamical system models. Unlike the EM algorithm, spectral methods are computationally efficient, statistically consistent, and have no local optima; in addition, they can be simple to implement, and have state-of-the-art practical performance for many interesting learning problems.

We will describe the main theoretical, algorithmic, and empirical results related to spectral learning algorithms, starting with an overview of linear system identification results obtained in the last two decades, and then focusing on the remarkable recent progress in learning nonlinear dynamical systems, latent tree graphical models, and kernel-based nonparametric models.

Target Audience

This tutorial is meant for a broad audience, including students and researchers interested in machine learning in general, or specifically interested in recent advances in spectral learning algorithms. No specific knowledge will be required beyond basic linear algebra, probability, and statistics; the tutorial is self-contained and most of the foundational concepts are introduced during the presentation.

The participants will learn about the fundamental aspects of spectral learning algorithms applied to a diverse set of problems, including system identification for linear and non-linear dynamical systems, latent tree graphical models, and kernel-based non-parametric graphical models. We will include a concise presentation of the main algorithms, the most recent theoretical results, and an overview of empirical results in the above domains. We will provide sample code for some of the demonstrations, to allow participants to experiment on their own with effective spectral learning algorithms.

Content Details

We now have both slides and sample code available. The slides are in 3 parts:

  • Part I: Geoff Gordon presents an introduction and overview of spectral learning
  • Part II: Byron Boots presents spectral learning for time series data (HMMs, Kalman filters, etc.)
  • Part III: Le Song presents spectral learning for latent trees

The sample code (for learning from time series data) is here.

Unfortunately, the talk videos are currently missing, due to a problem at the hosting end: see this post for more info.  If the videos return, they will be here.

About the Organizers

[mug shot]

Geoff Gordon is an Associate Research Professor in the Department of Machine Learning at Carnegie Mellon University, and co-director of the Department's Ph. D. program. He works on multi-robot systems, statistical machine learning, game theory, and planning in probabilistic, adversarial, and general-sum domains. His previous appointments include Visiting Professor at the Stanford Computer Science Department and Principal Scientist at Burning Glass Technologies in San Diego. Dr. Gordon received his B.A. in Computer Science from Cornell University in 1991, and his Ph.D. in Computer Science from Carnegie Mellon University in 1999.

Le Song is an Assistant Professor in the College of Computing at Georgia Institute of Technology. He works on kernel methods and nonparametric graphical models, with applications to computer vision, computational biology and social media problems. His previous appointments include Research Scientist at Google Research and Postdoctoral Fellow at Carnegie Mellon University. Dr. Song received his Ph.D. in computer science from the University of Sydney in 2008. He has given guest lectures on Computational Biology and Graphical Models at Carnegie Mellon University and he is now teaching Advanced Topics in Machine Learning at Georgia Institute of Technology.
Byron Boots is a Ph.D. candidate in the Machine Learning Department at Carnegie Mellon University. He works on spectral system identification methods, predictive state representations, reinforcement learning, and robotics. Prior to becoming a graduate student, he was an Associate in Research in the Center for Cognitive Neuroscience at Duke University and an engineer at MobileRobots Inc. He received his B.A. in Computer Science and Philosophy from Bowdoin College in 2003, and an M.S. in Machine Learning from Carnegie Mellon University in 2009.