Ph.D. Thesis: Learning Large-Scale Conditional Random Fields

Machine Learning Department, Carnegie Mellon University, 2013.
Thesis (PDF)
Defense Slides (PPT)

My thesis is on tractable learning methods for large-scale Conditional Random Fields (CRFs) (Lafferty et al., 2001). CRFs are probabilistic graphical models of conditional distributions P(Y|X), where Y is a set of output random variables and X is a set of observed input variables. My thesis has three parts: CRF parameter learning, CRF structure learning, and parallel learning for CRFs.
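
For reference, this is the standard CRF factorization from Lafferty et al. (2001): the conditional distribution factors over the cliques of an undirected graph, with a partition function that depends on the observed input.

```latex
% CRF factorization over the cliques \mathcal{C} of an undirected graph
% (Lafferty et al., 2001). The input-dependent partition function Z(x)
% is what makes exact inference and learning intractable in general.
P(y \mid x) = \frac{1}{Z(x)} \prod_{c \in \mathcal{C}} \psi_c(y_c, x),
\qquad
Z(x) = \sum_{y'} \prod_{c \in \mathcal{C}} \psi_c(y'_c, x)
```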

Conditional Random Field (CRF) Parameter Learning

I am researching tractable methods for learning the parameters of CRFs with arbitrary structures. We use decomposable learning objectives (composite likelihoods) which avoid intractable inference during learning, yet still come with strong theoretical guarantees for finite sample sizes.
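
As an illustration, here is a minimal sketch of pseudolikelihood, the simplest composite likelihood, for a pairwise binary CRF. The Ising-style parameterization and all names below are illustrative assumptions rather than the thesis's exact formulation; the point is that each term conditions on observed neighbor values, so no global partition function is ever computed.

```python
# Minimal pseudolikelihood sketch for a pairwise binary CRF with labels in
# {-1, +1}. The Ising-style parameterization is an illustrative assumption,
# not the thesis's exact model.
import numpy as np

def neg_log_pseudolikelihood(theta_node, theta_edge, edges, x, y):
    """Sum over variables i of -log P(y_i | y_{neighbors of i}, x).

    theta_node: (n, d) node feature weights; theta_edge: (m,) edge weights.
    edges: list of m pairs (i, j); x: (n, d) features; y: (n,) labels in {-1,+1}.
    """
    # Local "field" for each variable: node potential plus edge contributions
    # from the observed neighbor labels (conditioning on them is what makes
    # each term tractable).
    field = (theta_node * x).sum(axis=1)
    for w, (i, j) in zip(theta_edge, edges):
        field[i] += w * y[j]
        field[j] += w * y[i]
    # With the +/-1 encoding, P(y_i | rest) = sigmoid(2 * y_i * field_i),
    # so -log P = log(1 + exp(-2 * y_i * field_i)).
    return np.logaddexp(0.0, -2.0 * y * field).sum()
```

In practice one would minimize such an objective with a gradient-based optimizer; composite likelihoods generalize this idea by conditioning on larger blocks of variables at once.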

CRF Structure Learning

I am also researching how to learn tractable (low-treewidth) structures for CRFs. Little prior work exists on CRF structure learning, but our techniques permit efficient learning of tree structures. In experiments, our methods accurately recover ground-truth models from synthetic data and perform well on an fMRI application.
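
For background, the classic tree-structure learner in the generative setting is the Chow-Liu algorithm: weight each candidate edge by empirical mutual information, then take a maximum-weight spanning tree. The sketch below implements that generative version; the thesis addresses the harder conditional (CRF) analogue, so treat this as context rather than the thesis method.

```python
# Chow-Liu-style tree learning: maximum-weight spanning tree over empirical
# pairwise mutual information (Kruskal's algorithm with union-find).
import numpy as np
from itertools import combinations

def mutual_information(a, b):
    """Empirical mutual information between two discrete sample vectors."""
    n = len(a)
    joint = {}
    for va, vb in zip(a, b):
        joint[(va, vb)] = joint.get((va, vb), 0) + 1
    pa = {v: np.mean(a == v) for v in set(a)}
    pb = {v: np.mean(b == v) for v in set(b)}
    return sum((c / n) * np.log((c / n) / (pa[va] * pb[vb]))
               for (va, vb), c in joint.items())

def chow_liu_tree(samples):
    """samples: (num_samples, num_vars) array of discrete values.
    Returns the edges of a maximum-MI spanning tree."""
    n_vars = samples.shape[1]
    scored = sorted(((mutual_information(samples[:, i], samples[:, j]), i, j)
                     for i, j in combinations(range(n_vars), 2)), reverse=True)
    parent = list(range(n_vars))
    def find(u):                      # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:                  # keep the edge only if it adds no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```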

Parallel Learning for CRFs

My research on CRF parameter and structure learning relies on methods that break large problems into many smaller regression problems. I am therefore researching parallel methods for sparse regression that exploit sparsity to improve scalability. Our ICML paper below uses statistical properties of the data to permit parallel optimization on multicore machines, and my current research examines distributed methods that communicate only sparse information.
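
In the spirit of that ICML paper, here is a hedged sketch of Shotgun-style parallel coordinate descent for the Lasso. For clarity, the batch of coordinate updates is simulated sequentially from one stale residual snapshot; a real multicore implementation would run them concurrently, and how large a batch is safe depends on how correlated the features are (the statistical property the paper analyzes).

```python
# Shotgun-style parallel coordinate descent for the Lasso:
#   minimize 0.5 * ||X w - y||^2 + lam * ||w||_1.
# The "parallel" batch is simulated sequentially here; every update in a
# round reads the same stale residual snapshot, mimicking concurrent threads.
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def shotgun_lasso(X, y, lam, batch_size=4, n_rounds=200, seed=0):
    """Assumes the columns of X are normalized to unit L2 norm."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    resid = y - X @ w
    for _ in range(n_rounds):
        coords = rng.choice(d, size=batch_size, replace=False)
        snapshot = resid.copy()       # all updates in this round see stale state
        for j in coords:
            # Closed-form coordinate update for unit-norm columns.
            w_new = soft_threshold(w[j] + X[:, j] @ snapshot, lam)
            resid += X[:, j] * (w[j] - w_new)   # keep the true residual exact
            w[j] = w_new
    return w
```

With nearly uncorrelated features, many coordinates can be updated per round with little interference; with highly correlated features, the stale updates can conflict, which is why the safe batch size matters.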