Course Description

Date Lecture Topics Readings Handouts
Module 1: Representation
Wed 9th Sept Lecture 1 : Introduction
Slides Annotated Slides
Introduction to and Examples of Graphical Models
  • Logistics
  • A broad overview
  • The running example: hidden Markov model

Chpt. 1
An Introduction to Graphical Models


Mon 14th Sept Lecture 2 : An Introduction to Bayesian Networks
Slides Annotated Slides
Representation of Bayesian Networks
  • Bayesian networks
  • Factorization theorem
  • Local structure and independencies
  • I-MAPs
  • I-equivalence, Minimal I-MAPs, Perfect MAP
  • Examples : Gaussian models, HMM
Chpt. 3, 7.1

Wed 16th Sept Lecture 3 : An Introduction to Undirected Graphical Models
Slides Annotated Slides
Representation of Markov Random Fields
  • Clique potentials
  • Local and global Markov independencies
  • Hammersley-Clifford theorem
  • Soundness and completeness in Markov random fields
  • Examples : Boltzmann machines, Ising models, Gaussian graphical models, conditional random fields
Chpt 4
Optional : Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

Mon 21st Sept Lecture 4 : A unified view of BN and MRF
Slides Annotated Slides
A unified view of BN and MRF
  • Graphical lasso for structure learning in Gaussian graphical models
  • Minimal I-MAPs from BN to MRF : Markov blanket
  • Minimal I-MAPs from MRF to BN : chordal graphs and triangulation
Chpt 4.5
Pseudo-Likelihood Based Structure Estimation using Neighborhood Estimation: Neighborhood Selection in Gaussian Graphical Models
Likelihood Based Structure Estimation of GGMs: Glasso

Module 2: Basic Inference and Learning Methods
Wed 23rd Sept Lecture 5 : Learning one-node GM
Slides Annotated Slides
Learning one-node GMs
  • Parameter learning from IID models
  • Maximum likelihood estimation
  • Bayesian and frequentist parameter estimation
  • Example : Bernoulli model, multinomial model, plate model
  • Hierarchical Bayesian estimation for multinomials and Gaussians
  • Multinomial model : Dirichlet versus Logistic Normal prior
Chpt 17.1, 17.3
Pattern Recognition and Machine Learning : Chpt 2.3
HW1 out.
Mon 28th Sept Lecture 6 : Learning two-node GM
Slides Annotated Slides
Two-node graphical models
  • Generative vs. discriminative classifiers
  • Optimal classification via the Bayes classifier
  • Maximum likelihood vs. Bayesian estimation of conditional Gaussians
  • Linear regression
  • Least mean squares or Widrow-Hoff learning rule
  • Bayesian linear regression and regularized regression
Required : On Discriminative vs. Generative Classifiers: A comparison of logistic regression and Naive Bayes.
Pattern Recognition and Machine Learning : Chpt 1.2, 9.2, 9.3

Wed 30th Sept Lecture 7 : Exponential Families
Slides Annotated Slides
Generalized Linear Models (GLIM)
  • Exponential Families
  • Examples : linear regression, logistic regression, multivariate Gaussian distribution, multinomial distribution
  • Moment estimation
  • MLE and batch learning for GLIMs
  • Iteratively reweighted least squares
  • MLE for BNs : decomposable likelihood
  • Relationship with KL divergence
  • Supervised parameter estimation for HMMs
Chpt 17.2, 17.3, 17.4
Additional Readings :
1. Parameter Priors for Directed Acyclic Graphical Models and the Characterization of Several Probability Distributions.
2. A Characterization of the Dirichlet Distribution Through Global and Local Parameter Independence

Mon 5th Oct Lecture 8 : Variable Elimination
Slides Annotated Slides
Inference via Elimination
  • Probabilistic inference : likelihood, conditional probability, most probable assignment
  • Marginalization and elimination
  • Examples : Elimination on chains, hidden Markov models, and CRFs
  • Sum-product operation
  • Dealing with evidence variables
Chpt 9.1, 9.2, 9.3, 9.4

Wed 7th Oct Lecture 9 : Belief Propagation
Slides Annotated Slides
Belief Propagation
  • From elimination to message passing
  • Message passing for trees
  • Correctness of Belief Propagation
  • Parallel synchronous versus sequential implementation
  • Factor graphs
  • Max product algorithm
Chpt. 10.1, 10.2, 10.3
A useful tutorial is here.
HW1 due.
HW2 out.
Project Proposal due.
Mon 12th Oct Lecture 10 : Junction Trees
Slides Annotated Slides
Junction Trees
  • From Elimination to Message Passing
  • Junction Tree Algorithm
  • Triangulation
  • Case Study : Hidden Markov Model
  • Forward Backward Algorithm
  • Viterbi Algorithm
Chpts 10.1, 10.2, 10.3, 11.3, 10.4
Forward Backward Search Algorithm
Viterbi Algorithm

Wed 14th Oct Lecture 11 : Expectation-maximization algorithm
Slides Annotated Slides
Expectation-maximization algorithm
  • Mixture Models
  • Gaussian Mixture Models (GMMs)
  • Expectation Maximization (EM) - learning from partially observed data
  • Lower bounds and free energy
  • EM for HMMs - Baum-Welch algorithm
  • EM for general BNs, conditional mixture-of-experts model
Chpts. 19.1, 19.2.2, 19.2.3
A tutorial on HMMs
Some interesting aspects of EM

Module 3 : Case Studies : Popular Graphical Models
Mon 19th Oct Lecture 12 : HMM and CRF
Slides Annotated Slides
  • Hidden Markov Models
  • Forward Backward Algorithm (Junction Tree algorithm)
  • Viterbi decoding
  • Supervised and unsupervised learning (Baum-Welch) for HMMs
  • Maximum entropy Markov models, label bias
  • Conditional Random Fields
  • CRF inference and learning
1. A tutorial on HMMs
2. CRF Tutorial by Hanna Wallach
3. The original CRF paper
4. Shallow Parsing with CRFs

Wed 21st Oct Lecture 13 : Multivariate Gaussian models, Gaussian graphical models
Slides Annotated Slides
Gaussian graphical models
  • Covariance vs. precision matrix in GGMs
  • Sparse covariance matrix vs. sparse precision matrix
  • The Meinshausen-Buhlmann (MB) algorithm : graph regression
  • L1-regularized maximum likelihood learning
  • KELLER: Kernel Weighted L1-regularized Logistic Regression
1. Covariance Selection - the original GGM paper by Dempster
2. Meinshausen-Buhlmann algorithm
3. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian
4. Glasso
HW2 due.
HW3 out.
Mon 26th Oct Lecture 14 : State space models
Slides Annotated Slides
  • Factor Analysis
  • Constrained Covariance Gaussian
  • Inference and EM for factor analysis
  • Independent Components Analysis
  • State Space Models (SSMs)
  • Online vs. Offline Inference in SSMs
  • Kalman Filter
  • Rauch-Tung-Striebel smoother
  • Nonlinear systems : extended Kalman filter
1. Chpt 15.4
2. An introduction to the Kalman filter
3. Variational Learning for Switching State-Space Models
4. A discrete state-space model for linear image processing

Wed 28th Oct Lecture 15 : Complex Graphical Models
Slides Annotated Slides
  • Complex Dynamic Networks
  • Dynamic Bayesian Networks (DBNs)
  • Factorial HMM (fHMM), switching HMMs, Hidden Markov Decision Trees
  • Switching SSMs
  • Junction tree for coupled HMMs
  • Latent Semantic Indexing, Topic Models, Admixture Models, Mixed Membership Models, Latent Dirichlet Allocation
1. Latent Semantic Indexing
2. Dynamic Bayesian Networks
3. Factorial Hidden Markov Models
4. Latent Dirichlet Allocation

Module 4: Approximate Inference
Mon 2nd Nov Lecture 16 : Variational inference I
Slides Annotated Slides
  • Energy Functional, KL Divergence
  • Bethe Approximation to Gibbs Free Energy
  • Bethe = BP on Factor Graphs
  • Loopy Belief Propagation
  • Region-based Approximations to the Gibbs Free Energy (Kikuchi)
  • Generalized Belief Propagation
1. Chpt. 11.1, 11.2, 11.3
2. Stable fixed points of loopy belief propagation are minima of the Bethe free energy

Wed 4th Nov Lecture 17 : Variational inference II
Slides
  • Mean parametrization for exponential family GMs
  • Variational inference
  • Bethe variational inference, connection to sum-product
  • Kikuchi approximation
  • Mean Field and KL divergence

1. Chpt. 11
2. Bethe free energy, Kikuchi approximations, and belief propagation algorithms
HW3 due.
HW4 out.
Midway progress report due.
Mon 9th Nov Lecture 18 : Monte Carlo 1
Slides Annotated Slides
Monte Carlo methods
  • Direct sampling
  • Rejection sampling, importance sampling
  • Likelihood weighting
  • Rao-Blackwellised sampling
Chpt. 12.1, 12.2
Wed 11th Nov Lecture 19 : Monte Carlo 2
Slides Annotated Slides
  • Markov Chains
  • Metropolis-Hastings
  • Gibbs sampling
Chpt. 12.3, 12.4

Module 5 : Advanced learning methods
Mon 16th Nov Lecture 20 : Applications 1 : Topic Models
Slides Annotated Slides
  • Structured and semantic-driven browsing of dynamic, multi-modal information
  • Examples : Ideological polarity, total scene understanding, machine translation, topic evolution
  • Mixed membership models aka topic models
  • Bayesian inference : variational inference, collapsed Gibbs sampling
  • Joint topic and perspective models
  • Evolving social networks
  • Mixed membership stochastic block model, generalized mean field.
1. Latent Dirichlet allocation : David M. Blei, Andrew Ng, Michael Jordan
2. A correlated topic model of Science : David M. Blei and John D. Lafferty
3. On Tight Approximate Inference of Logistic-Normal Admixture Model : Amr Ahmed and Eric P. Xing
4. An introduction to variational methods for graphical models : MI Jordan, Z Ghahramani, TS Jaakkola, LK …
5. Graph partition strategies for generalized mean field inference : E.P. Xing, M.I Jordan and S. Russell
6. Finding scientific topics : Griffiths, Steyvers
7. A Joint Topic and Perspective Model for Ideological Discourse : W.-H. Lin, E. P. Xing, and A. Hauptmann
8. Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework : L.-J. Li, R. Socher and L. Fei-Fei
9. HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation : B. Zhao and E.P. Xing
10. Mixed membership stochastic block models for relational data, with applications to protein-protein interactions : E.M. Airoldi, D.M. Blei, E.P. Xing and S.E. Fienberg

Wed 18th Nov Lecture 21 : MLE of undirected graphical models
Slides Annotated Slides
MLE for graphical models
  • Conditions on clique marginals for MLE
  • MLE for decomposable undirected models
  • Iterative proportional fitting
  • IPF minimizes KL divergence
  • MLE of feature based models : Generalized Iterative Scaling (GIS)
  • Maximum entropy formulation
1. Chpt. 20.1, 20.2, 20.3
2. Generalized iterative scaling for log-linear models

Mon 23rd Nov Lecture 22 : Max-margin learning of graphical models
Slides Annotated Slides
  • Conditional Random Fields (CRFs)
  • Max-margin Markov Networks (M3Ns)
  • Large Margin Estimation, Min-max Formulation
  • Primal and Dual Problems of M3Ns
  • Maximum Entropy Discrimination Markov Networks
  • Gaussian/Laplacian MaxEnDNet
  • Supervised Topic Models
1. Max-Margin Markov Networks
2. Laplace Maximum Margin Markov Networks
3. MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification
HW 4 due.
Wed 25th Nov No Class




Mon 30th Nov Lecture 23 : Nonparametric Bayesian Models
Slides Annotated Slides
  • Model selection vs. posterior inference
  • Relationship between Dirichlet process and infinite mixtures
  • Dirichlet process, stick breaking, Chinese restaurant process
  • Approximate inference via MCMC, variational inference, e.g. haplotype inference
  • Hierarchical Dirichlet process and multi-task clustering
  • Hidden Markov Dirichlet process, temporal DPM
1. Bayesian Haplotype Inference via the Dirichlet Process
2. Variational inference for Dirichlet process mixtures
3. Collapsed variational Dirichlet process mixture models
4. Hierarchical Dirichlet processes
5. Hidden Markov Dirichlet Process: Modeling Genetic Recombination in Open Ancestral Space
Project Poster Session
Wed 2nd Dec Lecture 24 : How to put things together
Slides Annotated Slides
  • Representation, model semantics.
  • Topic models, choice of priors
  • LoNTAM variational inference
  • Evaluation, testing inference
  • Deterministic annealing
  • Supervised LDA, MedLDA

Final Project Report due.
 

© 2009 Eric Xing @ School of Computer Science, Carnegie Mellon University