Probabilistic Graphical Models

10-708, Fall 2007

Eric Xing

School of Computer Science, Carnegie Mellon University

Syllabus and Course Schedule



Lectures, readings, online materials

Homeworks, Exams, Deadlines

Module 0: introduction

(1 Lecture)

·      Lecture 0: 9/10/07  (Xing)

Slides (annotated slides)


0.     Introduction

a.     Logistics

b.     A broad overview

c.     The running example: hidden Markov model




Module 1: Fundamentals of Graphical Models:

basic representation, inference, and learning

Basic representation
(3 Lectures)

  • Lecture 1: 9/12/07  (Xing)

Slides (annotated slides)


1.     Representation of directed GM

a.     Independencies in directed graphs

b.     The Bayes-ball algorithm

c.     I-map

d.     Tabular CPD

e.     Equivalence theorem
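The explaining-away behavior that the Bayes-ball rules encode for v-structures can be checked numerically. A minimal sketch (a toy XOR collider; the model and numbers are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
# v-structure X -> Z <- Y: X and Y are marginally independent,
# but become dependent once the collider Z is observed.
n = 200_000
x = rng.integers(0, 2, n)
y = rng.integers(0, 2, n)
z = x ^ y                                       # Z = XOR(X, Y)

corr_xy = np.corrcoef(x, y)[0, 1]               # ~0: marginally independent
corr_given_z0 = np.corrcoef(x[z == 0], y[z == 0])[0, 1]
print(corr_xy, corr_given_z0)                   # conditioning on Z=0 forces X == Y
```

Conditioning on Z = 0 makes X and Y perfectly correlated, exactly the "ball passes through an observed collider" case in the Bayes-ball algorithm.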




  • Lecture 2: 9/17-19/07  (Xing)  

Slides (annotated slides)


2.     Representation of undirected GM

a.     Independencies in undirected graphs

b.     I-map

c.     Factors and potentials

d.     Hammersley-Clifford theorem

e.     Parameterization and over-parameterization; factor graphs



hw1 out

  • Lecture 3 9/24/07  (Xing)

Slides (annotated slides)


3.     Unified view of BN and MRF

a.     The representational difference between BN and MRF

b.     Moral graph, chordal graph, triangulation, and clique trees

c.     Partially directed models

d.     CRF


Recitation: Review of Lectures 1-3, 21st Sept



Exact Inference
(3 Lectures)
  • Lecture 4 9/26/07  (Xing)

Slides (annotated slides)


4.     Elimination algorithm

a.     The basic algorithm

b.     Dealing with evidence

c.     Complexity

d.     Elimination ordering

e.     MAP queries

f.      Example: HMM
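The elimination algorithm on a small chain fits in a few lines. A toy sketch, with made-up CPD numbers (not from the lecture):

```python
import numpy as np

# Chain BN A -> B -> C: compute P(C) by eliminating A, then B.
# All variables binary; the CPD entries below are illustrative.
p_a = np.array([0.6, 0.4])                    # P(A)
p_b_given_a = np.array([[0.9, 0.1],           # P(B | A=0)
                        [0.2, 0.8]])          # P(B | A=1)
p_c_given_b = np.array([[0.7, 0.3],           # P(C | B=0)
                        [0.5, 0.5]])          # P(C | B=1)

# Eliminate A: tau_B(b) = sum_a P(a) P(b|a)
tau_b = p_a @ p_b_given_a
# Eliminate B: P(c) = sum_b tau_B(b) P(c|b)
p_c = tau_b @ p_c_given_b

print(p_c, p_c.sum())  # a valid marginal distribution: sums to 1
```

Each elimination produces an intermediate factor (here `tau_b`); the cost of the algorithm is governed by the size of the largest such factor, which is where the elimination ordering matters.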


Recitation: Review of Lectures 3-4, 27th Sept



  • Lecture 5: 10/1/07  (Xing)

Slides (annotated slides)


5.     Message-passing algorithm on trees and on the original graph

a.     The algorithm: sum-product and belief update

b.     Incorrectness of SP on general graphs

c.     Example:




  • Lecture 6: 10/3/07  (Xing)

Slides (annotated slides)


6.     Junction tree algorithm

a.     Clique tree from variable elimination

b.     Clique tree from the chordal graph

c.     The junction tree property

d.     The calibrated clique tree and its distribution

e.     The junction tree algorithm

Recitation: Review of Max-Product and Junction Tree, 4th Oct



Learning I: Bayesian Networks
(4-5 Lectures)
  • Lecture 7: 10/8/07  (Xing)

Slides (annotated slides)


7.     Learning one-node GM

a.     Plates

b.     Multinomial and Gaussian

c.     Density estimation (of discrete and continuous models)

d.     Maximum likelihood estimators (frequentist approach)

e.     Bayesian approaches.
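The frequentist/Bayesian contrast for one-node models can be made concrete with coin flips. A toy sketch (the data and the Beta(2, 2) prior are made-up illustrative choices):

```python
import numpy as np

# Coin-flip density estimation: MLE vs. posterior mean under a Beta prior.
flips = np.array([1, 1, 0, 1, 0, 1, 1, 1])      # 6 heads, 2 tails
n_heads, n = flips.sum(), len(flips)

theta_mle = n_heads / n                          # maximum likelihood estimate
a, b = 2.0, 2.0                                  # Beta(a, b) prior pseudo-counts
theta_bayes = (n_heads + a) / (n + a + b)        # posterior mean

print(theta_mle, theta_bayes)                    # 0.75 vs. 8/12
```

The prior pseudo-counts shrink the estimate toward 1/2; as n grows, the two estimators coincide.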



hw2 out

hw1 in.

  • Lecture 8: 10/10/07  (Xing)

Slides (annotated slides)


8.     Learning two-node GM

a.     Regression and classification

b.     Generative models and discriminative models

c.     Supervised learning

Recitation: Review of LMS and geometric interpretations, 11th Oct


 Project description due

  • Lecture 9: 10/15/07  (Xing)

Slides (annotated slides)


9.     Learning tabular CPTs of structured full BNs

a.     Parameter independence and Global decomposition

b.     Local decomposition

c.     Formal foundation: sufficient statistics and exponential family




  • Lecture 10: 10/17/07  (Xing)

Slides (annotated slides)


10.  Structure learning

a.     Structure scores

b.     Search

c.     Chow-Liu algorithm

d.     Context-specific independence

Recitation: Review of Exponential Families, GLIMs, 18th Oct


  • Lecture 10 (continued): 10/22/07  (Xing)

Slides (annotated slides)


hw3 out

hw2 in.

  • Lecture 11: 10/24/07  (Xing)

Slides (annotated slides)


11.  EM: learning from partially observed data

a.     Two node graphs with discrete hidden variables;

b.     Mixture models and clustering problems;

c.     Unsupervised learning and the EM algorithm.
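The EM updates for a simple mixture model take only a few lines. A toy sketch for a two-component 1-D Gaussian mixture with unit variances (synthetic data; all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data from two 1-D Gaussians centered at -2 and 3
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# EM for a 2-component mixture with fixed unit variances
mu, w = np.array([-1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibilities r[n, k] proportional to w_k N(x_n | mu_k, 1)
    log_r = np.log(w) - 0.5 * (x[:, None] - mu) ** 2
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: weighted maximum-likelihood updates
    nk = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / nk
    w = nk / len(x)

print(mu)  # approximately the true means -2 and 3
```

The E-step is soft clustering (inference over the hidden assignment), and the M-step is the fully observed MLE from Lecture 7 with fractional counts.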



Case studies: Popular Bayesian networks and MRF


  • Lecture 12: 10/29/07  (Xing)

Slides (annotated slides)


HMM revisit (brief, to be covered fully in recitation)

a.     Forward-backward

b.     Viterbi

c.     Baum-Welch

d.     A case study: gene finding from DNA sequence
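The forward recursion alpha_t(i) = P(x_1..x_t, z_t = i) is a one-line update per time step. A toy two-state HMM sketch (all parameters made up for illustration):

```python
import numpy as np

pi = np.array([0.5, 0.5])                 # initial state distribution
A  = np.array([[0.9, 0.1],                # A[i, j] = P(z_t = j | z_{t-1} = i)
               [0.3, 0.7]])
B  = np.array([[0.8, 0.2],                # B[i, k] = P(x_t = k | z_t = i)
               [0.1, 0.9]])

def forward(obs):
    """Forward algorithm: returns the likelihood P(x_1..x_T)."""
    alpha = pi * B[:, obs[0]]
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]     # sum over previous state, then emit
    return alpha.sum()

print(forward([0, 0, 1]))
```

Replacing the sum in `alpha @ A` with a max (plus back-pointers) gives Viterbi; Baum-Welch wraps this inside EM.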

CRF model

12.  CRF

a.     Representation

b.     CRF vs HMM

Reading: Lafferty et al.; Sha et al.


  • Lecture 13: 10/31/07  (Xing)

Slides (annotated slides)


Multivariate Gaussian Models


13.  Factor analysis

a.     Manipulating multivariate Gaussian

b.     Some useful linear algebra and matrix calculus

c.     Factor analysis and the EM algorithm



Midterm project milestone due


  • Lecture 14: 11/05/07  (Ramesh/Xing)

Slides (annotated slides)

Temporal models


14.  Kalman filter

a.     RTS algorithm

b.     The junction tree version of RTS



  • Lecture 15: 11/05/07  (Ramesh/Xing)

Slides (annotated slides)


Complex Graphical Models


15.  Overview of intractable popular BNs and MRFs

a.     Dynamic Bayesian networks,

b.     Bayesian admixture models (LDA)

c.     Ising models

d.     Feature-based models

e.     General exponential family models

f.      The need for approximate inference


Reading: fHMMs, Switching SSM, LDA

Approximate Inference


  • Lecture 16: 11/7/07  (Hetu/Xing)

Slides (annotated slides)
Eric's slides


16.  Variational inference 1:

a.     Theory of loopy belief propagation,

b.     Bethe free energy, Kikuchi

c.     Case study on Ising model


Reading: Yedidia et al, 2004

hw4 out

hw3 in.


  • Lecture 17: 11/12-14/07  (Xing)

Slides (annotated slides)


17.  Variational inference 2:

a.     Theory of mean field inference,

b.     Lower bounds,

c.     Structured mean field algorithms,

d.     Variational Bayesian learning,

e.     The generalized mean field algorithm.

f.      Case study on LDA


Generalized Mean Field
Graphical Models, exponential families, and variational inference
Variational Bayes


  • Lecture 18: 11/19/07  (Xing)

Slides (annotated slides)


18.  Monte Carlo inference 1:

a.     Intro to sampling methods,

b.     Importance sampling,

c.     Weighted resampling,

d.     Particle filters.
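The importance-sampling identity E_p[f(X)] = E_q[f(X) p(X)/q(X)] can be demonstrated in a few lines. A toy sketch estimating E[X^2] = 1 under a standard normal, sampling from a wider proposal (all choices illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
xs = rng.normal(0.0, 2.0, n)                      # draws from proposal q = N(0, 4)
log_p = -0.5 * xs**2 - 0.5 * np.log(2 * np.pi)    # target p = N(0, 1)
log_q = -0.5 * (xs / 2.0)**2 - np.log(2.0) - 0.5 * np.log(2 * np.pi)
w = np.exp(log_p - log_q)                         # importance weights p/q
est = np.mean(w * xs**2)                          # estimates E_p[X^2] = 1
print(est)
```

A proposal broader than the target keeps the weights bounded; a too-narrow proposal makes the weight variance blow up, which motivates weighted resampling and particle filters.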




  • 11/21/07 Thanksgiving, no class

  • Lecture 19: 11/26/07  (Xing)

Slides (annotated slides)


19.  Monte Carlo inference 2:

a.     Collapsed sampling;

b.     Markov chain Monte Carlo,

c.     Metropolis Hasting algorithm,

d.     Gibbs sampling algorithm, convergence test;

e.     The data augmentation algorithm and EM.
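The Metropolis-Hastings accept/reject step is the core of the MCMC methods above. A toy random-walk sampler targeting a standard normal (step size, chain length, and burn-in are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    return -0.5 * x * x          # unnormalized log density of N(0, 1)

x, samples = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(0, 1)  # symmetric random-walk proposal
    # Accept with probability min(1, p(prop)/p(x)); symmetry cancels q terms
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    samples.append(x)

s = np.array(samples[5_000:])    # discard burn-in
print(s.mean(), s.var())         # sample mean near 0, variance near 1
```

Gibbs sampling is the special case where each proposal draws one variable from its exact conditional, so every move is accepted.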



  • Lecture 20: 11/26/07  (Xing)

Slides (annotated slides)

20.  Nonparametric Bayesian models

a.     Dirichlet Process

b.     Dirichlet process mixtures and infinite mixture models

c.     Infinite HMM


A Constructive Definition of Dirichlet Priors
MCMC methods for Dirichlet Process Mixture Models
Variational methods for the Dirichlet Process

  • Lecture 21: 11/28/07  (Xing)

Slides (annotated slides)


21.  Learning undirected graphical models

a.     CRFs and general Markov networks

b.     Iterative scaling learning

c.     Contrastive divergence algorithms




hw4 in.

  • Lecture 22: 11/28/07  (Xing)

Slides (annotated slides)


22.  Learning in structured input-output space:

a.     Maximum-margin Markov networks




  • Lecture 22:  (Xing) --- canceled

Slides (annotated slides)


23.  Discussion on discriminative learning objective: a unified view

a.     Likelihood-based and maximum margin learning,

b.     Convex optimization

c.     Learning algorithms




11/30/07: Poster presentation of final project

  • Lecture 23: 12/3/07  (Xing)

Slides (annotated slides)


24.  Applications






--Submit final report

--Take-home final out

Final Exam (Take home)

All material thus far

--Take-home final due (12/13/07)

Recitation Schedule







Date     Time     Room      Topic
Sep 21   5-6 pm   WeH 4623  Lectures 1-3
Sep 27   --       NSH 3001  Lectures 3-4
Oct 4    6-7 pm   NSH 3001  BP, Junction Tree
Oct 11   6-7 pm   WeH 4623  LMS and geometric interpretations
Oct 18   6-7 pm   NSH 3001  Exp Family, GLIMs
Oct 25   6-7 pm   NSH 3001  Structure Learning
Nov 1    6-7 pm   NSH 3001  CRFs and Factor Analysis
Nov 8    6-7 pm   NSH 3001  Complex Graphical Models