Information Processing and Learning

10-704, Spring 2012

Aarti Singh

Teaching Assistant: Min Xu
Class Assistant: Michelle Martin


Tuesday and Thursday, 10:30 - 11:50 am, 4303 GHC (Notes)


Wednesday 3-4pm, 8102 GHC

Office hours:

Instructor: Mondays 3-4 pm, 8207 GHC
TA: Friday 2-3 pm, 8013 GHC Atrium

Course Description:

What's the connection between how many bits we can send over a channel and how accurately we can classify documents or fit a curve to data? Is there any connection between decision trees, prefix codes and wavelet transforms? What about error-correcting codes, graphical models and compressed sensing?

This course explores questions like these, which link the fields of signal processing and machine learning, both of which aim to extract information from signals or data. The goal of this interdisciplinary course is to highlight the concepts common to these fields that together enable efficient information processing and learning.

Topics will include the basics of information theory; entropy and the fundamental limits of data compression; channel capacity and least informative priors; rate-distortion theory; Kolmogorov complexity and online learning; hypothesis testing, including information-theoretic limits and lower bounds in machine learning; sequential testing; and function approximation using Fourier and wavelet transforms. Advanced topics, covered as time permits, include connections between error-correcting codes, inference in graphical models and compressed sensing.

Prerequisites: Fundamentals of Probability, Statistics, Linear Algebra and Real Analysis

  • Homeworks (40%)
  • Project (35%)
  • Two Short Quizzes (15%)
  • Scribing (10%)
Tentative Syllabus Outline: This outline is subject to significant revision during the lectures. For actual lecture topics and notes, please see here.

Jan 16 - May 4 (15 weeks + 1 week spring break)
Information Theoretic Foundations
week 1 - Introduction
Basics of info theory - entropy, relative entropy and mutual information
week 2 - Data processing inequality & Sufficient statistics
Fano's Inequality
week 3 - Max entropy distributions & Exponential families
Asymptotic equipartition property
week 4 - Source coding/fundamental limits of data compression
Prefix codes & Kraft Inequality
week 5 - Lossy source coding & rate distortion theory
week 6 - Channel capacity & least informative priors
Joint source channel coding
week 7 - Universal source coding & Online learning
Kolmogorov complexity, Occam's razor & minimum description length
Decision Theory
week 8 - hypothesis testing: Likelihood Ratio Tests, GLRT, Neyman-Pearson framework
week 9 - information-theoretic limits of hypothesis testing
& lower bounds in machine learning problems
week 10 - Multiple hypothesis testing
FWER (Family Wise Error Rate), FDR (False Discovery Rate)
week 11 - sequential/active testing
Estimation theory
week 12 - function spaces & approximation theory
linear and nonlinear estimators
week 13 - wavelets & decision trees
Advanced topics
week 14 - connections between error-correcting codes, message passing,
inference in graphical models & compressed sensing
week 15 - project presentations