Information Processing and Learning

10-704, Fall 2016

Aarti Singh

Teaching Assistant: Shashank Singh
Class Assistant: Sandra Winkler


Monday and Wednesday, 3:00 - 4:20 pm, GHC 4102


Thursday 5:30-6:30 pm, GHC 4303

Office hours:

Aarti: Friday 10-11 am GHC 8207
Shashank: Tuesday 3:30-4:30 pm GHC 7605

Course Description:

What's the connection between how many bits we can send over a channel and how accurately we can classify documents or fit a curve to data? Is there any link between decision trees, prefix codes and wavelet transforms? What about max-entropy and maximum likelihood, or universal coding and online learning?

This interdisciplinary course will explore these and other questions that link the fields of information theory, signal processing, and machine learning, all of which aim to understand the information contained in data. The goal is to highlight the common concepts and establish concrete links between these fields that enable efficient information processing and learning.

We will begin with a brief introductory review of basic information theory, including entropy and the fundamental limits of data compression, the data processing and Fano inequalities, channel capacity, and rate-distortion theory. Then we will dig into the connections to learning, including: estimation of information theoretic quantities (such as entropy, mutual information, and divergence) and their applications in learning, information theoretic lower bounds for machine learning problems, duality of max entropy and maximum likelihood, connections between clustering and rate-distortion theory, universal coding and online learning, active learning and feedback channels, and more.

We expect that this course will cater both to students who have taken a basic information theory course and to those who have not.

Prerequisites: Fundamentals of Probability, Statistics, Linear Algebra, and Real Analysis

  • 3 Homeworks (30%)
  • Project (20%)
  • 6 Online Q&As (30%)
  • 2 In-class Quizzes (20%)
Homeworks: All homeworks, Q&As, quizzes and solutions will be posted here.

Course Project: Information about the course project is available here.

Schedule of Lectures: Lecture schedule, scribed notes and HWs are available here.

Tentative Schedule:
Week 1: Basics of information theory
Week 2: Some applications in machine learning
Week 3: Estimation of information theoretic quantities
Week 4: Maximum entropy distributions and exponential families, I-geometry
Week 5: Source coding and compression
Week 6: Model selection and connections to source coding
Week 7: Universal source coding and online learning
Week 8: Sufficient statistics, Information Bottleneck principle
Week 9: Channel coding and Redundancy-Capacity Theorem
Week 10: Fano's Inequality and minimax theory
Week 11: Minimax lower bounds for parametric and nonparametric problems
Week 12: Strong data processing inequalities and minimax lower bounds
Week 13: Classification and hypothesis testing
Week 14: Graphical models, LDPC codes, and active learning