This page lists the readings for each lecture. The instructors will add comments and pointers to other resources that may help you get the most out of the readings.

- (Bishop - 2.1) This section gives many details on the Bayesian and maximum likelihood results for the binomial example Carlos covered today.

Recitation 1 -- Probability Review

- (Bishop - 1.2) A good review of the probability concepts needed for this course
- We have not checked all of these articles for correctness, but we do recommend brushing up with the Wikipedia articles for these topics:

- (Bishop - 1.1 to 1.4) Introduces curve fitting, reviews probability theory, introduces Gaussians, and covers the famous "curse of dimensionality"
- (Bishop - 3.1, 3.1.1, 3.1.4, 3.1.5, 3.2, 3.3, 3.3.1, 3.3.2) Regression, linear basis function models, bias-variance decomposition, and Bayesian linear regression

- (Bishop - 3.2) Bias-variance decomposition
- (Bishop - 1.5.5) Covers loss functions for regression and discusses minimizing expected loss
- (Bishop - 1.3) Discusses model selection using a test set
- Mitchell Chapter (Sections 1 and 2): Naive Bayes and Logistic Regression

- Mitchell Chapter (All sections): Naive Bayes and Logistic Regression
- Optional Reading: Ng and Jordan's NIPS 2001 paper on Discriminative versus Generative Learning

- (Bishop - 14.4) Tree-based Models
- Recommended Reading: Nils Nilsson's Chapter (All Sections): Decision Trees
- Optional Review of Boolean Logic/DNF: Nils Nilsson's Chapter Boolean Functions (first 4 pages)

- (Bishop - 14.3) Boosting
- Schapire's Boosting Tutorial
- (Bishop - 1.3) Model Selection (Cross Validation)

- (Bishop 1.3) Model Selection / Cross Validation
- (Bishop 3.1.4) Regularized least squares
- (Bishop 5.1) Feed-forward Network Functions

- (Bishop 5.1) Feed-forward Network Functions
- (Bishop 5.2) Network Training
- (Bishop 5.3) Error Backpropagation

- (Bishop 2.5) Nonparametric Methods

- (Bishop 6.1, 6.2) Kernels
- (Bishop 7.1) Maximum Margin Classifiers
- Hearst 1998: High Level Presentation
- Burges 1998: Detailed Tutorial
- Optional Reading: Platt 1998: Training SVMs with Sequential Minimal Optimization

- (Mitchell Chapter 7) Computational Learning Theory

- (Bishop 8.1, 8.2) Bayesian Networks, Conditional Independence

- (Bishop 8.4.1, 8.4.2) Inference in Chain/Tree structures
- Rabiner's HMM tutorial

- Additional Reading: Heckerman BN Learning Tutorial
- Additional Reading: Tree-Augmented Naive Bayes paper

- (Bishop 9.1, 9.2) K-means, Mixtures of Gaussians

- (Bishop 9.3, 9.4) EM

- Blum and Mitchell co-training paper
- Optional reading: Joachims Transductive SVMs paper