Machine Learning

Tom Mitchell and Ziv Bar-Joseph

Sep 4 Intro to ML
Decision Trees
Slides

• Machine learning examples
• Well-defined machine learning problem
• Decision tree learning
Mitchell: Ch. 3
Bishop: Ch. 14.4
The Discipline of Machine Learning
HW1 out
Sep 6 Decision Tree learning

Review of Probability

Slides
• The big picture
• Overfitting
• Random variables, probabilities
Andrew Moore's Basic Probability Tutorial
Bishop: Ch. 1 thru 1.2.3
Bishop: Ch. 2 thru 2.2
Sep 11
Probability and Estimation

Slides
Annotated slides
• Bayes rule
• MLE
• MAP
Andrew Moore's Basic Probability Tutorial
Bishop: Ch. 1 thru 1.2.3
Bishop: Ch. 2 thru 2.2

Tom's VIDEO: Probability and Estimation

Sep 13
• Conditional independence
• Naive Bayes
Mitchell: Naive Bayes and Logistic Regression

Tom's VIDEO: Naive Bayes

Sep 18
Gaussian Naive Bayes
Slides
Annotated slides

• Gaussian Bayes classifiers
• Document classification
• Brain image classification
• Form of decision surfaces
Mitchell: Naive Bayes and Logistic Regression

Tom's VIDEO: Gaussian Bayes

Sep 20
Logistic Regression

Slides
Annotated slides
• Naive Bayes - the big picture
• Logistic Regression: Maximizing conditional likelihood
• Gradient ascent as a general learning/optimization method
Mitchell: Naive Bayes and Logistic Regression

Optional: Ng & Jordan: On Discriminative and Generative Classifiers, NIPS, 2001.

Tom's VIDEO: Logistic regression

Sep 25
Linear Regression
Slides
Annotated slides
• Generative/Discriminative models
• Minimizing squared error and maximizing data likelihood
• Regularization
• Bias-variance decomposition
Bishop: Ch. 1 thru 1.2.5, Ch. 3 thru 3.2
Optional: Mitchell: Ch. 6.4

Tom's VIDEO: Linear regression

Sep 27
Neural Networks
Slides
• Non-linear regression
• Learning of representations
• Deep Belief Networks
Mitchell: Ch. 4, or Bishop: Ch. 5
Optional: Le et al., 2012

Tom's VIDEO: Neural networks

Oct 2
Graphical models 1
Slides
Annotated slides
• Bayes nets
• Representing joint distributions with conditional independence assumptions

Oct 4
Graphical models 2
Slides
Annotated slides
• Inference
• Learning from fully observed data
• Learning from partially observed data
Intro. to Graphical Models, K. Murphy

Tom's VIDEO: Graphical models 2
Oct 9
Graphical models 3

Annotated slides
• EM
• Semi-supervised learning
• Mixture of Gaussian clustering
• K-Means clustering
Bishop: Ch. 9 through 9.2

Optional: EM and HMM tutorial, J. Bilmes (sec. 1-3)

Tom's VIDEO: Graphical models 3
Tom's VIDEO: Graphical models 4
Oct 11
PAC Learning I
Slides
Annotated slides
• Computational Learning Theory
• Probably approximately correct learning
Mitchell: Ch. 7

Tom's VIDEO: Learning theory 1
Oct 16
PAC Learning II
Annotated slides
• VC Dimension
• Agnostic learning models
• Mistake bound models
Mitchell: Ch. 7

Tom's VIDEO: Learning theory 2
Oct 18 Midterm
Oct 23
Hierarchical Clustering
Slides
• Distance functions
• Hierarchical clustering
• Number of clusters
Bishop: Ch. 9 through 9.2
Optional: Tutorial on clustering
Hierarchical clustering app
Oct 25
Semi-Supervised Learning
Slides
• Semi-supervised learning
• Re-weighting labeled examples
• Co-training
• Detecting overfitting
HW 4 out
HW 4 data
Oct 30
Bayesian Networks
Slides
• Graphical models
• Constructing a BN
• Inference in BNs
• Why it's hard
• Variable elimination
• Stochastic inference
Bishop: Ch. 8.1 and 8.2.2
Optional: Tutorials: 1 2
Nov 1
Inference in Bayesian Networks
Slides
• Why it's hard
• Variable elimination
• Stochastic inference
• Introduction to HMMs
Bishop: Ch. 8.1 and 8.2.2
Optional: Tutorials: 1 2
Nov 6
Inference in Hidden Markov models
Slides
• Formal definition of HMMs
• Inference in HMMs
• With no observations
• With observations
• The Viterbi algorithm
Bishop: Ch. 13 through 13.2.1 (inclusive)
Tutorial Tutorial
Nov 8
Learning in HMMs
Slides
• Learning parameters when states can be observed
• Fully unsupervised
• Forward backward algorithm
• EM for HMM learning
• Introduction to Markov decision processes (MDPs)
Bishop: Ch. 13.2.1 through 13.2.2
Tutorial Tutorial
Nov 13
Markov Decision Processes (MDP)
Slides
• Formal definition of MDPs
• Inference in MDPs
• With no actions
• With actions
Tutorial
Demo
Nov 15
HMMs in Biology
Slides
Nov 20
Dimensionality reduction (PCA)
Notes
Bishop: Ch. 4.1.4 through 4.1.6
Tutorial for PCA, including MATLAB code
Nov 27
Support Vector Machine (SVM)
Slides
• Max margin
• Support vectors
• Linear separation
Nov 29
Support Vector Machine (SVM)
Slides
• Lagrange multipliers
• Dual formulation of SVM
• Transformation of the input vector
• The kernel trick
Software