Statistical Data Mining Tutorials

Tutorial Slides by Andrew Moore

Advertisment: In 2006 I joined Google. We are growing a Google Pittsburgh office on CMU's campus. We are hiring creative computer scientists who love programming, and Machine Learning is one the focus areas of the office. We're also currently accepting resumes for Fall 2008 intenships. If you might be interested, feel welcome to send me email: awm@google.com .

The following links point to a set of tutorials on many aspects of statistical data mining, including the foundations of probability, the foundations of statistical data analysis, and most of the classic machine learning and data mining algorithms.

These include classification algorithms such as decision trees, neural nets, Bayesian classifiers, Support Vector Machines and cased-based (aka non-parametric) learning. They include regression algorithms such as multivariate polynomial regression, MARS, Locally Weighted Regression, GMDH and neural nets. And they include other data mining operations such as clustering (mixture models, k-means and hierarchical), Bayesian networks and Reinforcement Learning.

I hope they're useful (and please let me know if they are, or if you have suggestions or error-corrections). Click here for a short list of topics.