Cross-Validation

Tutorial Slides by Andrew Moore

Cross-validation is one of several approaches to estimating how well the model you've just learned from some training data is going to perform on future as-yet-unseen data. We'll review testset validation, leave-one-one cross validation (LOOCV) and k-fold cross-validation, and we'll discuss a wide variety of places that these techniques can be used. We'll also discuss overfitting...the terrible phenomenon that CV is supposed to present. And at the end, our hairs will stand on end as we realize that even when using CV, you can still overfit arbitrarily badly.

Download Tutorial Slides (PDF format)

Powerpoint Format: The Powerpoint originals of these slides are freely available to anyone who wishes to use them for their own work, or who wishes to teach using them in an academic institution. Please email Andrew Moore at awm@cs.cmu.edu if you would like him to send them to you. The only restriction is that they are not freely available for use as teaching materials in classes or tutorials outside degree-granting academic institutions.

Advertisment: I have recently joined Google, and am starting up the new Google Pittsburgh office on CMU's campus. We are hiring creative computer scientists who love programming, and Machine Learning is one the focus areas of the office. If you might be interested, feel welcome to send me email: awm@google.com .