10-601B Machine Learning, Fall 2015

Course Information

  • Instructor: Seyoung Kim (Computational Biology Department & School of Computer Science, Carnegie Mellon University)
  • Location: DH 2210
  • Time: Tuesday & Thursday, 9:00-10:20am
  • Office hours: Thursday 10:30-11:30am, GHC 7721
  • Recommended Textbooks:
    • Machine Learning, Tom Mitchell. (optional)
    • Pattern Recognition and Machine Learning, Christopher Bishop. (optional)
    • The Elements of Statistical Learning: Data Mining, Inference and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman. [online book] (optional)
  • Recitation: Thursday 7:30-8:30pm, Porter Hall 100
  • Teaching assistants (email):
    • Aman Gupta (amang at andrew)
    • Namhoon Lee (namhoonl at andrew)
    • Joseph Runde (jrunde at andrew)
    • Pengcheng Zhou (zhoupc1988 at gmail)
    • Pengcheng Xu (pengchex at andrew)
    • Zhenzhen Weng (zweng at andrew)
    • Calvin McCarter (cmccarte at andrew)
    • Udbhav Prasad (udbhavp at andrew)
  • TA Office hours:
    • 5-6pm Monday GHC 4102
    • 11am-12pm Tuesday GHC 7605
    • 5-6pm Wednesday GHC 4301
    • 6:30-7:30pm Thursday Porter 100
  • Grading: Homework (60%), project (20%), late mid-term (20%)
  • Late homework policy: All homework and projects are due by 10:20am on the due date via Autolab. Each student will be given two late days that can be spent on any homework but not on projects. Without late days, a homework submitted one day late will receive 50% of the grade, and anything submitted later than that will receive 0%.
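
The grading weights and late-homework policy above can be read as simple arithmetic. The following is only an illustrative sketch, not an official grade calculator. The function names, the 0-100 score scale, and the assumption that each late day spent cancels exactly one day of lateness are all hypothetical; the instructor's actual accounting may differ.

```python
# Illustrative sketch only; names and the late-day accounting are assumptions.

MAX_LATE_DAYS = 2  # each student is given two late days (homework only)


def homework_credit(days_late: int, late_days_used: int = 0) -> float:
    """Return the fraction of a homework score that is kept.

    Assumes each late day spent offsets one day of lateness.
    """
    if days_late < 0 or not 0 <= late_days_used <= MAX_LATE_DAYS:
        raise ValueError("invalid number of days")
    effective_days_late = max(0, days_late - late_days_used)
    if effective_days_late == 0:
        return 1.0   # on time, or fully covered by late days
    if effective_days_late == 1:
        return 0.5   # one day late without late days: 50%
    return 0.0       # more than one day late: 0%


def course_grade(homework: float, project: float, midterm: float) -> float:
    """Weighted total: homework 60%, project 20%, mid-term 20%."""
    return 0.60 * homework + 0.20 * project + 0.20 * midterm
```

For instance, under these assumptions `homework_credit(3, late_days_used=2)` treats the submission as one day late, and `course_grade(90, 80, 70)` combines the three components with the stated weights.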

Course Description

Machine Learning (ML) develops computer programs that automatically improve their performance through experience. This includes learning many types of tasks based on many types of experience, e.g. spotting high-risk medical patients, recognizing speech, classifying text documents, detecting credit card fraud, or driving autonomous vehicles. 10601 covers all or most of: concept learning, decision trees, neural networks, linear learning, active learning, estimation & the bias-variance tradeoff, hypothesis testing, Bayesian learning, the MDL principle, the Gibbs classifier, Naive Bayes, Bayes Nets & Graphical Models, the EM algorithm, Hidden Markov Models, K-Nearest-Neighbors and nonparametric learning, reinforcement learning, bagging, boosting and discriminative training. Grading will be based on weekly or biweekly assignments (written and/or programming), a midterm, a final exam, and possibly a project (details may vary depending on the section). 10601 is recommended for CS Seniors & Juniors, quantitative Masters students, & non-MLD PhD students. Prerequisites (strictly enforced): strong quantitative aptitude, college prob&stats course, and programming proficiency. For learning to apply ML practically & effectively, without the above prerequisites, consider 11344/05834 instead. You can evaluate your ability to take the course via a self-assessment exam at: http://www.cs.cmu.edu/~aarti/Class/10701_Spring14/Intro_ML_Self_Evaluation.pdf

Policy on Collaboration

These policies are the same as those used in Dr. Rosenfeld's 2013 version of this course, and were also used in Drs. Cohen & Xing's class.

The purpose of student collaboration is to facilitate learning, not to circumvent it. Studying the material in groups is strongly encouraged. It is also allowed to seek help from other students in understanding the material needed to solve a particular homework problem, provided no written notes are shared, or are taken at that time, and provided learning is facilitated, not circumvented. The actual solution must be done by each student alone, and the student should be ready to reproduce their solution upon request.

The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved, on the first page of their assignment. Specifically, each assignment solution must start by answering the following questions:
(1) Did you receive any help whatsoever from anyone in solving this assignment? Yes / No. If you answered 'yes', give full details: _______________ (e.g. "Jane explained to me what is asked in Question 3.4")

(2) Did you give any help whatsoever to anyone in solving this assignment? Yes / No. If you answered 'yes', give full details: _______________ (e.g. "I pointed Joe to section 2.3 to help him with Question 2")

Collaboration without full disclosure will be handled severely, in compliance with CMU's Policy on Cheating and Plagiarism.

As a related point, some of the homework assignments used in this class may have been used in prior versions of this class, or in classes at other institutions. Avoiding the use of heavily tested assignments will detract from the main purpose of these assignments, which is to reinforce the material and stimulate thinking. Because some of these assignments may have been used before, solutions to them may be (or may have been) available online, or from other people. It is explicitly forbidden to use any such sources, or to consult people who have solved these problems before. You must solve the homework assignments completely on your own. I will mostly rely on your wisdom and honor to follow this rule, but if a violation is detected it will be dealt with harshly. Collaboration with other students who are currently taking the class is allowed, but only under the conditions stated above.