10-601, Spring 2009 Tom Mitchell Machine Learning Department, School of Computer Science, Carnegie Mellon University
Homework 1. Decision Trees. Out Jan 12, Due Jan 21
The written assignment is here (updated noon Jan 13 with final data description).
The data for this assignment describes the voting records of members of the US House of Representatives. It contains three independent sets of examples:
(Data was updated on Sat Jan 17, 7pm.)
The data sets here are in comma-separated-value (CSV) format (for an example, see the PlayTennis data below).
The first line of each file gives the names of the attributes, separated by commas.
Each remaining line describes one example, with the values of
each attribute separated by commas. Note that you can load these
files into Microsoft Excel to reformat them if you wish.
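If you would rather read the files in a program than in Excel, the format described above can be parsed in a few lines. The following is a minimal sketch in Python, not part of the assignment handout; the function name load_examples and the two-row sample text are assumptions made for illustration, and the sample rows are copied from the PlayTennis data on this page.

```python
import csv
import io


def load_examples(source):
    """Read CSV text from a file-like object.

    Returns (attribute_names, examples), where each example is a dict
    mapping attribute name to that example's value.
    """
    reader = csv.reader(source)
    attributes = next(reader)  # first line: attribute names
    # Each remaining non-empty line is one example.
    examples = [dict(zip(attributes, row)) for row in reader if row]
    return attributes, examples


# Hypothetical in-memory sample; for a real file, pass open(path, newline="").
sample = io.StringIO(
    "playTennis,outlook,temperature,humidity,wind\n"
    "no,sunny,hot,high,weak\n"
    "yes,overcast,hot,high,weak\n"
)
attributes, examples = load_examples(sample)
print(attributes)
print(examples[0])
```

Storing each example as a dictionary keyed by attribute name keeps the code independent of column order, which may differ between data sets.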
EXTRA OPTIONAL DATA SET:
Here is an additional data set you might like to use to debug your program, since you know from the class slides what the correct tree should look like. It is the PlayTennis data in the slides from Jan 12, 2009:
playTennis,outlook,temperature,humidity,wind
no,sunny,hot,high,weak
no,sunny,hot,high,strong
yes,overcast,hot,high,weak
yes,rain,mild,high,weak
yes,rain,cool,normal,weak
no,rain,cool,normal,strong
yes,overcast,cool,normal,strong
no,sunny,mild,high,weak
yes,sunny,cool,normal,weak
yes,rain,mild,normal,weak
yes,sunny,mild,normal,strong
yes,overcast,mild,high,strong
yes,overcast,hot,normal,weak
no,rain,mild,high,strong
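One way to use this data for debugging is to check the first split your program chooses. The sketch below assumes the program selects attributes by information gain (as in ID3); the assignment text on this page does not state which criterion to use, so treat that choice, and the helper names entropy and information_gain, as assumptions.

```python
import csv
import io
import math
from collections import Counter

# The PlayTennis data exactly as listed above.
PLAY_TENNIS = """playTennis,outlook,temperature,humidity,wind
no,sunny,hot,high,weak
no,sunny,hot,high,strong
yes,overcast,hot,high,weak
yes,rain,mild,high,weak
yes,rain,cool,normal,weak
no,rain,cool,normal,strong
yes,overcast,cool,normal,strong
no,sunny,mild,high,weak
yes,sunny,cool,normal,weak
yes,rain,mild,normal,weak
yes,sunny,mild,normal,strong
yes,overcast,mild,high,strong
yes,overcast,hot,normal,weak
no,rain,mild,high,strong
"""


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, attribute, target):
    """Reduction in entropy of `target` from splitting on `attribute`."""
    base = entropy([e[target] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


examples = list(csv.DictReader(io.StringIO(PLAY_TENNIS.strip())))
gains = {a: information_gain(examples, a, "playTennis")
         for a in examples[0] if a != "playTennis"}
best = max(gains, key=gains.get)
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {gain:.3f}")
print("first split:", best)
```

On this data the computation selects outlook as the first split, with a gain of roughly 0.247 bits, which you can compare against the root of the tree in the class slides.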
© 2009 Tom Mitchell @ School of
Computer Science, Carnegie Mellon University