Homework 1. Decision Trees. Out Jan 12, Due Jan 21

The written assignment is here (updated noon Jan 13 with final data description)

The data for this assignment describes the voting records of members of the US House of Representatives.  It contains three independent sets of examples:

(The data was updated on Sat Jan 17, 7pm.)
DATA FORMAT:
Data sets here are in comma-separated-value (CSV) format (for example, see the PlayTennis data below). The first line of the file gives the names of the attributes, separated by commas. Each remaining line describes one example, with its attribute values separated by commas in the same order as the names on the first line. Note that you can load these files into Microsoft Excel to reformat them if you wish.
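
The format described above can be read with a few lines of code. The following is a minimal sketch in Python, assuming the file is plain CSV with a single header line; the function name load_examples and the two-row sample text are illustrative only and are not part of the assignment.

```python
import csv
import io


def load_examples(source):
    """Read CSV text from a file-like object.

    Returns (attribute_names, examples), where each example is a dict
    mapping attribute name -> value. Assumes the first line is the header.
    """
    reader = csv.reader(source)
    attributes = [name.strip() for name in next(reader)]
    examples = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        values = [value.strip() for value in row]
        examples.append(dict(zip(attributes, values)))
    return attributes, examples


# Illustrative sample: the header and two rows of the PlayTennis data.
sample = io.StringIO(
    "playTennis,outlook,temperature,humidity,wind\n"
    "no,sunny,hot,high,weak\n"
    "yes,overcast,hot,high,weak\n"
)
attributes, examples = load_examples(sample)
print(attributes)
print(examples[0])
```

To read a file on disk instead, pass an open file object, for example open(path, newline=""), where path is whatever name you saved the data under.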


EXTRA OPTIONAL DATA SET:

Here is an additional data set you might like to use to debug your program, since you know from the class slides what the correct tree should look like.  It is the PlayTennis data in the slides from Jan 12, 2009:

playTennis,outlook,temperature,humidity,wind
no,sunny,hot,high,weak
no,sunny,hot,high,strong
yes,overcast,hot,high,weak
yes,rain,mild,high,weak
yes,rain,cool,normal,weak
no,rain,cool,normal,strong
yes,overcast,cool,normal,strong
no,sunny,mild,high,weak
yes,sunny,cool,normal,weak
yes,rain,mild,normal,weak
yes,sunny,mild,normal,strong
yes,overcast,mild,high,strong
yes,overcast,hot,normal,weak
no,rain,mild,high,strong
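
One way to use this data for debugging is to check which attribute your program picks at the root. The sketch below assumes your learner chooses splits by information gain (entropy reduction); that criterion is an assumption here, so substitute your own if the assignment specifies a different one. The names entropy, information_gain, and PLAY_TENNIS are illustrative.

```python
import csv
import io
import math
from collections import Counter

# The PlayTennis data from this page, with the target attribute in the first column.
PLAY_TENNIS = """\
playTennis,outlook,temperature,humidity,wind
no,sunny,hot,high,weak
no,sunny,hot,high,strong
yes,overcast,hot,high,weak
yes,rain,mild,high,weak
yes,rain,cool,normal,weak
no,rain,cool,normal,strong
yes,overcast,cool,normal,strong
no,sunny,mild,high,weak
yes,sunny,cool,normal,weak
yes,rain,mild,normal,weak
yes,sunny,mild,normal,strong
yes,overcast,mild,high,strong
yes,overcast,hot,normal,weak
no,rain,mild,high,strong
"""


def entropy(labels):
    """Entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, attribute, target):
    """Entropy of the target minus its expected entropy after splitting on attribute."""
    base = entropy([e[target] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


examples = list(csv.DictReader(io.StringIO(PLAY_TENNIS)))
gains = {a: information_gain(examples, a, "playTennis")
         for a in examples[0] if a != "playTennis"}
best = max(gains, key=gains.get)
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name}: {gain:.3f}")
```

Under this criterion, outlook has the largest gain (about 0.247), followed by humidity, wind, and temperature, so outlook should be the root of the tree your program builds from this data.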



© 2009 Tom Mitchell @ School of Computer Science, Carnegie Mellon University