------------------------------------------------------------------------------
Subject: HW3 and Proposal Due Dates
Date: Mon, 03 Apr 2000 11:52:03 -0400
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

Due to a number of requests for HW3 extensions, we have decided to reduce
the penalty for handing in HW3 late.  To get full credit on the homework it
must still be handed in at the beginning of class on Tuesday April 4th.
After that, if the homework is handed in by the beginning of class on
Thursday April 6th, there will only be a 5% penalty.  If it is handed in by
1:30pm Friday, there will be a 30% penalty.  It will not be accepted after
1:30pm Friday.

To avoid a conflict with the Mini Project proposal due date, you may now
hand in the Mini Project proposal as late as 4:30 pm on Friday April 7th.
But we still encourage you to hand it in in class on Thursday the 6th.  The
sooner your proposal is approved, the sooner you can get hooked up with a TA
project advisor and get started on your project.

If you are handing in an assignment or Mini Project proposal on Friday,
please leave it with Chuck Rosenberg in Wean Hall 7130.

------------------------------------------------------------------------------
Subject: Re: HW3 #2- Validation Set?
Date: Mon, 03 Apr 2000 12:55:31 -0400
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

> I was looking in the notes and book for information on validation sets,
> and couldn't find anything.  It doesn't seem like this neural network uses
> cross-validation... so something else is going on.
>
> As I gather from the assignment, there are three disjoint sets.  We train
> with the training set, and we test the results at each epoch with the test
> set.  But how is the validation set used, and how is it different from the
> test set?

The validation set is used for model selection.  That is, at every epoch
(iteration through the training set), the code evaluates the network's
performance on the validation set.  It saves the network from the epoch with
the lowest validation error to the ".net" file.

Because the validation set was used to select which network we are going to
use, we should expect the error measured on the validation set to be biased
on the low side.  To get an unbiased estimate of the error of the network on
data we have not yet seen, we report the performance on the test set.

------------------------------------------------------------------------------
Subject: Bayes Performance
Date: Fri, 24 Mar 2000 12:16:57 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

This is the performance you should expect for Joint Bayes:

  Training set accuracy:    100.00%
  Validation set accuracy:   50.00%
  Test set accuracy:         62.50%

This is the performance you should expect for Naive Bayes:

  Training set accuracy:     88.66%
  Validation set accuracy:   95.00%
  Test set accuracy:         87.50%

------------------------------------------------------------------------------
Subject: Re: nnets
Date: Fri, 24 Mar 2000 12:15:39 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

> are we supposed to assume a perceptron or sigmoid model for the neural
> networks portion (for 1b)

Sigmoid model.

------------------------------------------------------------------------------
Subject: HW3 Coding Tip
Date: Wed, 22 Mar 2000 10:47:45 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

Hi,

Just a point about booleans (stored as integers) in C.  If you want to
compare two boolean values to check if they are equal, the following
statement DOES NOT give the correct answer:

  if (x == y) ... ;

The following statement is one possible way to get the correct answer:

  if ((x && y) || (!x && !y)) ... ;
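
To see why the direct comparison can go wrong, here is a small,
self-contained illustration; the particular values of x and y below are made
up for the example:

  #include <stdio.h>

  int main(void)
  {
      /* Both x and y are "true" in C's sense (nonzero), but they are
       * stored as different integers, so == compares the raw values and
       * reports them as unequal. */
      int x = 1;
      int y = 2;

      if (x == y)
          printf("x == y           : equal\n");      /* not printed */
      else
          printf("x == y           : NOT equal\n");  /* printed - wrong for booleans */

      /* Comparing truth values instead of raw integers gives the intended
       * answer.  (The idiom !x == !y also works, since ! always yields
       * 0 or 1.) */
      if ((x && y) || (!x && !y))
          printf("truth comparison : equal\n");      /* printed, as desired */

      return 0;
  }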
Chuck

------------------------------------------------------------------------------
Subject: HW3 Code Update - 3/20
Date: Mon, 20 Mar 2000 21:53:24 -0500
From: chuck+@cs.cmu.edu (Chuck Rosenberg)
Newsgroups: cmu.cs.class.cs381

Hi,

Another update to the file "util-bayes.c"; the updated version can be found
at:

  /afs/andrew.cmu.edu/scs/cs/15-381/381/public/util-bayes.c

Also, the version in the tar file has been updated.  I would highly
recommend downloading the new version of util-bayes.c.

You don't have to worry about the details of the change.  But if you are
curious, it is a small fix which will only affect some people's code.  The
issue is that the old version of GetHistogramCounts modified the target
value of the example, which would be a problem if you called
GetExampleStringTargetValue after calling GetHistogramCounts.  The new
version fixes this problem so you don't have to worry about call order.

Chuck

------------------------------------------------------------------------------
Subject: Re: HW3 Due Date
Date: Mon, 20 Mar 2000 18:11:47 -0500
From: Bryan A Bailey
Newsgroups: cmu.cs.class.cs381

The web page is correct.  Since the lecture on neural nets was pushed back
to this week, we're giving you an extra day before the early handin.  Have
it to NSH 4517 by 4:30 on Friday for the extra credit.

Bryan

------------------------------------------------------------------------------
Subject: HW3 Hints
Date: Mon, 20 Mar 2000 09:21:03 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

Here are two helpful hints for the Bayes Net section of the homework.

1) It is OK to assume that the prior probability of a mushroom being edible
   or poisonous is equal.  That is:

     Pr(c=1) = Pr(c=0) = 0.5

2) Also, when predicting the edibility of a mushroom, remember that all you
   need to calculate is the relative probability that it is edible or
   poisonous, given its attribute values.  That is, whether it is more
   probable that it is poisonous or edible.  It is not necessary to know the
   actual posterior probability to make this determination.
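
In code, hint 2 means the two unnormalized scores can be compared directly.
Here is a minimal sketch, assuming the equal priors of hint 1; the function
name PredictEdibility, the probability arrays, and NUM_ATTRIBUTES are
made-up placeholders for this illustration, not names from the handout code:

  #define NUM_ATTRIBUTES 22   /* hypothetical attribute count */

  /* Sketch only: Naive Bayes decision by comparing the unnormalized
   * posteriors Pr(c) * prod_i Pr(a_i | c) for the two classes.  The
   * caller is assumed to have already looked up Pr(a_i | c) for the
   * observed value of each attribute a_i. */
  int PredictEdibility(const double pr_attr_given_edible[NUM_ATTRIBUTES],
                       const double pr_attr_given_poisonous[NUM_ATTRIBUTES])
  {
      double score_edible    = 0.5;   /* Pr(c=1), equal priors (hint 1) */
      double score_poisonous = 0.5;   /* Pr(c=0) */
      int i;

      for (i = 0; i < NUM_ATTRIBUTES; i++) {
          score_edible    *= pr_attr_given_edible[i];
          score_poisonous *= pr_attr_given_poisonous[i];
      }

      /* Only the comparison matters; dividing both scores by the evidence
       * Pr(a_1, ..., a_n) would not change which one is larger.  Ties fall
       * back to 0, "not edible". */
      return (score_edible > score_poisonous) ? 1 : 0;
  }

One practical note: multiplying many small probabilities can underflow, so
summing the logarithms of the factors is a common alternative if that
becomes an issue.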
Chuck

------------------------------------------------------------------------------
Subject: HW3 Clarifications
Date: Fri, 17 Mar 2000 15:56:14 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

=======
General
=======

About breaking ties... For example, if the numbers of yes's and no's in a
decision tree leaf are equal (i.e., num of yes = 5 and num of no = 5), or
are both equal to zero, we would like your prediction to be 0, in other
words "not edible".  In this data set it seems preferable to make false
negative errors instead of false positive errors.

Just another note about the code update.  (Which only applies to you if you
downloaded the tar file before the update - the tar file has now been
fixed.)  I was concerned that a lot of people said they had not yet
downloaded the code update.  Please download the updated file:

  /afs/andrew.cmu.edu/scs/cs/15-381/381/public/util-bayes.c

With the old version of the file your code in student-bayes.c will not work,
even if your implementation is correct.

======================
Neural Network Section
======================

It seems like some clarification of the output of the "nettrain" program is
necessary.  At the end of training the equivalence network, nettrain will
output something like this (because of the -F option):

  Full results for data set "SSV":
    [0.100 0.100] => 0.894  (t= 0.900)  (mse= 0.000)
    [0.100 0.900] => 0.110  (t= 0.100)  (mse= 0.000)
    [0.900 0.100] => 0.110  (t= 0.100)  (mse= 0.000)
    [0.900 0.900] => 0.884  (t= 0.900)  (mse= 0.000)

The values in the "[]"'s are the inputs to the network.  The value after the
"=>" is the actual output of the network on those inputs.  The value in the
parentheses "(t= 0.900)" is the target value, the value the network was
trying to achieve for that set of inputs.  You will notice that instead of
0's and 1's the values are 0.1 and 0.9.  For practical implementation
reasons, we usually replace the 0's with something close to 0, like 0.1, and
the 1's with something close to 1, like 0.9.

Towards the beginning of training the mushroom network, nettrain will output
something like this:

  |-----|---------|-------|---------|-------|---------|-------|---------|
  |Epoch| Wt Mag  |Trn Acc| Trn Mse |Val Acc| Val Mse |Tst Acc| Tst Mse |
  |-----|---------|-------|---------|-------|---------|-------|---------|
      0  0.051154   0.000  0.159280   0.000  0.159132   0.000  0.162530
     10  0.077127   0.000  0.128372   0.000  0.128430   0.000  0.119667
     20  0.134372  70.103  0.047903  65.000  0.051911  66.667  0.076003
     30  0.168410  83.505  0.028347  82.500  0.037533  75.000  0.068631

The columns have the following meanings:

  Epoch   - number of passes completed through the training set
  Wt Mag  - the average magnitude of the weights in the network
  Trn Acc - network accuracy on the training set
  Trn Mse - network mean squared error on the training set
  Val Acc - network accuracy on the validation set
  Val Mse - network mean squared error on the validation set
  Tst Acc - network accuracy on the test set
  Tst Mse - network mean squared error on the test set

=====================
Decision Tree Section
=====================

You may notice that when you select the "-c" option of the dt program, the
yes/no counts at the nodes change.  The reason is that we decided to use a
smoothing prior when calculating the probabilities at the leaves.  This
change should not affect your answers.

In problems 5b and 5c, the goal is for you to list a set of attributes and
their specific values for a mushroom.  The way to think about this is: if
you saw a mushroom with those attribute values, you would be either least
(or most) confident in the probability estimates of whether the mushroom is
edible or not.  So for 5b you should end up with two different sets of
attributes and their values, and for 5c you should also end up with two
different sets of attributes and their values.

=====================
Bayes Network Section
=====================

In the function GetHistogramCounts in the file util-bayes.c:

  void GetHistogramCounts (char *example_str, int example_index,
                           int *num_pos_ptr, int *num_neg_ptr)
  {
    ...
  }

The value returned via num_pos_ptr is the count of the examples entered into
the histogram whose target value is 1 and whose attribute values match the
attribute values in example_str.  The value returned via num_neg_ptr is the
count of the examples entered into the histogram whose target value is 0 and
whose attribute values match example_str.
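
As a rough illustration of how those two counts might be used, here is a
minimal sketch.  GetHistogramCounts and its signature come from util-bayes.c
as shown above; the wrapper function name and the decision rule are made up
for this example and assume equal class priors:

  /* Declared in util-bayes.c (see above). */
  void GetHistogramCounts (char *example_str, int example_index,
                           int *num_pos_ptr, int *num_neg_ptr);

  /* Sketch only: one plausible way to turn the histogram counts into a
   * class prediction.  With equal class priors, comparing the raw counts
   * of matching positive and negative examples is the same as comparing
   * the unnormalized posteriors. */
  int PredictFromHistogram(char *example_str, int example_index)
  {
      int num_pos = 0;   /* matching examples with target value 1 */
      int num_neg = 0;   /* matching examples with target value 0 */

      GetHistogramCounts(example_str, example_index, &num_pos, &num_neg);

      /* Ties (including the 0/0 case) fall back to 0, "not edible", as in
       * the tie-breaking note above. */
      return (num_pos > num_neg) ? 1 : 0;
  }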
The beginning of the comments about the function BuildJointHistogram in the
file student-bayes.c should read:

  The final histogram will be a hash table which contains a count of how
  many times a specific set of attribute values occurred IN CONJUNCTION
  WITH A SPECIFIC TARGET VALUE in the data set.

------------------------------------------------------------------------------
Subject: HW3 Code Update
Date: Thu, 16 Mar 2000 13:05:53 -0500
From: Chuck Rosenberg
Newsgroups: cmu.cs.class.cs381

Hi,

There was a small problem with one of the code files.  If you've already
downloaded the tar file, you can get a copy of just the updated file here:

  /afs/andrew.cmu.edu/scs/cs/15-381/381/public/util-bayes.c

If you haven't already downloaded the code, the tar file has been updated
with the new version of the file and you can ignore this message.

Chuck

------------------------------------------------------------------------------
Subject: HW3 Note
Date: 16 Mar 2000 10:35:17 -0500
From: chuck+@cs.cmu.edu (Chuck Rosenberg)
Newsgroups: cmu.cs.class.cs381

Hi,

Note that the three sections of homework 3 - neural nets, decision trees,
and Bayes nets - are independent of one another, so you can work on and
complete any of the three sections without needing anything from the other
two.  This is important because we have not yet covered neural networks in
class; Professor Cohn is moving the neural network lecture to next week.

Chuck

------------------------------------------------------------------------------