Welcome to the programming component of this assignment!
This assignment includes an autograder for you to grade your answers on your machine. This can be run with the command:
python3.6 autograder.py
The code for this assignment consists of several Python files, some of which you will need to read and understand in order to complete the assignment, and some of which you can ignore. You can download and unzip all the code, data, and supporting files from hw2_programming.zip.
regression.py | Your code to implement regression tasks.
classification.py | Your code to implement handwritten digit classification tasks.
additional_code.py | Any additional code that you need to write to answer various questions goes here. This code should be runnable by calling python3.6 additional_code.py, but there are no requirements on the format and it will not be executed by the autograder.
util.py | Convenience methods to generate various plots that will be needed in this assignment.
test_cases/Q*/*.py | These are the unit tests that the autograder runs. Ideally, you would be writing these unit tests yourself, but we are saving you a bit of time and allowing the autograder to check these things. You should definitely be looking at these to see what is and is not being tested. The autograder on Gradescope may run a different version of these unit tests.
autograder.py | Autograder infrastructure code.
Files to Edit and Submit: You will fill in portions of regression.py, classification.py, and additional_code.py during the assignment. You should submit these files containing your code and comments to the Programming component on Gradescope. Please do not change the other files in this distribution or submit any of our original files other than these files. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder.
Report: Many of the sections in this programming assignment will contain questions that are not autograded. You will place the requested results in the appropriate locations within the PDF of the Written component of this assignment.
Evaluation: Your assignment will be assessed based on your code, the output of the autograder, and the required contents of the Written component.
Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don't try. We trust you all to submit your own work only; please don't let us down. If you do, we will pursue the strongest consequences available to us.
Getting Help: You are not alone! If you find yourself stuck on something, contact the course staff for help. Office hours, recitation, and Piazza are there for your support; please use them. If you can't make our office hours, let us know and we will schedule more. We want these assignments to be rewarding and instructional, not frustrating and demoralizing. But, we don't know when or how to help unless you ask.
Install PyTorch and look through some of the tutorials. Specifically, take a look at the What is PyTorch, Neural Networks, and Training a Classifier sections within the 60 Minute Blitz tutorial.
You should be able to install PyTorch on the unix.andrew.cmu.edu machines by adding the --user option to pip3 install:
pip3 install --user torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
You will not need the CUDA (GPU) option. The autograder on Gradescope will not be using it.
You can verify that your installation is successful by running the following Python3.6 code:
import torch
x = torch.rand(5, 3)
print(x)
Make sure you ask for help if you are having issues installing PyTorch, as you won't be able to complete this assignment without it.
In regression.py, implement the load_and_split_data function to load our regression data and split it into training and validation sets. See function docstring for details.
You are required to use the Python NumPy library to implement code in this and many other questions in this course. If you are not familiar with NumPy, please take some time to walk through an online tutorial, such as https://docs.scipy.org/doc/numpy/user/quickstart.html.
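If you are new to NumPy indexing, the sketch below shows one way to shuffle a toy dataset and split it into training and validation parts. The helper name split_train_val, the 80/20 ratio, the shuffling, and the fixed seed are all assumptions made for this example; the actual requirements are the ones in the load_and_split_data docstring.

```python
import numpy as np

def split_train_val(x, y, train_fraction=0.8, seed=0):
    """Hypothetical helper: shuffle (x, y) together and split them."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(x))          # shuffled row indices
    n_train = int(train_fraction * len(x))   # number of training rows
    train_idx, val_idx = order[:n_train], order[n_train:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]

# Toy data standing in for the real regression data.
x = np.linspace(0.0, 1.0, 10)
y = 2.0 * x + 1.0
x_train, y_train, x_val, y_val = split_train_val(x, y)
print(x_train.shape, x_val.shape)
```

Indexing both arrays with the same index array keeps each input paired with its output after the shuffle.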
You may run the following command to run a quick unit test on your Q1 implementation:
python3.6 autograder.py -q Q1
We encourage you to write your own code to test out your implementation as you work through the assignment. For example, you may want to use some of the functions in util.py to plot the data that you just loaded.
Question for the write-up: Why didn't we give you a test set?
In regression.py, implement the setup_design_matrix and linear_closed_form_fit functions to calculate the closed form solution to this linear least squares problem.
We place the input training vector, \(\boldsymbol{x} \in \mathbb{R}^{N\times 1}\), into a design matrix, \(\boldsymbol{X} \in \mathbb{R}^{N\times 2}\), so that we can account for a bias term in our linear model. We do this by inserting a column of ones in the first column of \(\boldsymbol{X}\). This also means that the scalar weight \(w\) is replaced by a weight vector, \(\boldsymbol{w} = [b, w]^T\).
Using the design matrix and the output training data, \(\boldsymbol{y}\), we can then directly solve for the optimal weight vector, \(\boldsymbol{w}^*\): $$\boldsymbol{w}^* = (\boldsymbol{X}^T\boldsymbol{X})^{-1}\boldsymbol{X}^T\boldsymbol{y}$$
We require that you use numpy operations, rather than for loops, to implement this closed form solution. You may use numpy.linalg.inv, but you may NOT use numpy solvers, such as numpy.linalg.solve or numpy.linalg.lstsq.
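To make the formula concrete, here is a minimal sketch on noise-free toy data generated from \(y = 3 + 2x\). The data and variable names are made up for illustration, and this is not the layout required by setup_design_matrix or linear_closed_form_fit; check their docstrings for the expected shapes.

```python
import numpy as np

# Toy 1-D data generated from y = 3 + 2x (no noise), for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 + 2.0 * x

# Design matrix: a column of ones (bias) followed by the inputs, shape (N, 2).
X = np.column_stack([np.ones_like(x), x])

# Normal equations: w* = (X^T X)^{-1} X^T y, using numpy.linalg.inv only.
w_star = np.linalg.inv(X.T @ X) @ X.T @ y
print(w_star)  # approximately [3. 2.]
```

Because the column of ones comes first, the first entry of the result is the bias \(b\) and the second is the slope \(w\).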
You may run the following command to run a quick unit test on your implementation:
python3.6 autograder.py -q Q2
The autograder will also plot data and the hypothesis function line and save the plot as regression_closed_form.png in a new directory named figures. You are required to include this figure as part of the written component of this assignment.
Question for the write-up: What is the mean squared error on the training set? (Don't divide by 2.)
Question for the write-up: What is the mean squared error on the validation set? (Don't divide by 2.) You will need to write additional code to answer this.
Additional code: Any additional code that you write to answer these questions should be included in additional_code.py
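For reference, "don't divide by 2" means the plain average of squared residuals, \(\frac{1}{N}\sum_{i=1}^{N} (\hat{y}_i - y_i)^2\). A small sketch, with a hypothetical helper name and made-up numbers:

```python
import numpy as np

def mean_squared_error(y_pred, y_true):
    """Average squared residual, with no extra factor of 1/2."""
    residual = np.asarray(y_pred) - np.asarray(y_true)
    return float(np.mean(residual ** 2))

# Toy check: residuals are 1, 0, -2, so the MSE is (1 + 0 + 4) / 3.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([2.0, 2.0, 1.0])
print(mean_squared_error(y_pred, y_true))
```

The same helper works for the validation set once you compute predictions for the validation inputs with the weights fitted on the training set.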
So far, we have been using NumPy; this is where we transition to PyTorch. PyTorch is designed for building neural networks, but in this question we are going to leverage PyTorch to do stochastic gradient descent on our 1-D linear regression model.
See the PyTorch section above to make sure you have PyTorch installed, and take some time to work through some PyTorch tutorials.
In regression.py, implement the LinearRegressionNet.__init__ and LinearRegressionNet.forward methods. Follow https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#define-a-convolutional-neural-network as a template, but instead use only one torch.nn.Linear layer with one input and one output. No need for convolution, ReLU, or pool layers. (The x.view is not necessary either.)
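As a rough picture of what a one-layer module looks like, here is a sketch with a hypothetical class name, TinyLinearNet. It is not the provided LinearRegressionNet skeleton, whose constructor arguments and attribute names may differ.

```python
import torch

class TinyLinearNet(torch.nn.Module):
    """Hypothetical one-layer model: y_hat = w * x + b."""

    def __init__(self):
        super().__init__()
        # One input feature, one output value; the layer holds w and b.
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        # x has shape (batch_size, 1); so does the result.
        return self.linear(x)

net = TinyLinearNet()
out = net(torch.rand(4, 1))
print(out.shape)  # torch.Size([4, 1])
```

Note that torch.nn.Linear already includes a bias term by default, so no column of ones is needed here.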
The code to set up the loss function, the SGD optimization algorithm, and actually run the training is provided for you in train_linear_regression_net. You should become familiar with this code as you will need to implement a similar version later in this assignment.
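If you have not seen a PyTorch training loop before, the sketch below shows the usual zero_grad / backward / step pattern on toy data. It is not the provided train_linear_regression_net code; the learning rate, the number of epochs, and the one-sample-per-step batching are assumptions chosen for this example.

```python
import torch

torch.manual_seed(0)

# Toy data from y = 2x + 1, shaped (N, 1) as torch.nn.Linear expects.
x = torch.linspace(0.0, 1.0, 20).unsqueeze(1)
y = 2.0 * x + 1.0

model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # lr is an assumption

initial_loss = criterion(model(x), y).item()
for epoch in range(100):
    for i in range(len(x)):              # one sample per step: "stochastic"
        optimizer.zero_grad()            # clear gradients from the last step
        loss = criterion(model(x[i:i + 1]), y[i:i + 1])
        loss.backward()                  # compute d(loss)/d(parameters)
        optimizer.step()                 # take one gradient step
final_loss = criterion(model(x), y).item()
print(initial_loss, final_loss)
```

Forgetting optimizer.zero_grad() is a common mistake: gradients accumulate across calls to backward() unless they are cleared.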
You may run the following command to run a unit test on your implementation:
python3.6 autograder.py -q Q3
The autograder will also plot data and the hypothesis function line and save the plot as regression_sgd.png in the figures directory. You are required to include this figure as part of the written component of this assignment.
Neural networks for 1-D regression!
In regression.py, implement the RegressionNeuralNet.__init__ and RegressionNeuralNet.forward methods. Follow https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#define-a-convolutional-neural-network as a template, but instead of convolution and pool layers, use as many torch.nn.Linear and torch.nn.functional.relu layers as you like. You can also set the number of outputs for each linear layer however you like, with the exception of the last one, which should have just one output. Make sure that the number of inputs argument to torch.nn.Linear is the same as the number of outputs in the previous layer.
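The layer-size bookkeeping can be sketched as follows. The class name TinyRegressionNet, the two hidden layers, and the width of 16 are arbitrary choices for illustration, not a recommended design for this question.

```python
import torch
import torch.nn.functional as F

class TinyRegressionNet(torch.nn.Module):
    """Hypothetical 1 -> 16 -> 16 -> 1 network; the widths are arbitrary."""

    def __init__(self, hidden=16):
        super().__init__()
        self.fc1 = torch.nn.Linear(1, hidden)       # 1 input -> hidden outputs
        self.fc2 = torch.nn.Linear(hidden, hidden)  # inputs match fc1's outputs
        self.fc3 = torch.nn.Linear(hidden, 1)       # last layer: one output

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)  # no ReLU on the output of a regression model

net = TinyRegressionNet()
print(net(torch.rand(5, 1)).shape)  # torch.Size([5, 1])
```

If two consecutive layers disagree on their sizes, PyTorch raises a shape mismatch error the first time forward is called, so a quick call on random input is a cheap sanity check.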
The goal in this question is to design a neural network that has even better mean squared error on the training set than the closed form solution (Q2) or the linear SGD solution (Q3).
Similar to Q3, the code to set up the loss function, the SGD optimization algorithm, and actually run the training is provided for you in train_regression_neural_net.
You may run the following command to run a unit test on your implementation:
python3.6 autograder.py -q Q4
The autograder will also plot data and the hypothesis function line and save the plot as regression_net.png in the figures directory. You are required to include this figure as part of the written component of this assignment.
Question for the write-up: What is the mean squared error on the training set? (Don't divide by 2.)
Question for the write-up: What is the mean squared error on the validation set? (Don't divide by 2.) You will need to write additional code to answer this.
For the write-up, it is ok if these numbers come from a different training run than the autograder.
Additional code: Any additional code that you write to answer these questions should be included in additional_code.py
In classification.py, implement the load_and_split_data function to load handwritten digit data from the MNIST dataset and split it into training and validation sets. See function docstring for details.
You may run the following command to run a quick unit test on your Q5 implementation:
python3.6 autograder.py -q Q5
In classification.py, implement the following functions to formulate this classification problem as a linear regression problem and solve it using the closed form solution:
setup_design_matrix
setup_onehot_label_matrix
linear_closed_form_fit
predict_labels_from_regression
compute_accuracy
confusion_matrix
You will have to rely on work from Q2 on the written component of this assignment for how to formulate this as a linear least squares problem and solve.
We require that you use numpy operations, rather than for loops, to implement this closed form solution. You may use numpy.linalg.pinv, but you may NOT use numpy solvers, such as numpy.linalg.solve or numpy.linalg.lstsq.
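The sketch below illustrates the mechanics on a tiny three-class toy problem: building a one-hot label matrix with NumPy indexing, fitting with numpy.linalg.pinv, and turning regression outputs back into labels with argmax. This is one common formulation and may not match the derivation from the written component, which is what you should follow; all names and data here are made up.

```python
import numpy as np

# Toy problem: 6 samples, 2 features, 3 classes (the real data has 10 digits).
features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0],
                     [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
labels = np.array([0, 0, 1, 1, 2, 2])
num_classes = 3

# Design matrix with a leading column of ones for the bias term.
X = np.column_stack([np.ones(len(features)), features])

# One-hot label matrix: row i has a 1 in column labels[i], zeros elsewhere.
Y = np.zeros((len(labels), num_classes))
Y[np.arange(len(labels)), labels] = 1.0

# Least-squares weights via the pseudo-inverse, one column per class.
W = np.linalg.pinv(X) @ Y

# Predicted label = index of the largest regression output in each row.
predicted = np.argmax(X @ W, axis=1)
print(predicted)  # [0 0 1 1 2 2]
```

The paired index arrays in Y[np.arange(...), labels] set one entry per row without a Python loop, which is the style the assignment asks for.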
You may run the following command to run a quick unit test on your implementation:
python3.6 autograder.py -q Q6
Question for the write-up: How many times is an eight in the training set incorrectly labelled as a nine?
Question for the write-up: What is the accuracy on the training set?
Question for the write-up: What is the accuracy on the validation set? You will need to write additional code to answer this.
Additional code: Any additional code that you write to answer these questions should be included in additional_code.py
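To show how these quantities relate, here is a small sketch that computes accuracy and a table of counts from label arrays. The helper names and the convention that rows are true labels and columns are predicted labels are assumptions; check the compute_accuracy and confusion_matrix docstrings for the convention the assignment expects.

```python
import numpy as np

def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(true_labels == predicted_labels))

def confusion_counts(true_labels, predicted_labels, num_classes):
    """Entry [i, j] counts samples with true label i predicted as j."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (true_labels, predicted_labels), 1)
    return counts

true_labels = np.array([8, 8, 8, 9, 9, 0])
predicted_labels = np.array([8, 9, 9, 9, 8, 0])
cm = confusion_counts(true_labels, predicted_labels, num_classes=10)
print(accuracy(true_labels, predicted_labels))  # 0.5
print(cm[8, 9])  # 2: eights predicted as nines
```

With this convention, the diagonal holds the correct predictions, so the accuracy also equals the trace of the matrix divided by the number of samples.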
Neural networks for digit classification!
In classification.py, implement the following methods and functions:
DigitNet.__init__
DigitNet.forward
train_neural_net
predict_labels_from_network
In this question, you will implement the following specific neural network in the DigitNet class:
$$\text{Input}_{784} \rightarrow \text{Linear}_{50} \rightarrow \text{ReLU} \rightarrow \text{Linear}_{50} \rightarrow \text{ReLU} \rightarrow \text{Linear}_{10}$$
where the \(N\) in \(\text{Linear}_N\) is the number of output values for that linear function.
In the classification.train_neural_net function, you'll have to provide the code to set up the PyTorch data loader, loss function, SGD optimization, and for loops to train the network. This code is a lot like the code we provided for you in Q3 and Q4. See the docstring in the code for more details on exactly which settings you may use for batch size, learning rate, number of iterations, etc.
Again, it will be helpful to follow https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#define-a-convolutional-neural-network as a template, but don't use convolution and pool layers, just use torch.nn.Linear and torch.nn.functional.relu layers. Make sure that the number of inputs argument to torch.nn.Linear is the same as the number of outputs in the previous layer.
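The pieces named above (data loader, loss function, SGD optimizer, nested training loops) fit together roughly as in the sketch below, which runs on random stand-in data rather than MNIST. The class name SketchDigitNet, the cross-entropy loss, the batch size, the learning rate, and the number of epochs are all assumptions for illustration; the settings you are actually allowed to use are the ones in the train_neural_net docstring.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

class SketchDigitNet(torch.nn.Module):
    """784 -> Linear(50) -> ReLU -> Linear(50) -> ReLU -> Linear(10)."""

    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(784, 50)
        self.fc2 = torch.nn.Linear(50, 50)
        self.fc3 = torch.nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)  # raw class scores; CrossEntropyLoss adds softmax

torch.manual_seed(0)
# Random stand-in data: 64 fake "images" with integer labels in [0, 10).
images = torch.rand(64, 784)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

net = SketchDigitNet()
criterion = torch.nn.CrossEntropyLoss()                # assumed loss
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)  # assumed learning rate

for epoch in range(3):                     # assumed number of epochs
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(net(batch_images), batch_labels)
        loss.backward()
        optimizer.step()

# Predicted label = index of the highest score for each image.
with torch.no_grad():
    predicted = torch.argmax(net(images), dim=1)
print(predicted.shape)  # torch.Size([64])
```

The DataLoader handles shuffling and batching, so the inner loop only has to deal with one mini-batch at a time.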
You may run the following command to run a unit test on your implementation:
python3.6 autograder.py -q Q7
Question for the write-up: What is the accuracy on the training set?
Question for the write-up: What is the accuracy on the validation set? You will need to write additional code to answer this.
For the write-up, it is ok if these numbers come from a different training run than the autograder.
Additional code: Any additional code that you write to answer these questions should be included in additional_code.py
Complete all questions as specified in the above instructions. Then upload regression.py, classification.py, and additional_code.py to Gradescope. Your submission should finish running within 20 minutes, after which it will time out on Gradescope.
Don't forget to include any requested results in the PDF of the Written component, which is to be submitted on Gradescope as well.
You may submit to Gradescope as many times as you like. You may also run the autograder on your own machine to speed up the development process. Just note that the autograder on Gradescope will be slightly different than the local autograder. The autograder can be invoked on your own machine using the command:
python3.6 autograder.py
Note that running the autograder locally will not register your grades with us. Remember to submit your code when you want to register your grades for this assignment.
The autograder on Gradescope might take a while but don't worry: so long as you submit before the deadline, it's not late.