Regularized Logistic Regression

Introduction

Welcome to the programming component of this assignment!

This assignment includes an autograder for you to grade your answers on your machine. This can be run with the command:

python3.6 autograder.py

The code for this assignment consists of several Python files, some of which you will need to read and understand in order to complete the assignment, and some of which you can ignore. You can download and unzip all the code, data, and supporting files from hw4_programming.zip.

Files you will edit

logistic_regression.py Your code to implement regularized logistic regression tasks.
additional_code.py You shouldn't need this file for this assignment, but it is provided in case you have additional code that doesn't fit into logistic_regression.py. If you do submit this file, the code should be runnable by calling python3.6 additional_code.py, but there are no requirements on the format and it will not be executed by the autograder.

Files you might want to look at

util.py Convenience methods to generate various plots that will be needed in this assignment.
test_cases/Q*/*.py These are the unit tests that the autograder runs. Ideally, you would be writing these unit tests yourself, but we are saving you a bit of time and allowing the autograder to check these things. You should definitely be looking at these to see what is and is not being tested. The autograder on Gradescope may run a different version of these unit tests.

Files you can safely ignore

autograder.py Autograder infrastructure code.

Files to Edit and Submit: You will fill in portions of logistic_regression.py during the assignment. You should submit this file containing your code and comments to the Programming component on Gradescope. Please do not change the other files in this distribution or submit any of our original files other than these files. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder.

Report: Many of the sections in this programming assignment will contain questions that are not autograded. You will place the requested results in the appropriate locations within the PDF of the Written component of this assignment.

Evaluation: Your assignment will be assessed based on your code, the output of the autograder, and the required contents of the Written component.

Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don't try. We trust you all to submit your own work only; please don't let us down. If you do, we will pursue the strongest consequences available to us.

Getting Help: You are not alone! If you find yourself stuck on something, contact the course staff for help. Office hours, recitation, and Piazza are there for your support; please use them. If you can't make our office hours, let us know and we will schedule more. We want these assignments to be rewarding and instructional, not frustrating and demoralizing. But, we don't know when or how to help unless you ask.

Regularization in Logistic Regression

See the written component of the assignment for a description of the problem setup as well as the objective function and the Newton's method update equation.

Question 1: Objective Function

In logistic_regression.py, implement the objective function to compute the value of the objective for L2-regularized logistic regression. See function docstring for details.

You may run the following command to run a quick unit test on your Q1 implementation:

python3.6 autograder.py -q Q1

We encourage you to write your own code to test out your implementation as you work through the assignment. For example, you may want to use some of the functions in util.py to plot the data that you just loaded.
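
As one possible self-check of this kind, the sketch below evaluates an objective at theta = 0, where the answer is known in closed form. Everything in it is an assumption made for illustration: toy_objective is a hypothetical stand-in (not the function or signature in logistic_regression.py), labels are taken to be in {0, 1}, the data term is a summed negative log-likelihood, and the penalty is (lam / 2) * ||theta||^2 on every weight. The written component defines the actual objective, which may be scaled differently.

```python
import numpy as np

def toy_objective(theta, X, y, lam):
    """Hypothetical stand-in, NOT the assignment's function or signature.

    Assumes labels in {0, 1}, a summed negative log-likelihood, and a
    (lam / 2) * ||theta||^2 penalty applied to every weight.
    """
    z = X @ theta
    # Per-example loss is log(1 + exp(z)) - y * z; logaddexp keeps it stable.
    nll = np.sum(np.logaddexp(0.0, z) - y * z)
    return nll + 0.5 * lam * np.dot(theta, theta)

# Small synthetic data set, used only for this check.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.integers(0, 2, size=20).astype(float)

# At theta = 0 every predicted probability is 0.5, so each of the 20
# examples contributes log(2) and the penalty contributes nothing.
value_at_zero = toy_objective(np.zeros(3), X, y, lam=10.0)
assert np.isclose(value_at_zero, 20 * np.log(2))

# For a fixed nonzero theta, the penalty adds exactly (lam / 2) * ||theta||^2.
theta = np.array([1.0, -2.0, 0.5])
gap = toy_objective(theta, X, y, 2.0) - toy_objective(theta, X, y, 0.0)
assert np.isclose(gap, np.dot(theta, theta))
```

If your objective averages over examples rather than summing, the expected value at theta = 0 would be log(2) instead of N * log(2); adjust the check to match the definition in the written component.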

The autograder will also plot the contours of the objective function for four different values of lambda. It will save these plots as object_lambda_X.png in a new directory named figures. You are required to include these figures as part of the written component of this assignment.

Question for the write-up: What should be the labels for the horizontal and vertical axes of the objective plot?

Question for the write-up: Describe the effect of different lambda values on the objective function.

Question 2: Gradient Descent

In the gradient_descent function in logistic_regression.py, implement gradient descent for L2-regularized logistic regression. See function docstring for details.
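
For orientation only, here is a minimal sketch of batch gradient descent on an L2-regularized logistic loss. It is not the required implementation: the names toy_gradient and toy_gradient_descent, their arguments, the zero initialization, and the fixed iteration count are all hypothetical, and the gradient assumes labels in {0, 1}, a summed negative log-likelihood, and a (lam / 2) * ||theta||^2 penalty. Follow the docstring and the written component for the actual interface and objective.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_gradient(theta, X, y, lam):
    """Gradient of the assumed objective (hypothetical helper):
    sum_i [log(1 + exp(x_i . theta)) - y_i * x_i . theta] + (lam / 2) * ||theta||^2
    """
    return X.T @ (sigmoid(X @ theta) - y) + lam * theta

def toy_gradient_descent(X, y, lam, learning_rate, num_iters):
    """Fixed-step gradient descent from theta = 0 (hypothetical signature)."""
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        # Step against the gradient, scaled by the learning rate.
        theta = theta - learning_rate * toy_gradient(theta, X, y, lam)
    return theta

# Small synthetic data set, used only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = rng.integers(0, 2, size=50).astype(float)

theta_hat = toy_gradient_descent(X, y, lam=1.0, learning_rate=0.01, num_iters=2000)
# Near a minimizer the gradient norm should be close to zero.
print(np.linalg.norm(toy_gradient(theta_hat, X, y, lam=1.0)))
```

Printing the gradient norm (or the objective value) every few iterations is a simple way to see whether a given learning rate is making progress.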

You may run the following command to run a quick unit test on your Q2 implementation:

python3.6 autograder.py -q Q2

The autograder will also plot the gradient descent convergence for three different learning rates. It will save this plot as gradient_descent.png in a new directory named figures. You are required to include this figure as part of the written component of this assignment.

Question for the write-up: Describe the effect of learning rate on gradient descent convergence for this problem.

Question 3: Newton's Method

In logistic_regression.py, implement Newton's method for L2-regularized logistic regression. See function docstring for details.
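
As a rough illustration of a Newton's method update for this kind of problem, the sketch below solves a linear system with the Hessian at each step instead of forming an explicit inverse. It rests on assumptions that may not match the written component: labels in {0, 1}, a summed negative log-likelihood, a (lam / 2) * ||theta||^2 penalty, and the hypothetical names toy_gradient, toy_hessian, and toy_newton. Use the update equation from the written component and the function docstring as the authority.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def toy_gradient(theta, X, y, lam):
    """Gradient of the assumed objective (hypothetical helper)."""
    return X.T @ (sigmoid(X @ theta) - y) + lam * theta

def toy_hessian(theta, X, lam):
    """Hessian of the assumed objective: X^T S X + lam * I,
    where S is diagonal with entries p_i * (1 - p_i)."""
    p = sigmoid(X @ theta)
    s = p * (1.0 - p)
    return X.T @ (s[:, None] * X) + lam * np.eye(X.shape[1])

def toy_newton(X, y, lam, num_iters):
    """Undamped Newton iterations from theta = 0 (hypothetical signature)."""
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        g = toy_gradient(theta, X, y, lam)
        H = toy_hessian(theta, X, lam)
        # Solve H * step = g rather than computing the inverse of H.
        theta = theta - np.linalg.solve(H, g)
    return theta

# Small synthetic data set, used only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = rng.integers(0, 2, size=50).astype(float)

theta_hat = toy_newton(X, y, lam=1.0, num_iters=10)
print(np.linalg.norm(toy_gradient(theta_hat, X, y, lam=1.0)))
```

Using np.linalg.solve is a design choice in this sketch; it avoids explicitly inverting the Hessian, which is generally slower and less numerically stable.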

You may run the following command to run a quick unit test on your Q3 implementation:

python3.6 autograder.py -q Q3

We encourage you to write your own code to test out your implementation as you work through the assignment. For example, you may want to use some of the functions in util.py to plot the data that you just loaded.

The autograder will also plot the convergence of Newton's method and gradient descent for two different lambda values. It will save these plots as newtons_method_lambda_X.png in a new directory named figures. You are required to include these figures as part of the written component of this assignment.

Question for the write-up: Newton's method, being a quadratic approximation method, is clearly problematic for some lambda values but not for others. Describe the effect of different lambda values on Newton's method convergence for this problem.

Submission

Complete all questions as specified in the above instructions. Then upload logistic_regression.py (and additional_code.py if you used it) to Gradescope. Your submission should finish running within 10 minutes, after which it will time out on Gradescope.

Don't forget to include any requested results in the PDF of the Written component, which is to be submitted on Gradescope as well.

You may submit to Gradescope as many times as you like. You may also run the autograder on your own machine to speed up the development process. Just note that the autograder on Gradescope will be slightly different than the local autograder. The autograder can be invoked on your own machine using the command:

python3.6 autograder.py

Note that running the autograder locally will not register your grades with us. Remember to submit your code when you want to register your grades for this assignment.

The autograder on Gradescope might take a while but don't worry: so long as you submit before the deadline, it's not late.