Your Course Project
Your class project is an opportunity to explore an interesting optimization problem of your choice. Your project may involve solving an optimization problem arising in machine learning, AI, or another domain of interest to you, or implementing and evaluating core optimization techniques. All projects must have an implementation component, though theoretical aspects may also be explored. You should also evaluate your approach, preferably on real-world data, though for some projects simulated data may also be appropriate. Below you will find some project ideas, but the best idea would be to combine optimization with problems in your own research area. Your class project must be about new things you have done this semester; you cannot use results you have developed in previous semesters. If you are uncertain about this requirement, please email the instructors.
Projects can be done individually or in teams of two students. Each project will also be assigned a 725 instructor as a project consultant/mentor. They will consult with you on your ideas, but of course the final responsibility to define and execute an interesting piece of work is yours. Your project will be worth 30% of your final class grade and will have two final deliverables:

a writeup in the format of a NIPS paper (8 pages maximum in NIPS format, including references; this page limit is strict), due May 5th by 3pm by email to the instructors list, worth 60% of the project grade, and

a poster presenting your work for a special class poster session on May 1st, 3-6pm in the NSH Atrium, worth 20% of the project grade.
In addition, you must turn in a midway progress report (5 pages maximum in NIPS format, including references) describing the results of your first experiments by April 9th in class, worth 20% of the project grade. Note that, as with any conference, the page limits are strict! Papers over the limit will not be considered.
Project Proposal
You must turn in a brief project proposal (one page maximum) by March 5th in class. Read the list of potential project ideas below. You are encouraged to use one of these ideas. If you prefer to do a different project and you are proposing your own data set, you must have access to this data already and present a clear proposal for what you would do with it.
Project proposal format: Proposals should be one page maximum. Include the following information:

Project title

Project idea. This should be approximately two paragraphs.

Data set you will use.

Software you will need to write.

Papers to read. Include 1-3 relevant papers. You will probably want to read at least one of them before submitting your proposal.

Teammate: will you have a teammate? If so, whom? Maximum team size is two students.

April 9 milestone: What will you complete by April 9? Experimental results of some kind are expected here.
Project suggestions: Below are descriptions of some suggested project ideas. You are encouraged to select and flesh out one of these projects, or make up your own well-specified project. If you have other project ideas you would like to work on, we would consider those as well, provided they are approved by the course instructors.
Suggested Projects
Project 1: Implement a no-regret learner for some interesting problem, e.g., a structured classification problem, the problem of finding a tour in a graph, or the problem of finding an extensive-form correlated equilibrium in a game tree with incomplete information. Some interesting no-regret algorithms include follow the perturbed leader (Kalai & Vempala) and Lagrangian hedging (http://www.cs.cmu.edu/~ggordon/ggordon.CMUCALD05112.noregret.pdf). On request, we can provide a draft related to extensive-form correlated equilibrium.
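As a starting point, follow the perturbed leader is only a few lines in the basic experts setting. The sketch below is a minimal illustration, not the full algorithm from the Kalai & Vempala paper; the perturbation scale eta is an illustrative parameter you would tune.

```python
import random

def ftpl(loss_rounds, eta=1.0, seed=0):
    """Follow the Perturbed Leader for the experts problem (sketch).

    loss_rounds: list of per-round loss vectors, one loss per expert.
    Each round, play the expert whose cumulative loss, minus a fresh
    random perturbation, is smallest.  Returns the algorithm's total
    loss and the final cumulative loss of each expert.
    """
    rng = random.Random(seed)
    n = len(loss_rounds[0])
    cum = [0.0] * n                    # cumulative loss of each expert
    total = 0.0
    for losses in loss_rounds:
        # fresh exponential perturbation each round
        perturbed = [cum[i] - eta * rng.expovariate(1.0) for i in range(n)]
        choice = min(range(n), key=lambda i: perturbed[i])
        total += losses[choice]
        for i in range(n):
            cum[i] += losses[i]
    return total, cum
```

On a toy sequence where one expert always has zero loss, the learner's total loss stays far below the worse expert's, illustrating the no-regret property.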
Project 2: Implement differential dynamic programming (DDP) to find a controller that is optimal (perhaps subject to some constraints) in an interesting problem. The world would be modeled as x(t+1) = f(x(t), action(t), noise(t)), where f() is differentiable but nonlinear. There would be a reward function r(x(t), action(t)) which defines optimality. The controller would look like action(t) = A(t) (x(t) - x0(t)), where A(t) and x0(t) are parameters to be learned. DDP works by starting from some nominal trajectory x(t), action(t), and Taylor expanding f() and r() around the nominal trajectory. It then finds the optimal controller for this linear-quadratic problem using linear algebra, and updates the states and actions to be closer to what the optimal controller says.
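The linear-quadratic subproblem at the heart of each DDP iteration is solved by a backward Riccati recursion. Here is a minimal scalar sketch on hypothetical toy dynamics x(t+1) = a x(t) + b u(t) with cost q x^2 + r u^2 (since these dynamics are already linear, a single backward pass gives the optimal time-varying gains):

```python
def lqr_backward(a, b, q, r, T):
    """Backward Riccati recursion for a scalar horizon-T LQR problem:
    minimize sum of q*x**2 + r*u**2 subject to x(t+1) = a*x + b*u.
    Returns the time-varying feedback gains K(t) for u = -K(t)*x."""
    P = q                                  # value-function curvature at horizon
    gains = []
    for _ in range(T):
        K = (b * P * a) / (r + b * P * b)  # optimal gain at this stage
        P = q + a * P * (a - b * K)        # Riccati update
        gains.append(K)
    gains.reverse()                        # gains[0] is the gain at t = 0
    return gains

def rollout(a, b, x0, gains):
    """Simulate the closed loop x(t+1) = (a - b*K(t)) * x(t)."""
    x, traj = x0, [x0]
    for K in gains:
        x = (a - b * K) * x
        traj.append(x)
    return traj
```

With an unstable open loop (e.g., a = 1.2), the learned gains drive the state to zero, which is the behavior DDP reproduces locally around a nominal trajectory for nonlinear f().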
Project 3: Implement max-margin planning (Ratliff & Bagnell) and use it to learn how to solve an interesting class of planning problems.
Project 4: Implement maximum-margin Markov networks, along with one of the solution methods: SMO, exponentiated gradient, or subgradient.
Project 5: Implement one of the generalizations of max-margin Markov nets: Learning Structured Prediction Models: A Large Margin Approach; Ben Taskar, Vassil Chatalbashev, Daphne Koller and Carlos Guestrin; in the 22nd International Conference on Machine Learning (ICML 2005), Bonn, August 2005.
Project 6: Use the Convex-Concave Procedure to solve a nonconvex optimization problem and compare to local search techniques such as simulated annealing. A. L. Yuille and Anand Rangarajan, "The Concave-Convex Procedure (CCCP)," Neural Computation, Vol. 15, No. 4, April 2003, pp. 915-936.
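To see the mechanics of CCCP on a toy instance (this one-dimensional example is ours, not from the paper), take f(x) = x^4 - 3x^2, split into the convex part x^4 and the concave part -3x^2. Each CCCP step linearizes the concave part at the current iterate and minimizes the resulting convex surrogate, which here has a closed form:

```python
def cccp_quartic(x0, iters=60):
    """Concave-Convex Procedure on the toy objective f(x) = x**4 - 3*x**2,
    split as convex part x**4 plus concave part -3*x**2.

    Each step minimizes the convex part plus the linearization of the
    concave part at x_k:
        x_{k+1} = argmin_x  x**4 - 6*x_k*x,
    i.e. 4*x**3 = 6*x_k, whose solution is the real cube root of 1.5*x_k.
    """
    x = x0
    for _ in range(iters):
        y = 1.5 * x
        # real cube root (Python's ** returns complex for negative bases)
        x = y ** (1.0 / 3.0) if y >= 0 else -((-y) ** (1.0 / 3.0))
    return x
```

Starting from x = 2 the iterates converge to sqrt(1.5), a global minimizer of f; each step provably decreases f, which is the defining property of CCCP.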
Project 7: Use SATURATE to perform robust experimental design in Gaussian processes: A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Selecting observations against adversarial objectives. In NIPS, 2007.
Project 8: Implement Queyranne's algorithm for submodular function minimization and apply it to finding independent sets of variables, e.g., for structure learning in graphical models. Maurice Queyranne. A combinatorial algorithm for minimizing symmetric submodular functions. In SODA, 1995.
Project 9: Use the submodular-supermodular procedure to solve a non-submodular combinatorial optimization problem: M. Narasimhan and J. Bilmes. A submodular-supermodular procedure with applications to discriminative structure learning. In Advances in Neural Information Processing Systems (NIPS) 19, 2006.
Project 11: Learn distance metrics for learning problems using semidefinite programming: E. P. Xing, A. Y. Ng, M. I. Jordan and S. Russell, Distance Metric Learning, with Application to Clustering with Side-Information, Advances in Neural Information Processing Systems 16 (NIPS 2002), (eds. Becker et al.) MIT Press, 521-528, 2002.
Project 12: $\ell_1\text{-}\ell_q$ Regularized Regression and Its Applications. In this project, we consider the problem of grouped variable selection in high-dimensional regression using $\ell_1\text{-}\ell_q$ regularization ($1\leq q \leq \infty$), which can be viewed as a natural generalization of the $\ell_1\text{-}\ell_2$ regularization (the group Lasso). We have analyzed some theoretical properties of such an estimator and implemented the algorithm in Matlab using blockwise coordinate descent and nonlinear thresholding. The main purpose of this project is to apply these programs to simulated data to illustrate the corresponding theoretical results. We will also use them to analyze real-world data, such as text data and microarray data. For reference, see http://arxiv.org/abs/0802.1517
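The key computational primitive in blockwise coordinate descent for $\ell_1\text{-}\ell_2$ regularization is a groupwise soft-thresholding step. The sketch below shows it for the simplest case of an identity design, an illustrative simplification of ours; in the general algorithm, y is replaced by a partial residual for the group being updated.

```python
import math

def group_soft_threshold(y, groups, lam):
    """Groupwise soft-thresholding: exact solution of
        minimize  0.5 * ||y - beta||^2 + lam * sum_g ||beta_g||_2
    when the design matrix is the identity.  Each group's solution is
    a multiplicative shrinkage of y_g toward zero; groups whose norm
    falls below lam are zeroed out entirely (grouped variable selection).
    """
    beta = [0.0] * len(y)
    for g in groups:                   # g is a list of coordinate indices
        norm = math.sqrt(sum(y[i] ** 2 for i in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in g:
            beta[i] = scale * y[i]
    return beta
```

For example, with y = [3, 4, 0.3, 0.4], groups {0,1} and {2,3}, and lam = 1, the first group (norm 5) is shrunk by the factor 0.8 while the second group (norm 0.5) is set exactly to zero.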
Project 13: Simultaneous Sparse Additive Models for Multitask Learning. Sparse additive models (SpAM) are a new class of methods for high-dimensional nonparametric regression and classification. They combine ideas from sparse linear modeling and additive nonparametric regression. Efficient algorithms have been developed to fit the model even when the number of covariates is larger than the sample size, and the approach has been shown to be effective on both simulated and real-world datasets. Current SpAM models are implemented in R using a backfitting-type algorithm. We have also developed an algorithm which extends SpAM from the single-task setting to the multitask setting for joint nonparametric variable selection. In this project, we will apply the multitask SpAM models to both simulated and real-world data. For reference, see http://arxiv.org/abs/0711.4555
- Randomized simplex algorithm: A randomized version of the simplex algorithm was recently developed; unlike the original simplex method, it runs in polynomial time (in RP). See "A randomized polynomial-time simplex algorithm for linear programming" by Kelner and Spielman (2006).
- A linear program solver that handles degenerate cases.
- Interior-point algorithm for linear programming. Originally, implementations were mainly based on Karmarkar's algorithm (Karmarkar, N. "A New Polynomial-Time Algorithm for Linear Programming." Combinatorica 4, 373-395, 1984), but modern implementations are based on Mehrotra's predictor-corrector technique (Mehrotra, S. "On the Implementation of a Primal-Dual Interior Point Method." SIAM J. Optimization 2, 575-601, 1992).
- Some variant of the ellipsoid algorithm, to gauge how it does in practice. You would have to think carefully about the comparison, though: whether to also implement a simplex-based algorithm, or just compare against CPLEX / Matlab.

Boyd presents a modern interior-point method, which is also a good project idea; affine scaling is another possible choice.
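To give a flavor of affine scaling, here is a sketch on a toy LP over the probability simplex (min c^T x subject to sum(x) = 1, x > 0). Because the constraint matrix is a single row of ones, the usual projected-gradient computation reduces to scalar arithmetic; the damping factor gamma and iteration count are illustrative choices.

```python
def affine_scaling_simplex(c, iters=100, gamma=0.5):
    """Affine-scaling iterations for the toy LP
        min c^T x   s.t.   sum(x) = 1,   x > 0.
    With A = [1,...,1]: dual estimate w = sum(x_i^2 c_i)/sum(x_i^2),
    reduced costs r = c - w, search direction dx_i = -x_i^2 * r_i
    (which keeps sum(x) fixed), damped so every coordinate stays
    strictly positive.
    """
    n = len(c)
    x = [1.0 / n] * n                       # start at the analytic center
    for _ in range(iters):
        s2 = sum(xi * xi for xi in x)
        w = sum(xi * xi * ci for xi, ci in zip(x, c)) / s2
        r = [ci - w for ci in c]
        dx = [-xi * xi * ri for xi, ri in zip(x, r)]
        # ratio test: largest step keeping every coordinate positive
        t = min((-xi / di for xi, di in zip(x, dx) if di < 0), default=1.0)
        x = [xi + gamma * t * di for xi, di in zip(x, dx)]
    return x
```

On this problem the optimum puts all mass on the cheapest coordinate, and the iterates approach that vertex through the interior, which is the characteristic behavior of affine scaling.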