\documentclass[12pt]{article}
%\usepackage{times}

\newcommand{\coursenumber}{10-701/15-781}
\newcommand{\coursetitle}{Machine Learning}
\newcommand{\courseterm}{Fall 2003}
\newcommand{\outdate}{November 11, 2003}
\newcommand{\duedate}{start of class November 25, 2003}
\newcommand{\doctitle}{Homework 6/Mini-project 2}

\addtolength{\textwidth}{1.00in}
\addtolength{\textheight}{1.00in}
\addtolength{\evensidemargin}{-.50in}
\addtolength{\oddsidemargin}{-0.50in}
\addtolength{\topmargin}{-.50in}

\newcommand{\vx}{{\mathbf x}}
\newcommand{\vt}{{\mathbf t}}
\newcommand{\vmu}{\mbox{\boldmath$\mu$}}
\newcommand{\vgamma}{\mbox{\boldmath$\gamma$}}

\newcommand{\bi}{\begin{enumerate}}
\newcommand{\ei}{\end{enumerate}}


\begin{document}

\begin{center}
\Large \bf \coursenumber\ \coursetitle,\ \courseterm
\end{center}
\vspace*{0.10in}
\par {\bf \doctitle}
\par {\bf Out:\ \ \outdate \hspace*{\fill} Due:\ \ \duedate}
\vspace{.1in}
\par If you have questions, please contact Ning Hu 
{\tt <ninghu+781@cs.cmu.edu>}.
\vspace{.1in}
\hrule
\vspace{.1in}

\leftskip = 14pt
\parskip = 12pt
\parindent = 0pt

In Mini-project 1, you formulated and solved a machine learning problem using a
real-world data set. This time, you will have the opportunity
to tackle a new one. You may work with any data set you wish.
You {\em are} also allowed to follow up on your original project work, if you
have ideas for substantial new improvements or investigations.

\section*{What you will do.}

\begin{enumerate}

\item Find a partner to work on this project. Two-person teams are encouraged,
though you may work alone if you prefer.  No three-person teams, please.

\item Choose your data set. There are several choices:

\begin{itemize}

\item You may use one of the data sets provided in Mini-project 1,
as long as it is not the one you already used in that mini-project.
If you choose to follow up on your previous project work, however, you may still
use the original data set.

\item There are some good sources of interesting data sets: 

\begin {itemize}

\item UCI Knowledge Discovery in Databases Archive \\
{\tt \ \  http://kdd.ics.uci.edu/}

\item UCI Machine Learning Repository \\
{\tt \ \ http://www.ics.uci.edu/\char126 mlearn/MLRepository.html}

\item StatLib---Datasets Archive \\
{\tt \ \ http://lib.stat.cmu.edu/datasets/}

\end {itemize}

\item You may also generate or obtain your own data sets.

\end {itemize}

\item Define a learning task appropriate to your data set.

\item Perform the work.  As a guideline, we expect each student to spend 7--12
hours on this homework over the course of two weeks (remember you are working
in pairs, so you can do a fairly substantial project). 


\item Turn in a short write-up.  Your write-up should describe 
{\em precisely} your data source, your learning task(s), your learning method(s) including 
how you represented the data for input to the learner, your experiments, 
results, and any conclusions you draw from them. If you choose to follow up on 
your original project work, you must label your write-up as ``Follow up'' work and include
a copy of your original Mini-project 1 submission. Be clear in your write-up
about what you have done that is new. 

The clarity and content of your write-up will have a primary impact on
your grade.  Reports must be no more than 5 pages in 11-point font,
including figures. Each two-person team must hand in a single write-up.

\end{enumerate}


\section*{Grading and determining when you have done enough.}

We strongly advise that you develop a baseline learning system by the 
end of week 1, then work on some interesting extension or alternative during week 2.
A project that does a solid job applying the given code and carefully
evaluating and describing it might get 75--80\% credit.  A project that in
addition pursues an interesting second approach or alternative problem might
get 90--100\% credit. 

Be creative!  Exploring your own interesting ideas and comparing them with
the baseline approaches will receive credit whether they beat the baseline or
not.

If you choose to follow up on your original project work,
the grading will depend on how you design and implement your new idea, how much
improvement it achieves compared to your original baseline approach, and
what conclusions you can draw from the experiments.

\end{document}