CS 7545 Ideas for Projects 
One of the course requirements is to do a project, which you may
do individually or in a group of 2. 
- You could choose to do a small project (if you prefer the homework-oriented grading scheme): this might involve conducting a small experiment, or reading a couple of papers and presenting the main ideas. The end result should be a 3-5 page report and a 10-15 minute presentation.
- Alternatively, you could choose to do a larger project (if you prefer the project-oriented grading scheme): this might involve conducting a novel experiment, thinking about a concrete open theoretical question, thinking about how to formalize an interesting new topic, or trying to relate several problems. The end result should be a 10-15 page report and a 40-45 minute presentation.
Here are a few ideas for possible topics for projects. You might also want to take a look at recent COLT, ICML, or NIPS proceedings. All the recent COLT proceedings contain a few open problems, some with monetary rewards!
Project Ideas
 Machine learning lenses in other areas:
- M.F. Balcan and N. Harvey. 
 Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
- M.F. Balcan,  E. Blais, A. Blum, and L. Yang. 
Active Property Testing. FOCS 2012. 
- M.F. Balcan, A. Blum, J. Hartline, and Y. Mansour. 
Mechanism Design via Machine Learning. FOCS 2005.
- A. Blum and Y. Mansour. 
Learning, Regret Minimization, and Equilibria. Book chapter in Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay Vazirani, eds.
-  Y. Chen and J. Wortman Vaughan. 
 Connections Between Markets and Learning. ACM SIGecom Exchanges 2010.
Distributed machine learning:
- M.F. Balcan, A. Blum, S. Fine, and Y. Mansour. 
Distributed Learning, Communication Complexity and Privacy. COLT 2012.
- H. Daume III, J. M. Phillips, A. Saha, and S. Venkatasubramanian. 
Efficient Protocols for Distributed Classification and Optimization. ALT 2012.
- M.F. Balcan, S. Ehrlich, and Y. Liang. 
 Distributed Clustering on Graphs. NIPS 2013.
Semi-supervised learning and related topics:
- M.F. Balcan and A. Blum. 
A Discriminative Model for Semi-Supervised Learning. Journal of the ACM, 2010.
- M.F. Balcan, A. Blum, and Y. Mansour.
Exploiting Ontology Structures and Unlabeled Data for Learning. ICML 2013.
- X. Zhu.
Semi-Supervised Learning. Encyclopedia of Machine Learning.
- A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr., and T. M. Mitchell. Coupled Semi-Supervised Learning for Information Extraction. International Conference on Web Search and Data Mining (WSDM), 2010.
- L. Xu, M. White, and D. Schuurmans. Optimal Reverse Prediction. Twenty-Sixth International Conference on Machine Learning (ICML), 2009.
- X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. The 20th International Conference on Machine Learning (ICML), 2003.
Interactive learning:
- S. Dasgupta. Coarse sample complexity bounds for active learning. Advances in Neural Information Processing Systems (NIPS), 2005.
- M.F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. JCSS 2009 (originally in ICML 2006).
- A. Beygelzimer, S. Dasgupta, and J. Langford. Importance-weighted active learning. ICML 2009.
- M.F. Balcan, S. Hanneke, and J. Wortman. The True Sample Complexity of Active Learning. Machine Learning Journal 2010.
- D. Hsu. Algorithms for active learning. PhD thesis, UCSD 2010.
- V. Koltchinskii. Rademacher Complexities and Bounding the Excess Risk in Active Learning. Journal of Machine Learning Research 2010.
- Y. Wiener and R. El-Yaniv. Agnostic Selective Classification. NIPS 2011. 
- S. Hanneke. Rates of Convergence in Active Learning. The Annals of Statistics 2011.
- N. Ailon, R. Begleiter, and E. Ezra. Active learning using smooth relative regret approximations with applications. COLT 2012.
- M.F. Balcan and S. Hanneke. Robust Interactive Learning. COLT 2012.
- M.F. Balcan and P. Long. Active and Passive Learning of Linear Separators under Log-concave Distributions. COLT 2013.
- See also the NIPS 2009 Workshop on Adaptive Sensing, Active Learning, and Experimental Design: Theory, Methods, and Applications.
Noise tolerant computationally efficient algorithms:
Clustering and related topics:
 Multiclass classification:
- A. Daniely, S. Sabato, and S. Shalev-Shwartz. Multiclass Learning Approaches: A Theoretical Comparison with Implications. NIPS 2012.
- A. Daniely, S. Sabato, S. Ben-David, and S. Shalev-Shwartz. Multiclass Learnability and the ERM Principle. COLT 2011.
Relationship between convex cost functions and discrete loss:
These papers look at relationships between different kinds of objective
functions for learning problems.
Boosting related topics:
Learning with kernel functions:
Learning in Markov Decision Processes: See M. Kearns's home page and Y. Mansour's home page for a number of good papers. Also S. Kakade's thesis.
PAC-Bayes bounds, shell-bounds, other methods of obtaining confidence bounds. Some papers:
 Learning in Graphical Models (Bayes Nets)