CS 7545 Ideas for Projects
One of the course requirements is to do a project, which you may
do individually or in a group of 2.
- You could choose to do a small project (if you prefer the homework-oriented grading scheme): this might involve conducting a small experiment or reading a couple of papers and presenting the main ideas. The end result should be a 3-5 page report and a 10-15 minute presentation.
- Alternatively, you could choose to do a larger project (if you prefer the project-oriented grading scheme): this might involve conducting a novel experiment, thinking about a concrete open theoretical question, thinking about how to formalize an interesting new topic, or trying to relate several problems. The end result should be a 10-15 page report and a 40-45 minute presentation.
Here are a few ideas for possible topics for projects. You might also want to take a look at recent COLT, ICML, or NIPS proceedings. All the recent COLT proceedings contain a few open problems, some with monetary rewards!
Project Ideas
Machine learning lenses in other areas:
- M.F. Balcan and N. Harvey. Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
- M.F. Balcan, E. Blais, A. Blum, and L. Yang. Active Property Testing. FOCS 2012.
- M.F. Balcan, A. Blum, J. Hartline, and Y. Mansour. Mechanism Design via Machine Learning. FOCS 2005.
- A. Blum and Y. Mansour. Learning, Regret Minimization, and Equilibria. Book chapter in Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay Vazirani, eds.
- Y. Chen and J. Wortman Vaughan. Connections Between Markets and Learning. ACM SIGecom Exchanges 2010.
Distributed machine learning:
- M.F. Balcan, A. Blum, S. Fine, and Y. Mansour. Distributed Learning, Communication Complexity and Privacy. COLT 2012.
- H. Daume III, J. M. Phillips, A. Saha, and S. Venkatasubramanian. Efficient Protocols for Distributed Classification and Optimization. ALT 2012.
- M.F. Balcan, S. Ehrlich, and Y. Liang. Distributed Clustering on Graphs. NIPS 2013.
Semi-supervised learning and related topics:
- M.F. Balcan and A. Blum. A Discriminative Model for Semi-Supervised Learning. Journal of the ACM, 2010.
- M.F. Balcan, A. Blum, and Y. Mansour. Exploiting Ontology Structures and Unlabeled Data for Learning. ICML 2013.
- X. Zhu. Semi-Supervised Learning. Encyclopedia of Machine Learning.
- A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr., and T. M. Mitchell. Coupled Semi-Supervised Learning for Information Extraction. International Conference on Web Search and Data Mining (WSDM), 2010.
- L. Xu, M. White, and D. Schuurmans. Optimal Reverse Prediction. Twenty-Sixth International Conference on Machine Learning (ICML), 2009.
- X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. The 20th International Conference on Machine Learning (ICML), 2003.
Interactive learning:
- S. Dasgupta. Coarse sample complexity bounds for active learning. Advances in Neural Information Processing Systems (NIPS), 2005.
- M.F. Balcan, A. Beygelzimer, J. Langford. Agnostic active learning. JCSS 2009 (originally in ICML 2006).
- A. Beygelzimer, S. Dasgupta, and J. Langford. Importance-weighted active learning. ICML 2009.
- M.F. Balcan, S. Hanneke, and J. Wortman. The True Sample Complexity of Active Learning. Machine Learning Journal 2010.
- D. Hsu's PhD thesis: Algorithms for active learning. UCSD 2010.
- V. Koltchinskii. Rademacher Complexities and Bounding the Excess Risk in Active Learning. Journal of Machine Learning Research 2010.
- Y. Wiener and R. El-Yaniv. Agnostic Selective Classification. NIPS 2011.
- S. Hanneke. Rates of Convergence in Active Learning. The Annals of Statistics 2011.
- N. Ailon, R. Begleiter, and E. Ezra. Active learning using smooth relative regret approximations with applications. COLT 2012.
- M.F. Balcan and S. Hanneke. Robust Interactive Learning. COLT 2012.
- M.F. Balcan and P. Long. Active and Passive Learning of Linear Separators under Log-concave Distributions. COLT 2013.
- See also the NIPS 2009 Workshop on Adaptive Sensing, Active Learning, and Experimental Design: Theory, Methods, and Applications.
Noise-tolerant, computationally efficient algorithms:
Clustering and related topics:
Multiclass classification:
- A. Daniely, S. Sabato, and S. Shalev-Shwartz. Multiclass Learning Approaches: A Theoretical Comparison with Implications. NIPS 2012.
- A. Daniely, S. Sabato, S. Ben-David, and S. Shalev-Shwartz. Multiclass Learnability and the ERM Principle. COLT 2011.
Relationship between convex cost functions and discrete loss:
These papers look at the relationship between the convex objective functions that learning algorithms actually optimize and the discrete losses they are meant to control.
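For concreteness, here is a minimal sketch (not taken from any of the papers above; it only illustrates one standard instance of this relationship): the hinge loss and the base-2 logistic loss are convex surrogates that upper bound the discrete 0/1 loss at every margin, so driving the surrogate down also controls the classification error.

    import numpy as np

    def zero_one_loss(margin):
        # Discrete loss: 1 if the example is misclassified (margin <= 0), else 0.
        return (margin <= 0).astype(float)

    def hinge_loss(margin):
        # Convex surrogate used by SVMs: max(0, 1 - margin).
        return np.maximum(0.0, 1.0 - margin)

    def logistic_loss(margin):
        # Convex surrogate used by logistic regression, taken in base 2 so that
        # it upper bounds the 0/1 loss: log2(1 + exp(-margin)).
        return np.log2(1.0 + np.exp(-margin))

    margins = np.linspace(-3, 3, 13)  # signed margins y * f(x)
    for m, z, h, l in zip(margins, zero_one_loss(margins),
                          hinge_loss(margins), logistic_loss(margins)):
        # Both surrogates are >= the 0/1 loss at every margin value.
        print(f"margin {m:+.1f}: 0/1 {z:.0f}  hinge {h:.2f}  logistic {l:.2f}")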
Boosting related topics:
Learning with kernel functions:
Learning in Markov Decision Processes: See M. Kearns's home page and Y. Mansour's home page for a number of good papers. Also S. Kakade's thesis.
PAC-Bayes bounds, shell bounds, and other methods of obtaining confidence bounds:
Learning in Graphical Models (Bayes Nets):