8803 Ideas for Projects
One of the course requirements is to complete a small project, which you may
do individually or in a group of two. A project might involve conducting
an experiment, thinking about a theoretical problem, or trying to
relate two problems. It could even just be reading two research papers
and explaining how they relate. The end result should be a 5-10 page
report and a 10-15 minute presentation. Here are a few ideas for
possible project topics. You might also want to take a look at
recent COLT, ICML, or NIPS proceedings. All the recent COLT
proceedings contain a few open problems, some with monetary rewards!
Project Ideas
Semi-supervised learning and related topics:
- M.F. Balcan and A. Blum. A Discriminative Model for Semi-Supervised
Learning. Journal of the ACM, 2010.
- M.F. Balcan and A. Blum. Open Problems in Efficient Semi-Supervised
PAC Learning. Open problem, COLT 2007. [Monetary reward!]
- A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr., and
T. M. Mitchell. Coupled Semi-Supervised Learning for Information
Extraction. International Conference on Web Search and Data Mining
(WSDM), 2010.
- L. Xu, M. White, and D. Schuurmans. Optimal Reverse Prediction.
Twenty-Sixth International Conference on Machine Learning (ICML), 2009.
Active learning:
- S. Dasgupta. Coarse sample complexity bounds for active learning.
Advances in Neural Information Processing Systems (NIPS), 2005.
- A. Beygelzimer, S. Dasgupta, and J. Langford. Importance-weighted
active learning. Twenty-Sixth International Conference on Machine
Learning (ICML), 2009.
- M.F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active
learning. JCSS, 2009.
- S. Fine and Y. Mansour. Active Sampling for Multiple Output
Identification. COLT 2006.
- M.F. Balcan, S. Hanneke, and J. Wortman. The True Sample Complexity
of Active Learning. Machine Learning Journal, 2010.
- S. Hanneke's thesis, Theoretical Foundations of Active Learning.
CMU, 2009.
- See also the NIPS 2009 Workshop on Adaptive Sensing, Active
Learning, and Experimental Design: Theory, Methods, and Applications.
Clustering and related topics:
Relationship between convex cost functions and discrete loss:
These papers look at relationships between different kinds of objective
functions for learning problems.
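As a concrete illustration of the kind of relationship these papers study (this sketch is mine, not taken from any of the papers): convex losses such as the hinge loss and the logistic loss can be viewed as upper bounds on the discrete 0-1 loss as functions of the margin z = y*f(x), which is what makes convex optimization a usable proxy for minimizing classification error.

```python
import math

# Sketch: standard losses as functions of the margin z = y * f(x).
# The two convex surrogates upper-bound the discrete 0-1 loss everywhere.

def zero_one_loss(z):
    """Discrete loss: 1 if the prediction disagrees in sign with the label."""
    return 1.0 if z <= 0 else 0.0

def hinge_loss(z):
    """Convex surrogate used by SVMs."""
    return max(0.0, 1.0 - z)

def logistic_loss(z):
    """Convex surrogate used by logistic regression (base 2, so it
    equals 1 at z = 0 and bounds the 0-1 loss)."""
    return math.log2(1.0 + math.exp(-z))

# Check the upper-bound property at a few margin values.
for z in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    assert hinge_loss(z) >= zero_one_loss(z)
    assert logistic_loss(z) >= zero_one_loss(z)
```

The base-2 scaling of the logistic loss is chosen only so the bound is tight at z = 0; the natural-log version bounds the 0-1 loss after multiplying by 1/ln 2.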
Boosting related topics:
Efficient agnostic learning:
Learning with kernel functions:
Learning in Markov Decision Processes: See M. Kearns's home page and
Y. Mansour's home page for a number of good papers. Also see
S. Kakade's thesis.
PAC-Bayes bounds, shell bounds, and other methods of obtaining
confidence bounds. Some papers:
Learning in Graphical Models (Bayes Nets)