CMU 10-806 Foundations of Machine Learning and Data
Science, Fall 2015
Ideas for Projects
One of the course requirements is to do a project, which you may
do individually or in a group of 2 to 3.
Here are a few ideas for possible project topics. You might also want to take a look at recent COLT and ICML proceedings. All the recent COLT proceedings contain a few open problems, some with monetary rewards!
You could choose to do a small project (if you prefer the homework-oriented grading scheme): this might involve conducting a small experiment or reading a couple of papers and presenting the main ideas. The end result should be a 3-5 page report.
Alternatively, you could choose to do a larger project (if you prefer the project-oriented grading scheme): this might involve a larger novel experiment, thinking about a concrete open theoretical question, thinking about how to formalize an interesting new topic, or trying to relate several problems. The end result should be a 10-15 page report. We will also have a day for project presentations (format TBD).
Noise tolerant computationally efficient algorithms:
- A. Daniely. A PTAS for Agnostically Learning Halfspaces. COLT 2015.
- U. Feige, Y. Mansour, R. Schapire. Learning and inference in the presence of corrupted inputs. COLT 2015.
- P. Awasthi, M.F. Balcan and P. Long. The Power of Localization for Efficiently Learning Linear Separators with Noise. STOC 2014.
- P. M. Long and R. A. Servedio. Learning large-margin halfspaces with more malicious noise. NIPS 2011.
- P. Awasthi, A. Blum, and O. Sheffet. Improved Guarantees for Agnostic Learning of Disjunctions. COLT 2010.
- A. Kalai, A. Klivans, Y. Mansour, and R. Servedio. Agnostically Learning Halfspaces. FOCS 2005.
- W. S. Lee, P. L. Bartlett, and R. C. Williamson. Efficient Agnostic Learning of Neural Networks with Bounded Fan-in. IEEE Trans. Info. Theory, 1996.
- M. Kearns and M. Li. Learning in the Presence of Malicious Errors. SIAM Journal on Computing, 1993.
Machine learning lenses in other areas:
- K. Amin, R. Cummings, L. Dworkin, M. Kearns, and A. Roth. Online Learning and Profit Maximization from Revealed Preferences. AAAI 2015.
- J. Morgenstern and T. Roughgarden. The Pseudo-Dimension of Nearly-Optimal Auctions. NIPS 2015.
- M.F. Balcan, A. Daniely, R. Mehta, R. Urner, and V. Vazirani. Learning Economic Parameters from Revealed Preferences. WINE 2014.
- R. Cole, T. Roughgarden. The sample complexity of revenue maximization. STOC 2014.
- M.F. Balcan and N. Harvey. Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
- M.F. Balcan, E. Blais, A. Blum, and L. Yang. Active Property Testing. FOCS 2012.
- Y. Chen and J. Wortman Vaughan. Connections Between Markets and Learning. ACM SIGecom Exchanges 2010.
- M.F. Balcan, A. Blum, J. Hartline, and Y. Mansour. Mechanism Design via Machine Learning. FOCS 2005.
- A. Blum and Y. Mansour. Learning, Regret Minimization, and Equilibria. Book chapter in Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay Vazirani, eds.
Distributed machine learning:
- Y. Arjevani and O. Shamir. Communication Complexity of Distributed Convex Learning and Optimization. NIPS 2015.
- Y. Zhang, M. Wainwright and M. Jordan. Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds. ICML 2015.
- O. Shamir, N. Srebro and T. Zhang. Communication Efficient Distributed Optimization using an Approximate Newton-type Method. ICML 2014.
- J.C. Duchi, M. Wainwright, and Y. Zhang. Communication-Efficient Algorithms for Statistical Optimization. JMLR 2013.
- M.F. Balcan, S. Ehrlich, and Y. Liang. Distributed Clustering on Graphs. NIPS 2013.
- M.F. Balcan, A. Blum, S. Fine, and Y. Mansour. Distributed Learning, Communication Complexity and Privacy. COLT 2012.
Semi-supervised learning and related topics:
- A. Balsubramani and Y. Freund.
Optimally Combining Classifiers Using Unlabeled Data. COLT 2015.
- M.F. Balcan, A. Blum, and Y. Mansour.
Exploiting Ontology Structures and Unlabeled Data for Learning. ICML 2013.
- Z. Karnin, E. Liberty, S. Lovett, R. Schwartz and O. Weinstein. Unsupervised SVMs: On the Complexity of the Furthest Hyperplane Problem. COLT 2012.
- M.F. Balcan and A. Blum.
A Discriminative Model for Semi-Supervised Learning. Journal of
the ACM, 2010.
- A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr., and T. M. Mitchell. Coupled Semi-Supervised Learning for Information Extraction. ACM International Conference on Web Search and Data Mining (WSDM), 2010.
- X. Zhu.
Semi-Supervised Learning. Encyclopedia of Machine Learning.
- L. Xu, M. White, and D. Schuurmans. Optimal Reverse Prediction. Twenty-Sixth International Conference on Machine Learning (ICML), 2009.
- X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. Twentieth International Conference on Machine Learning (ICML), 2003.
Active learning:
- K. Chaudhuri, S. Kakade, P. Netrapalli and S. Sanghavi. Convergence Rates of Active Learning for Maximum Likelihood Estimation. NIPS 2015.
- G. Dasarathy, R. Nowak and X. Zhu. S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification. COLT 2015.
- T.-K. Huang, A. Agarwal, D. Hsu, J. Langford, and R. Schapire, Efficient and Parsimonious Agnostic Active Learning. NIPS 2015.
- S. Hanneke. Theory of Disagreement-Based Active Learning.
Foundations and Trends in Machine Learning, Vol. 7 (2-3), pp. 131-309.
- M.F. Balcan and P. Long. Active and Passive Learning of Linear Separators under Log-concave Distributions. COLT 2013.
- M.F. Balcan and S. Hanneke. Robust Interactive Learning. COLT 2012.
- M.F. Balcan, A. Beygelzimer, J. Langford. Agnostic active learning. JCSS 2009 (originally in ICML 2006).
- A. Beygelzimer, S. Dasgupta, and J. Langford. Importance-weighted active learning. ICML 2009.
- V. Koltchinskii. Rademacher Complexities and Bounding the Excess Risk in Active Learning. Journal of Machine Learning Research 2010.
- S. Hanneke. Rates of Convergence in Active Learning. The Annals of Statistics 2011.
- See also the ICML 2015 Workshop on Advances in Active Learning -
Bridging Theory and Practice.
Online learning and bandits:
- O. Dekel, J. Ding, T. Koren, Y. Peres. Bandits with switching costs: T^(2/3) regret. STOC 2014.
- U. Feige, T. Koren, M. Tennenholtz. Chasing Ghosts: Competing with Stateful Policies. FOCS 2014.
- Y. Mansour, A. Slivkins, V. Syrgkanis. Bayesian Incentive-Compatible Bandit Exploration. EC 2015.
- A. Rakhlin, K. Sridharan. Online Learning with Predictable Sequences. COLT 2013.
- A. Rakhlin, K. Sridharan. Optimization, Learning, and Games with Predictable Sequences. NIPS 2013.
Clustering and related topics:
- J. Eldridge, M. Belkin, Y. Wang. Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering. COLT 2015.
- P. Awasthi and M.F. Balcan. Center Based Clustering: A Foundational Perspective. Book Chapter in Handbook of Cluster Analysis, 2015.
- Y. Li, K. He, D. Bindel, and J.E. Hopcroft. Uncovering the Small Community Structure in Large Networks: A Local Spectral Approach. WWW 2015.
- M. Ackerman and S. Dasgupta. Incremental clustering: the case for extra clusters. NIPS 2014.
- M.F. Balcan, C. Borgs, M. Braverman, J. Chayes, and S. Teng. Finding Endogenously Formed Communities. SODA 2013.
- D. Hsu and S. M. Kakade. Learning Mixtures of Spherical Gaussians: Moment Methods and Spectral Decompositions. ITCS 2013.
- S. Vassilvitskii and S. Venkatasubramanian. New Developments In The Theory Of Clustering. Tutorial at KDD 2010.
- M.F. Balcan and P. Gupta. Robust Hierarchical Clustering. COLT 2010.
- M.F. Balcan, A. Blum, and S. Vempala. A Discriminative Framework for Clustering via Similarity Functions. STOC 2008. See also full version.
Relationship between convex cost functions and discrete loss:
These papers look at relationships between different kinds of objective
functions for learning problems.
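As a minimal illustration of the kind of relationship these papers study (this sketch is not drawn from any of them): the 0-1 loss is a non-convex function of the margin y*f(x), while surrogates such as the hinge loss max(0, 1 - margin) are convex upper bounds on it, which is what makes efficient optimization tractable.

```python
def zero_one_loss(margin: float) -> float:
    """Discrete 0-1 loss as a function of the margin y * f(x)."""
    return 1.0 if margin <= 0 else 0.0


def hinge_loss(margin: float) -> float:
    """Convex surrogate: max(0, 1 - margin), an upper bound on 0-1 loss."""
    return max(0.0, 1.0 - margin)


if __name__ == "__main__":
    # The surrogate dominates the discrete loss at every margin value.
    for m in [-3.0, -1.0, 0.0, 0.5, 1.0, 2.0]:
        assert hinge_loss(m) >= zero_one_loss(m)
        print(f"margin={m:+.1f}  0-1={zero_one_loss(m):.1f}  hinge={hinge_loss(m):.1f}")
```

Bounds relating minimizers of such surrogates to minimizers of the discrete loss are the typical object of study here.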
Boosting related topics:
Learning with kernel functions:
Learning in Markov Decision Processes: see M. Kearns's home page and Y. Mansour's home page for a number of good papers. Also S.
Multiclass learning:
- A. Daniely and S. Shalev-Shwartz. Optimal Learners for Multiclass Problems. COLT 2014.
- A. Daniely, S. Sabato, and S. Shalev-Shwartz. Multiclass Learning Approaches: A Theoretical Comparison with Implications. NIPS 2012.
- A. Daniely, S. Sabato, S. Ben-David, and S. Shalev-Shwartz. Multiclass Learnability and the ERM Principle. COLT 2011.
Learning in Graphical Models (Bayes Nets):