- You could choose to do a small project (if you prefer the homework-oriented grading scheme): this might involve conducting a small experiment, or reading a couple of papers and presenting their main ideas. The end result should be a 3-5 page report.
- Alternatively, you could choose to do a larger project (if you prefer the project-oriented grading scheme): this might involve conducting a larger, novel experiment; working on a concrete open theoretical question; thinking about how to formalize an interesting new topic; or trying to relate several problems. The end result should be a 10-15 page report.
- We will also have a day for project presentations (format TBD).

**Learning with Noise and Agnostic Learning**
- A. Daniely. A PTAS for Agnostically Learning Halfspaces. COLT 2015.
- U. Feige, Y. Mansour, R. Schapire. Learning and inference in the presence of corrupted inputs. COLT 2015.
- P. Awasthi, M.F. Balcan and P. Long. The Power of Localization for Efficiently Learning Linear Separators with Noise. STOC 2014.
- P. M. Long and R. A. Servedio. Learning large-margin halfspaces with more malicious noise. NIPS 2011.
- P. Awasthi, A. Blum, and O. Sheffet. Improved Guarantees for Agnostic Learning of Disjunctions. COLT 2010.
- A. Kalai, A. Klivans, Y. Mansour, and R. Servedio. Agnostically Learning Halfspaces. FOCS 2005.
- W. S. Lee, P. L. Bartlett, and R. C. Williamson. Efficient agnostic learning of neural networks with bounded fan-in. IEEE Trans Info Theory, 1996.
- M. Kearns and M. Li. Learning in the Presence of Malicious Errors. STOC 1988.

**Machine Learning, Game Theory, and Mechanism Design**
- K. Amin, R. Cummings, L. Dworkin, M. Kearns, and A. Roth. Online Learning and Profit Maximization from Revealed Preferences. AAAI 2015.
- J. Morgenstern and T. Roughgarden. The Pseudo-Dimension of Nearly-Optimal Auctions. NIPS 2015.
- M.F. Balcan, A. Daniely, R. Mehta, R. Urner, and V. Vazirani. Learning Economic Parameters from Revealed Preferences. WINE 2014.
- R. Cole and T. Roughgarden. The Sample Complexity of Revenue Maximization. STOC 2014.
- M.F. Balcan and N. Harvey. Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
- M.F. Balcan, E. Blais, A. Blum, and L. Yang. Active Property Testing. FOCS 2012.
- Y. Chen and J. Wortman Vaughan. Connections Between Markets and Learning. ACM SIGecom Exchanges 2010.
- M.F. Balcan, A. Blum, J. Hartline, and Y. Mansour. Mechanism Design via Machine Learning. FOCS 2005.
- A. Blum and Y. Mansour. Learning, Regret Minimization, and Equilibria. Book chapter in Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay Vazirani, eds.

**Distributed Learning**
- Y. Arjevani and O. Shamir. Communication Complexity of Distributed Convex Learning and Optimization. NIPS 2015.
- Y. Zhang, M. Wainwright and M. Jordan. Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds. ICML 2015.
- O. Shamir, N. Srebro and T. Zhang. Communication Efficient Distributed Optimization using an Approximate Newton-type Method. ICML 2014.
- J.C. Duchi, M. Wainwright, and Y. Zhang. Communication-Efficient Algorithms for Statistical Optimization. JMLR 2013.
- M.F. Balcan, S. Ehrlich, and Y. Liang. Distributed Clustering on Graphs. NIPS 2013.
- M.F. Balcan, A. Blum, S. Fine, and Y. Mansour. Distributed Learning, Communication Complexity and Privacy. COLT 2012.

**Semi-Supervised Learning**
- A. Balsubramani and Y. Freund. Optimally Combining Classifiers Using Unlabeled Data. COLT 2015.
- M.F. Balcan, A. Blum, and Y. Mansour. Exploiting Ontology Structures and Unlabeled Data for Learning. ICML 2013.
- Z. Karnin, E. Liberty, S. Lovett, R. Schwartz, and O. Weinstein. Unsupervised SVMs: On the Complexity of the Furthest Hyperplane Problem. COLT 2012.
- M.F. Balcan and A. Blum. A Discriminative Model for Semi-Supervised Learning. Journal of the ACM, 2010.
- A. Carlson, J. Betteridge, R. C. Wang, E. R. Hruschka Jr., and T. M. Mitchell. Coupled Semi-Supervised Learning for Information Extraction. WSDM 2010.
- X. Zhu. Semi-Supervised Learning. Encyclopedia of Machine Learning.
- L. Xu, M. White, and D. Schuurmans. Optimal Reverse Prediction. ICML 2009.
- X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. ICML 2003.

**Active Learning**
- K. Chaudhuri, S. Kakade, P. Netrapalli and S. Sanghavi. Convergence Rates of Active Learning for Maximum Likelihood Estimation. NIPS 2015.
- G. Dasarathy, R. Nowak and X. Zhu. S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification. COLT 2015.
- T.-K. Huang, A. Agarwal, D. Hsu, J. Langford, and R. Schapire, Efficient and Parsimonious Agnostic Active Learning. NIPS 2015.
- S. Hanneke. Theory of Disagreement-Based Active Learning. Foundations and Trends in Machine Learning, Vol. 7 (2-3), pp. 131-309.
- M.F. Balcan and P. Long. Active and Passive Learning of Linear Separators under Log-concave Distributions. COLT 2013.
- M.F. Balcan and S. Hanneke. Robust Interactive Learning. COLT 2012.
- M.F. Balcan, A. Beygelzimer, J. Langford. Agnostic active learning. JCSS 2009 (originally in ICML 2006).
- A. Beygelzimer, S. Dasgupta, and J. Langford. Importance-weighted active learning. ICML 2009.
- V. Koltchinskii. Rademacher Complexities and Bounding the Excess Risk in Active Learning. Journal of Machine Learning Research 2010.
- S. Hanneke. Rates of Convergence in Active Learning. The Annals of Statistics 2011.
- See also the ICML 2015 Workshop on Advances in Active Learning - Bridging Theory and Practice.

**Sample Complexity and Generalization Bounds**
- H. Simon. An Almost Optimal PAC Algorithm. COLT 2015.
- H. Simon and S. Zilles. Open Problem: Recursive Teaching Dimension Versus VC Dimension. COLT 2015 (open problems track).
- A. Ambroladze, E. Parrado-Hernandez, and J. Shawe-Taylor. Tighter PAC-Bayes Bounds. NIPS 2006.
- D. McAllester. Simplified PAC-Bayesian Margin Bounds. COLT 2003.
- S. Mendelson and P. Philips. On the Importance of Small Coordinate Projections. Journal of Machine Learning Research 2004.
- J. Langford and D. McAllester. Computable Shell Decomposition Bounds. COLT 2000.

**Online Learning and Bandits**
- O. Dekel, J. Ding, T. Koren, and Y. Peres. Bandits with switching costs: T^(2/3) regret. STOC 2014.
- U. Feige, T. Koren, and M. Tennenholtz. Chasing Ghosts: Competing with Stateful Policies. FOCS 2014.
- Y. Mansour, A. Slivkins, and V. Syrgkanis. Bayesian Incentive-Compatible Bandit Exploration. EC 2015.
- A. Rakhlin and K. Sridharan. Online Learning with Predictable Sequences. COLT 2013.
- A. Rakhlin and K. Sridharan. Optimization, Learning, and Games with Predictable Sequences. NIPS 2013.

**Clustering**
- J. Eldridge, M. Belkin, Y. Wang. Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering. COLT 2015.
- P. Awasthi and M.F. Balcan. Center Based Clustering: A Foundational Perspective. Book Chapter in Handbook of Cluster Analysis, 2015.
- Y. Li, K. He, D. Bindel, and J.E. Hopcroft. Uncovering the Small Community Structure in Large Networks: A Local Spectral Approach. WWW 2015.
- M. Ackerman and S. Dasgupta. Incremental clustering: the case for extra clusters. NIPS 2014.
- M.F. Balcan, C. Borgs, M. Braverman, J. Chayes, and S. Teng. Finding Endogenously Formed Communities. SODA 2013.
- D. Hsu and S. M. Kakade. Learning Mixtures of Spherical Gaussians: Moment Methods and Spectral Decompositions. ITCS 2013.
- S. Vassilvitskii and S. Venkatasubramanian. New Developments In The Theory Of Clustering. Tutorial at KDD 2010.
- M.F. Balcan and P. Gupta. Robust Hierarchical Clustering. COLT 2010.
- M.F. Balcan, A. Blum, and S. Vempala. A Discriminative Framework for Clustering via Similarity Functions. STOC 2008. See also full version.

**Hardness of Learning**
- A. Daniely, N. Linial, and S. Shalev-Shwartz. From average case complexity to improper learning complexity. STOC 2014. See also related 1, related 2, related 3.
- S. Khot and R. Saket. On Hardness of Learning Intersection of Two Halfspaces. JCSS 2011. See also related 1.

**Multiclass Learning**
- A. Daniely and S. Shalev-Shwartz. Optimal Learners for Multiclass Problems. COLT 2014.
- A. Daniely, S. Sabato, and S. Shalev-Shwartz. Multiclass Learning Approaches: A Theoretical Comparison with Implications. NIPS 2012.
- A. Daniely, S. Sabato, S. Ben-David, and S. Shalev-Shwartz. Multiclass Learnability and the ERM Principle. COLT 2011.

**Convex Surrogate Losses and Consistency**
- P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe. Convexity, Classification, and Risk Bounds. Journal of the American Statistical Association, 2006.
- T. Zhang. Statistical Behavior and Consistency of Classification Methods Based on Convex Risk Minimization. The Annals of Statistics, 2004.
- I. Steinwart. How to Compare Different Loss Functions and Their Risks. Constructive Approximation, 2007.

**Boosting**
- A. Beygelzimer, S. Kale and H. Luo. Optimal and Adaptive Algorithms for Online Boosting. ICML 2015.
- I. Mukherjee and R. E. Schapire. A Theory of Multiclass Boosting. Journal of Machine Learning Research 2013.
- P. M. Long and R. A. Servedio. Random Classification Noise Defeats All Convex Potential Boosters. Machine Learning, 2010.
- V. Feldman. Distribution-Specific Agnostic Boosting. Innovations in Computer Science (ICS), 2010.
- A. Tauman Kalai and R. Servedio. Boosting in the Presence of Noise. JCSS 2005.
- S. Dasgupta and P. M. Long. Boosting with diverse base classifiers. COLT 2003.

**Kernels and Similarity Functions**
- M. Warmuth and S. V. N. Vishwanathan. Leaving the Span. COLT 2005.
- M. F. Balcan, A. Blum, and N. Srebro. A Theory of Learning with Similarity Functions. Machine Learning Journal, 2008.
- N. Srebro and S. Ben-David. Learning Bounds for Support Vector Machines with Learned Kernels. COLT 2006.
- G. Lanckriet, N. Cristianini, P. Bartlett, and L. El Ghaoui. Learning the Kernel Matrix with Semidefinite Programming. Journal of Machine Learning Research 2004.

**Learning in Graphical Models (Bayes Nets)**