Computer Science Thesis Proposal
- Gates Hillman Centers
- TRAVIS DICK
- Ph.D. Student
- Computer Science Department
- Carnegie Mellon University
Machine Learning: Social Values, Data Efficiency, and Beyond Prediction
This thesis builds on the theory and practice of Machine Learning to accommodate several modern requirements of learning systems. We focus on requirements stemming from three sources: making the best use of available data, learning problems beyond standard prediction, and incorporating social values.
Label-efficient Multi-class Learning: Large-scale multi-class learning tasks with an abundance of unlabeled data are ubiquitous in machine learning. The first chapter of my thesis focuses on theory for large-scale multi-class learning with limited labeled data. We begin by assuming that a given supervised learning algorithm would succeed at the learning task if it had access to labeled data. Then we use the implicit assumptions made by that algorithm to show that different label-efficient algorithms will also succeed.
Beyond Standard Prediction Problems: While most machine learning focuses on making predictions, there are learning problems where the output of the learner is not a prediction rule. We focus on data-driven algorithm configuration, where the goal is to find the best algorithm parameters for a specific application domain. We consider this problem in two new learning settings: the online setting, where problems are chosen by an adversary and arrive one at a time, and the private setting, where problems encode sensitive information. Algorithm configuration often reduces to maximizing a collection of piecewise Lipschitz functions. In both settings, optimization is impossible in the worst case. Our main contribution is a condition, called dispersion, that allows for meaningful regret bounds and utility guarantees. We also show that dispersion is satisfied for many problems under mild assumptions.
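To make the phrase "maximizing a collection of piecewise Lipschitz functions" concrete, here is a minimal illustrative sketch. It is not taken from the thesis. It assumes a greedy knapsack heuristic with a single tunable parameter rho that ranks items by value / size**rho. The function name greedy_knapsack_value and the toy instance are hypothetical. On a fixed problem instance, the packed value is a piecewise constant function of rho, with jumps wherever the item ordering changes.

```python
def greedy_knapsack_value(rho, values, sizes, capacity):
    """Total value packed by a greedy rule that ranks items by value / size**rho.

    Hypothetical helper for illustration only; not an algorithm from the thesis.
    """
    order = sorted(
        range(len(values)),
        key=lambda i: values[i] / sizes[i] ** rho,
        reverse=True,
    )
    total, remaining = 0.0, capacity
    for i in order:
        if sizes[i] <= remaining:  # pack the item only if it still fits
            total += values[i]
            remaining -= sizes[i]
    return total


# Hypothetical toy instance: three items and a knapsack of capacity 5.
values = [10.0, 7.0, 4.0]
sizes = [5.0, 3.0, 1.0]
capacity = 5.0

# Evaluate the utility on a grid of parameter values rho in [0, 3].
grid = [k / 10 for k in range(31)]
utilities = [greedy_knapsack_value(r, values, sizes, capacity) for r in grid]

# The utility takes only a few distinct values and jumps between them.
for rho, u in zip(grid, utilities):
    print(f"rho={rho:.1f}  utility={u}")
```

On this instance the utility stays flat, then jumps once as rho grows. Such discontinuities are what make worst-case optimization hard in the online and private settings.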
Social Values: Machine learning is becoming central to the infrastructure of our society. This creates exciting possibilities for profoundly positive impacts on our lives through improvements to, for example, medicine, communication, and transportation. Since these systems are often not explicitly designed with social values in mind, there is a risk that their adoption could result in undesired outcomes such as privacy violations or unfair treatment of individuals. The final chapter of my thesis focuses on principled techniques for incorporating two social values into machine learning algorithms: privacy and fairness.
Maria-Florina Balcan (Chair)
Yishay Mansour (Tel Aviv University)