Machine Learning / Duolingo Seminar
- Remote Access - Zoom
- Virtual Presentation - ET (Special Time)
- KAMALIKA CHAUDHURI
- Associate Professor
- Department of Computer Science and Engineering
- University of California, San Diego
Challenges in Reliable Machine Learning
As machine learning is increasingly deployed, there is a need for reliable and robust methods that go beyond simple test accuracy. In this talk, we will discuss two such challenges. The first is robustness to adversarial examples, which are small perturbations to legitimate test inputs that cause misclassification. The problem is currently formalized as one of ensuring that a classifier predicts the same label in a ball of radius r around data drawn from the underlying distribution; ensuring this, however, may result in a loss of accuracy as well as inadequate robustness. In this talk, we'll propose a different, adaptive formulation for robustness, and show that some common nonparametric classifiers satisfy our formulation in the large sample limit.
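To make the fixed-radius formulation above concrete, here is a minimal sketch of an empirical check: it samples points within distance r of an input and reports whether any of them changes the predicted label. The function names, the toy 1-nearest-neighbor classifier, and the two-point training set are all assumptions made for illustration; they are not taken from the talk. Random sampling can only refute robustness at a point, never certify it.

```python
import math
import random


def nearest_neighbor_label(x, train):
    """1-nearest-neighbor prediction; train is a list of (point, label) pairs."""
    return min(train, key=lambda pair: math.dist(x, pair[0]))[1]


def is_empirically_robust(predict, x, r, trials=1000, seed=0):
    """Return False if any sampled point within distance r of x changes the label.

    This is a sampling-based check, so True means "no counterexample found",
    not a proof of robustness.
    """
    rng = random.Random(seed)
    base = predict(x)
    for _ in range(trials):
        # Draw a random direction, then scale it to a random length in [0, r].
        direction = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        length = r * rng.random()
        candidate = [xi + length * d / norm for xi, d in zip(x, direction)]
        if predict(candidate) != base:
            return False
    return True


# Hypothetical toy data: two labeled points, decision boundary at x = 2.
train = [((0.0, 0.0), "a"), ((4.0, 0.0), "b")]


def predict(point):
    return nearest_neighbor_label(point, train)


far_from_boundary = is_empirically_robust(predict, (0.0, 0.0), r=1.0)
near_boundary = is_empirically_robust(predict, (1.5, 0.0), r=1.0)
```

In this toy setup the same radius is comfortably satisfied at one point and violated at another, which gives some intuition for why a single fixed r can be a poor fit across the whole distribution.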
The second problem is overfitting, to which many generative models are known to be prone. Motivated by privacy concerns, we formalize a form of overfitting that we call data-copying -- where the generative model memorizes and outputs training samples or small variations thereof. We provide a three-sample test for detecting data-copying, and study the performance of our test on several canonical models and datasets.
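As a rough illustration of how three samples (training, held-out, and generated) might be compared, the sketch below measures how often a generated point lies closer to the training set than a held-out point does. This is a simplified stand-in and not necessarily the test presented in the talk: the function names and toy data are assumptions, and it omits any significance calculation or partitioning of the input space.

```python
import math


def nn_distance(x, train):
    """Distance from x to its nearest training point."""
    return min(math.dist(x, t) for t in train)


def copying_score(train, held_out, generated):
    """Fraction of (generated, held-out) pairs in which the generated point
    is closer to the training set; ties count as half.

    A score near 0.5 suggests generated samples sit about as far from the
    training data as fresh samples do; a score near 1.0 suggests copying.
    """
    d_gen = [nn_distance(g, train) for g in generated]
    d_out = [nn_distance(h, train) for h in held_out]
    wins = 0.0
    for dg in d_gen:
        for do in d_out:
            if dg < do:
                wins += 1.0
            elif dg == do:
                wins += 0.5
    return wins / (len(d_gen) * len(d_out))


# Hypothetical toy data for illustration only.
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
held_out = [(0.5, 0.4), (1.6, 0.7), (0.3, 1.0)]
copied = [(0.01, 0.0), (1.0, 0.99), (2.0, 0.02)]  # near-duplicates of train

score_copied = copying_score(train, held_out, copied)
score_baseline = copying_score(train, held_out, held_out)
```

The pairwise comparison is the same idea that underlies a Mann-Whitney U statistic; the held-out sample acts as the reference for how close a genuinely new sample should be to the training data.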
Based on joint work with Robi Bhattacharjee, Casey Meehan and Sanjoy Dasgupta.
Kamalika Chaudhuri is currently an Associate Professor at the University of California, San Diego. She received a Bachelor of Technology degree in Computer Science and Engineering in 2002 from the Indian Institute of Technology, Kanpur, and a PhD in Computer Science from the University of California at Berkeley in 2007. After a postdoctoral stint at UCSD, she joined the CSE department at UC San Diego as an assistant professor in 2010. She received an NSF CAREER Award in 2013 and a Hellman Faculty Fellowship in 2012. She currently serves as an Associate Editor for the Journal of Privacy and Confidentiality and the SIMODS journal. In the past, she has served as a program co-chair for AISTATS 2019 and ICML 2019. Kamalika's research interests lie in the foundations of trustworthy machine learning -- or machine learning beyond accuracy, which includes problems such as learning from sensitive data while preserving privacy, learning under sampling bias, and learning in the presence of an adversary.
Zoom Participation. See announcement.