Machine Learning / Duolingo Seminar
- Newell-Simon 4305 and Zoom
- In Person and Virtual Presentation - ET
- PREETUM NAKKIRAN
- Postdoctoral Fellow
- NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning
- University of California, San Diego
The Deep Bootstrap Framework: Rethinking Generalization to Understand Deep Learning
We propose a new framework for understanding generalization in deep learning. In the first part, I will elaborate on what "understanding generalization" means, and describe my scientific methodology (which involves conjectures and experiments, in place of theorems and proofs).
In the second part, I will describe our results. The core idea in our framework is to couple the Real World, where samples are re-used in multiple epochs, to an "Ideal World", where we see fresh samples in every batch. It turns out that in practice, the gap between these two worlds is often universally small. This reduces the problem of *generalization* in offline learning to the problem of *optimization* in online learning. Our results imply that a good deep learning method is one which (1) optimizes quickly on the population loss, and (2) optimizes slowly on the empirical loss.
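As a toy illustration of the coupling described above (not the experiments from the talk), the following sketch runs the same SGD procedure twice on noisy linear regression: once re-using a fixed training set across epochs (Real World), and once drawing fresh samples for every batch (Ideal World). The model, the function name `population_loss_curve`, and all hyperparameters are assumptions chosen for brevity; deep networks on real data may behave differently.

```python
import random

def population_loss_curve(ideal_world, n_train=100, dim=5, steps=300,
                          batch=10, lr=0.05, noise=0.5, seed=0):
    """Run minibatch SGD on noisy linear regression; return population loss per step."""
    rng = random.Random(seed)
    w_true = [rng.gauss(0, 1) for _ in range(dim)]  # hypothetical ground-truth weights

    def sample():
        # One labeled example: x ~ N(0, I), y = <x, w_true> + Gaussian noise.
        x = [rng.gauss(0, 1) for _ in range(dim)]
        y = sum(a * b for a, b in zip(x, w_true)) + noise * rng.gauss(0, 1)
        return x, y

    train = [sample() for _ in range(n_train)]  # fixed training set (used in the Real World)
    w = [0.0] * dim
    curve = []
    for step in range(steps):
        if ideal_world:
            # Ideal World: fresh samples in every batch.
            minibatch = [sample() for _ in range(batch)]
        else:
            # Real World: cycle through the same samples, epoch after epoch (no shuffling).
            start = (step * batch) % n_train
            minibatch = train[start:start + batch]
        # Gradient of half the mean squared error on the minibatch.
        grad = [0.0] * dim
        for x, y in minibatch:
            err = sum(a * b for a, b in zip(x, w)) - y
            for j in range(dim):
                grad[j] += err * x[j] / batch
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
        # For x ~ N(0, I), the population squared error is ||w - w_true||^2 + noise^2.
        curve.append(sum((a - b) ** 2 for a, b in zip(w, w_true)) + noise ** 2)
    return curve

real = population_loss_curve(ideal_world=False)
ideal = population_loss_curve(ideal_world=True)
# Gap between the two worlds at each step, measured on the population loss.
gap = [r - i for r, i in zip(real, ideal)]
print(f"final population loss: real={real[-1]:.4f}, ideal={ideal[-1]:.4f}, gap={gap[-1]:.4f}")
```

Here `gap` tracks the difference between the two worlds at each step. When it stays small, the Real World population loss is approximately the Ideal World one, which depends only on how fast online optimization proceeds.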
We can use this to gain a new optimization perspective on many phenomena and design choices in deep learning. For example, CNNs often generalize better than MLPs on image distributions in the Real World, but this is "because" they optimize faster on the population loss in the Ideal World. We can similarly give optimization perspectives on the effects of pre-training, data augmentation, regularization, learning rate, etc. We thus hope our framework encourages researchers to consider generalization through the (perhaps simpler) lens of optimization.
Based on joint work with Behnam Neyshabur and Hanie Sedghi.
Preetum Nakkiran is a postdoc at UCSD, hosted by Misha Belkin, and part of the NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning. He recently completed his PhD at Harvard, advised by Boaz Barak and Madhu Sudan. He did his undergraduate work in EECS at UC Berkeley. He has also spent time in industry research labs, at OpenAI and Google.
Preetum's research interest is understanding machine learning, especially deep learning, through theory and experiments. His past results include work on fine-grained generalization, double descent, and adversarial examples. Earlier, he worked on error-correcting codes and compression. He is a past recipient of the NSF GRFP and a Google PhD Fellowship.
The ML Seminar is generously sponsored by Duolingo.
In Person and Zoom Participation. See announcement.