Tuesday, Nov 5th, 2019. 12:00 PM. NSH 3305

Vaishnavh Nagarajan -- Uniform Convergence May Be Unable to Explain Generalization in Deep Learning

Abstract: In this talk, I will present our work that casts doubt on the ongoing pursuit of using uniform convergence to explain generalization in deep learning.

Over the last couple of years, research in deep learning theory has focused on developing newer and more refined generalization bounds (using Rademacher complexity, covering numbers, PAC-Bayes, etc.) to help us understand why overparameterized deep networks generalize well. Although these bounds look quite different on the surface, they are essentially 'implementations' of a single learning-theoretic technique called uniform convergence.
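For concreteness, here is a minimal sketch of the common template such bounds follow, in standard learning-theoretic notation (the notation is ours, not taken from the talk): with probability at least $1-\delta$ over a training set $S$ of $m$ i.i.d. samples, every hypothesis $h$ in the class $\mathcal{H}$ satisfies

$$
L_{\mathcal{D}}(h) \;\le\; \hat{L}_S(h) \;+\; O\!\left(\sqrt{\frac{\mathrm{comp}(\mathcal{H}) + \ln(1/\delta)}{m}}\right),
$$

where $L_{\mathcal{D}}$ is the population risk, $\hat{L}_S$ is the empirical risk on $S$, and $\mathrm{comp}(\mathcal{H})$ is the term each approach instantiates differently, e.g., via the Rademacher complexity of $\mathcal{H}$, the logarithm of a covering number, or a PAC-Bayesian KL-divergence term.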

While it is well known that many of these existing bounds are numerically large, we first bring to light, through a variety of experiments, another crucial and more concerning aspect of these bounds: in practice, they can increase with the size of the training set. Guided by these observations, we then present specific scenarios where uniform convergence provably fails to explain generalization in deep learning. That is, in these scenarios, even though a deep network trained by stochastic gradient descent (SGD) generalizes well, any uniform convergence bound would be vacuous, no matter how carefully it is applied.
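Schematically, and again in our own notation rather than the talk's, the negative result concerns bounds of the form

$$
L_{\mathcal{D}}(h_S) \;\le\; \hat{L}_S(h_S) \;+\; \sup_{h \in \mathcal{H}_{\mathrm{SGD}}} \bigl| L_{\mathcal{D}}(h) - \hat{L}_S(h) \bigr|,
$$

where $h_S$ is the network SGD outputs on training set $S$ and $\mathcal{H}_{\mathrm{SGD}}$ may even be restricted to the networks SGD actually outputs on typical training sets, the tightest class a uniform convergence argument could use. In the scenarios referred to above, $\hat{L}_S(h_S) \approx 0$ and $L_{\mathcal{D}}(h_S)$ is small, yet the supremum term remains close to 1, so the resulting bound on the test error is vacuous.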

Through our work, we call for going beyond uniform convergence to explain generalization in deep learning.

This is joint work with Zico Kolter.