Generative modeling, a core problem in unsupervised learning, aims at understanding data by learning a model that can generate datapoints resembling those drawn from the real-world distribution. Generative Adversarial Networks (GANs) are an increasingly popular framework that solves this problem by optimizing two deep networks, a "discriminator" and a "generator", in tandem.
However, this complex optimization procedure is still poorly understood. More specifically, it was not known whether equilibrium points of this system are "locally asymptotically stable", i.e., whether the optimization procedure converges to an equilibrium point when initialized sufficiently close to it. In this work, we analyze the "gradient descent" form of GAN optimization (i.e., the setting where we simultaneously take small gradient steps in both generator and discriminator parameters). We show that, although GAN optimization does not correspond to a convex-concave game even for simple parameterizations, its equilibrium points are still locally asymptotically stable under proper conditions. On the other hand, we show that for the recently proposed Wasserstein GAN (WGAN), the optimization procedure might cycle around an equilibrium point without ever converging to it. Finally, motivated by this stability analysis, we propose an additional regularization term for GAN updates, which can guarantee local stability for both the WGAN and the traditional GAN. Our regularizer also shows practical promise in speeding up convergence and in addressing a well-known failure mode in GANs called mode collapse.
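As a rough illustration of the cycling behavior and of how a regularizer can restore local stability, consider the following minimal sketch. It is a toy built on assumptions that the abstract does not state: a one-parameter "discriminator" `w` and a one-parameter "generator" `a` playing the bilinear game V(w, a) = w * a (equilibrium at the origin), and a regularizer that adds `gamma` times the squared discriminator gradient to the generator's objective. The function name `simulate`, the step size, and the value of `gamma` are hypothetical choices for illustration only.

```python
import math


def simulate(steps, lr, gamma, w=1.0, a=1.0):
    """Simultaneous gradient steps on the toy game V(w, a) = w * a.

    The discriminator parameter w ascends V; the generator parameter a
    descends V + gamma * (dV/dw)**2 (assumed regularizer, see lead-in).
    Returns the final distance of (w, a) from the equilibrium (0, 0).
    """
    for _ in range(steps):
        grad_w = a                       # dV/dw
        grad_a = w + 2.0 * gamma * a     # d/da [w*a + gamma * a**2]
        # Both players update at once, using the old values of w and a.
        w, a = w + lr * grad_w, a - lr * grad_a
    return math.hypot(w, a)


if __name__ == "__main__":
    start = math.hypot(1.0, 1.0)
    plain = simulate(steps=5000, lr=0.01, gamma=0.0)
    regularized = simulate(steps=5000, lr=0.01, gamma=0.5)
    print("initial distance:     ", start)
    print("without regularizer:  ", plain)
    print("with regularizer:     ", regularized)
```

In this toy, the unregularized iterates keep circling the equilibrium and never get closer to it, whereas the regularized iterates spiral inward. This only mirrors the qualitative claim above; it is not the analysis or the experimental setup of the work itself.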
Presented in Partial Fulfillment of the CSD Speaking Skills Requirement
The AI Seminar is generously sponsored by Apple.