Projection Pursuit, Gaussian Scale Mixtures, and the EM Algorithm
The EM algorithm for fitting Gaussian mixture models is one of the
most widely used clustering methods. Yet, surprisingly little is known
about its behavior. Under what conditions does it converge to a good
local optimum? What are good ways to initialize it, and to merge/remove
intermediate clusters? Such questions are difficult to answer with the
mathematical tools that have traditionally been applied to EM.
I will describe an alternative, probabilistic way of analyzing EM.
This reveals, first of all, that many common methods of
initializing EM produce highly suboptimal results even in ideal
clustering scenarios. On the other hand, I'll show that a particular
variant of EM will provably recover a near-optimal clustering, provided
that the clusters are adequately separated and that their distributions
are of a certain fairly general form.
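As a concrete reference point, here is a minimal textbook EM for a mixture of spherical Gaussians, a sketch rather than the provable variant discussed in the talk. The function name, the farthest-first initialization, and all parameter choices are illustrative assumptions, not the speaker's algorithm.

```python
import numpy as np

def em_spherical_gmm(X, k, iters=50, seed=0):
    """Textbook EM for a mixture of k spherical Gaussians N(mu_j, var_j * I)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Farthest-first initialization: first center is a random point; each
    # subsequent center is the point farthest from all centers chosen so far.
    mu = [X[rng.integers(n)]]
    for _ in range(k - 1):
        dists = ((X[:, None, :] - np.array(mu)[None]) ** 2).sum(2).min(1)
        mu.append(X[np.argmax(dists)])
    mu = np.array(mu)
    var = np.ones(k)            # one scalar variance per component
    w = np.full(k, 1.0 / k)     # uniform mixing weights
    for _ in range(iters):
        # E-step: responsibilities from spherical-Gaussian log-densities.
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(2)          # (n, k) squared distances
        logp = np.log(w) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
        logp -= logp.max(1, keepdims=True)                     # stabilize before exp
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: re-estimate weights, means, and per-component variances.
        nk = r.sum(0)
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(2)
        var = (r * sq).sum(0) / (d * nk)
    return w, mu, var
```

On two well-separated spherical clusters this recovers the centers accurately; with the naive random-restart initializations the abstract criticizes, the outcome is far less reliable.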
The type of cluster distributions allowed is motivated by a new result
in projection pursuit, along the lines of the (somewhat mistaken)
folklore that "almost all projections of high-dimensional data look
Gaussian".
Specifically, I'll show that for any D-dimensional distribution with
finite second moments, there is a precise sense in which almost all of
its linear projections into d < D dimensions look like a
scale-mixture of spherical Gaussians (a mixture of spherical Gaussians
that share a common center but have different scales).
The extent of this effect depends upon the ratio of d to D, and upon
the "eccentricity" of the high-dimensional distribution.
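A quick numerical illustration of this folklore, not the theorem itself: take data whose coordinates are very non-Gaussian (uniform on a hypercube), project onto a single random direction, and check how Gaussian the projection looks via its skewness and excess kurtosis. All names and parameters here are assumptions made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D = 20000, 500
# Each coordinate is uniform on [0, 1] -- far from Gaussian marginally.
X = rng.random((n, D))
# A random unit direction in R^D; projecting gives d = 1 << D.
u = rng.normal(size=D)
u /= np.linalg.norm(u)
p = X @ u
# Standardize the projection and measure departure from Gaussianity.
z = (p - p.mean()) / p.std()
skew = (z ** 3).mean()            # 0 for a Gaussian
ex_kurt = (z ** 4).mean() - 3.0   # 0 for a Gaussian
```

Both statistics come out near their Gaussian values of zero, consistent with the almost-all-projections claim; as the abstract notes, the effect weakens as the ratio of d to D grows.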
Speaker Bio
Sanjoy Dasgupta is an Assistant Professor in the Department of
Computer Science and Engineering at UC San Diego. He received his PhD
from Berkeley, and spent two years at AT&T Research Labs before
joining UCSD.
His area of research is algorithmic statistics, with a focus on
unsupervised and minimally supervised learning. He has recently
completed a textbook, "Algorithms" (with Christos Papadimitriou and
Umesh Vazirani).