Nina Balcan
Title: A Theory of Similarity Functions for Learning and Clustering
Abstract:
Kernel methods have proven to be extremely powerful tools in machine
learning. They perform well in many applications, and there is also a
well-developed theory of sufficient conditions for a kernel to be useful
for a given learning problem. However, while a kernel can be thought of as
just a pairwise similarity function that satisfies additional mathematical
properties, this theory requires viewing kernels as implicit (and often
difficult to characterize) maps into high-dimensional spaces. In this work
we develop an alternative theory of learning with more general similarity
functions, which requires neither reference to implicit spaces, nor the
function to be positive semi-definite. Our results strictly generalize the
standard theory: any kernel that is good under the usual definition can be
shown to also be a good similarity function under our definition.
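To make the "no implicit spaces" idea concrete, here is a minimal sketch (an illustration in the spirit of the abstract, not the construction from the talk; all names and parameters are ours): map each example to its similarities with a handful of landmark points drawn from the data, then train an ordinary linear separator on that explicit representation. The similarity used here is not positive semi-definite, so it would not qualify as a kernel.

```python
import random

def similarity(x, y):
    # An arbitrary pairwise similarity; note it is NOT positive
    # semi-definite, so it is not a valid kernel.
    return -abs(x - y)

def landmark_features(points, landmarks, sim):
    # Explicit feature map: each point becomes the vector of its
    # similarities to the landmark points -- no implicit space needed.
    return [[sim(x, l) for l in landmarks] for x in points]

def perceptron(features, labels, epochs=50):
    # Train a plain perceptron on the explicit landmark features.
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != y:
                delta = y - pred  # +1 or -1
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
    return w, b

# Toy 1-D data: class 0 clustered near 0, class 1 near 10.
random.seed(0)
data = [random.gauss(0, 1) for _ in range(20)] + \
       [random.gauss(10, 1) for _ in range(20)]
labels = [0] * 20 + [1] * 20

landmarks = random.sample(data, 4)
features = landmark_features(data, landmarks, similarity)
w, b = perceptron(features, labels)

preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
         for x in features]
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

Because the similarity assigns well-separated classes clearly different landmark profiles, the linear separator in this small explicit space classifies the toy data accurately.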
We then show how our framework can also be applied to clustering:
multi-way classification from purely unlabeled data. In particular, using
this perspective we develop a new model that directly addresses the
fundamental question of what kind of information a clustering algorithm
needs in order to produce a highly accurate partition of the data. Our
work can be viewed as an approach to defining a PAC model for clustering
with non-interactive feedback.
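As a toy illustration of how a similarity function alone can drive clustering (a sketch under our own assumptions, not the model from the talk): if within-cluster pairs are more similar than cross-cluster pairs, a greedy single-linkage merge guided by the similarity recovers the natural partition from purely unlabeled data.

```python
def single_linkage(points, sim, k):
    # Start with singleton clusters; repeatedly merge the pair of
    # clusters with the highest cross-cluster similarity until k remain.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = max(sim(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Two well-separated groups; similarity is higher for closer points.
pts = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
clusters = single_linkage(pts, lambda a, b: -abs(a - b), k=2)
```

Here the within-group similarities dominate all cross-group ones, so the merges stay inside the true groups and the two-cluster partition is exactly recovered.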
This talk is based on joint work with Avrim Blum and Santosh Vempala.