Features, kernels, and similarity functions

Avrim Blum


  Given a new learning problem, one of the first things you need to do is figure out what features you want to use. Alternatively, there has been substantial work on kernel functions, which provide implicit feature spaces, but then how do you pick the right kernel? In this talk I will survey some theoretical results that can provide some help, or at least guidance, for these tasks. In particular, I will talk about:
  • Algorithms designed to handle large feature spaces when it is expected that only a small number of the features will actually be useful (so you can pile a lot on when you don't know much about the domain).
  • Kernel functions. Can theory provide some guidance on selecting or designing a kernel function based on natural properties of your domain?
  • Combining the above. Can we use kernels to generate explicit features?
[In addition to the survey material, part of this talk will sneak in some joint work with Nina Balcan.]
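
To make the last question in the list above concrete, here is a minimal sketch of one common way to turn a kernel into explicit features: pick a few "landmark" points and describe every example by its kernel values against those landmarks. This is only an illustration, not necessarily the construction presented in the talk. The Gaussian kernel, the value of gamma, the random toy data, and the names rbf_kernel and landmark_features are all assumptions made for the example.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def landmark_features(points, landmarks, kernel=rbf_kernel):
    """Map each point x to the explicit vector (K(x, l_1), ..., K(x, l_d))."""
    return [[kernel(x, l) for l in landmarks] for x in points]


# Toy data: 20 random points in the plane (hypothetical, for illustration only).
rng = random.Random(0)
data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(20)]

# Use 5 of the data points as landmarks, chosen at random.
landmarks = rng.sample(data, 5)

# Each example is now an explicit 5-dimensional feature vector.
features = landmark_features(data, landmarks)
```

The resulting vectors can be passed to any learner that expects explicit features, including algorithms built for large feature spaces. How many landmarks are needed, and which properties of the kernel make this mapping useful, are the kind of questions theory can address.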

