In the early days of kernel machines research, the "kernel trick" was considered a useful way of constructing nonlinear algorithms from linear ones. More recently, however, it has become clear that a potentially more far-reaching use of kernels is as a linear way of dealing with higher order statistics, by embedding distributions in a suitable reproducing kernel Hilbert space (RKHS). Notably, unlike a straightforward expansion into higher order moments or a conventional characteristic function approach, embedding in an RKHS provides a painless, tractable way of representing entire distributions.
This line of reasoning leads naturally to several questions: What does it mean to embed a distribution in an RKHS? When is this embedding injective (and thus, when do different distributions have unique mappings)? What are the implications for learning algorithms that make use of these embeddings? This talk aims to answer these questions.
Topics will include:
- Introduction to distribution embeddings; Maximum Mean Discrepancy (MMD) as a metric on distributions
- MMD as a measure of statistical dependence, and the Hilbert-Schmidt Independence Criterion (HSIC)
- Characteristic kernels and injective embeddings in reproducing kernel Hilbert spaces
- Applications of MMD to feature selection and unsupervised taxonomy discovery
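To give a concrete flavor of the first topic, the following is a minimal sketch (not from the talk itself) of a biased squared-MMD estimate between two samples, using a Gaussian kernel; the function names and the bandwidth choice are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel values k(a, b) = exp(-||a-b||^2 / (2 sigma^2)),
    # computed via broadcasting over all pairs of rows in A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2_biased(X, Y, sigma=1.0):
    # Biased (V-statistic) estimate of squared MMD:
    # mean k(x, x') + mean k(y, y') - 2 mean k(x, y).
    # It is nonnegative and close to zero when X and Y come from
    # the same distribution.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

# Illustration: two samples from the same distribution versus
# samples from clearly different distributions.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 1))
Y = rng.normal(0.0, 1.0, size=(500, 1))  # same law as X
Z = rng.normal(3.0, 1.0, size=(500, 1))  # shifted mean

same = mmd2_biased(X, Y)  # small
diff = mmd2_biased(X, Z)  # substantially larger
```

With a characteristic kernel such as the Gaussian, the population MMD is zero if and only if the two distributions coincide, which is what makes the embedding injective.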
Venue, Date, and Time
Venue: Wean Hall 4615A
Date: Monday, March 23, 2009
Time: 12:00 noon