Generalization in Clustering with Unobserved Features
by Eyal Krupka and Naftali Tishby
The authors argue that when objects are characterized by many attributes, clustering them on the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, they show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. They prove a finite sample generalization theorem for this novel learning scheme that extends analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users, with movie ratings as attributes.
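The idea of clustering on a random subset of attributes and then checking what the clusters say about the unobserved ones can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the authors' method or analysis: the synthetic data with two latent groups, the plain k-means routine, the variance-based score, and the names `kmeans`, `mean_variance`, `within_cluster_variance`, and `demo` are all hypothetical.

```python
import random


def kmeans(points, k, iters=25, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_variance(rows):
    """Average per-feature population variance over the given rows."""
    n = len(rows)
    cols = list(zip(*rows))
    total = 0.0
    for col in cols:
        mu = sum(col) / n
        total += sum((x - mu) ** 2 for x in col) / n
    return total / len(cols)


def within_cluster_variance(rows, labels):
    """Size-weighted average of mean_variance inside each cluster."""
    n = len(rows)
    result = 0.0
    for c in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        result += len(members) / n * mean_variance(members)
    return result


def demo(n_objects=200, n_features=50, n_observed=10, seed=0):
    """Cluster on a random feature subset, then score the hidden features."""
    rng = random.Random(seed)
    # Assumed synthetic model: two latent groups with opposite feature means.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n_features)]
    data = []
    for i in range(n_objects):
        group = 1.0 if i % 2 == 0 else -1.0
        data.append([group * s + rng.gauss(0.0, 0.5) for s in signs])
    # Only a random subset of features is visible to the clustering step.
    observed = rng.sample(range(n_features), n_observed)
    observed_set = set(observed)
    hidden = [j for j in range(n_features) if j not in observed_set]
    labels = kmeans([[row[j] for j in observed] for row in data], k=2, seed=seed)
    hidden_rows = [[row[j] for j in hidden] for row in data]
    return within_cluster_variance(hidden_rows, labels), mean_variance(hidden_rows)


if __name__ == "__main__":
    within, total = demo()
    print(f"hidden-feature variance: within clusters {within:.3f}, overall {total:.3f}")
```

In this sketch, a within-cluster variance on the hidden features that is well below their overall variance indicates that clusters found from the observed subset also carry information about the attributes that were never seen. The paper's own quality measure and guarantees may differ from this variance-based score.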
Robust design of biological experiments
by P. Flaherty, M. I. Jordan and A. P. Arkin
The authors address the problem of robust, computationally efficient design of biological experiments. Classical optimal experiment design methods have not been widely adopted in biological practice, in part because the resulting designs can be very brittle if the nominal parameter estimates for the model are poor, and in part because of computational constraints. The paper presents a method for robust experiment design based on a semidefinite programming relaxation. The method is applied to the design of experiments for a complex calcium signal transduction pathway, where the parameter estimates obtained from the robust design are found to be better than those obtained from an "optimal" design.
A Bayesian Spatial Scan Statistic
by Daniel B. Neill, Andrew W. Moore, and Gregory F. Cooper
We propose a new Bayesian method for spatial cluster detection, the "Bayesian spatial scan statistic," and compare this method to the standard (frequentist) scan statistic approach. We demonstrate that the Bayesian statistic has several advantages over the frequentist approach, including increased power to detect clusters and (since randomization testing is unnecessary) much faster runtime. We evaluate the Bayesian and frequentist methods on the task of prospective disease surveillance: detecting spatial clusters of disease cases resulting from emerging disease outbreaks. We demonstrate that our Bayesian methods are successful in rapidly detecting outbreaks while keeping the number of false positives low.
Pradeep Ravikumar Last modified: Fri Sep 9 12:02:08 EDT 2005