Discriminative Cluster Analysis

People

Abstract

Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method due to its ease of programming and its good trade-off between achieved performance and computational complexity. However, k-means is prone to local minima and does not scale well to high dimensional data sets. A common approach to clustering high dimensional data is to project it onto the space spanned by the principal components (PCs). However, the space of PCs does not necessarily improve the separability of the clusters. In this paper, we propose Discriminative Cluster Analysis (DCA), which clusters data in a low dimensional discriminative space that encourages cluster separability. DCA simultaneously performs dimensionality reduction and clustering, improving efficiency and clustering performance in comparison with generative approaches (e.g. PCA). We exemplify the benefits of DCA versus traditional PCA+k-means clustering through several synthetic and real examples. Additionally, we provide connections with other dimensionality reduction and clustering techniques such as spectral graph methods and linear discriminant analysis.
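The claim that the principal components need not improve cluster separability can be illustrated with a small toy example. The sketch below is not the DCA algorithm and is not taken from the paper; the data set, the helper names `principal_direction` and `fisher_ratio`, and the use of known class labels to score a direction are all assumptions made purely for illustration. Two elongated 2-D clusters are separated along the low-variance axis, so the first principal component points along the direction that mixes them.

```python
import math

def principal_direction(points):
    """Unit vector of the first principal component of 2-D points (closed form)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def fisher_ratio(class_a, class_b, direction):
    """Squared distance between projected class means over summed projected variances."""
    def project(points):
        return [p[0] * direction[0] + p[1] * direction[1] for p in points]

    def mean(values):
        return sum(values) / len(values)

    def variance(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    za, zb = project(class_a), project(class_b)
    return (mean(za) - mean(zb)) ** 2 / (variance(za) + variance(zb))

# Hypothetical toy data: both clusters are spread widely along x and
# differ only in y (deterministic jitter keeps the variances non-zero).
class_a = [(i - 10.0, 1.0 + 0.1 * ((i % 3) - 1)) for i in range(21)]
class_b = [(i - 10.0, -1.0 + 0.1 * ((i % 3) - 1)) for i in range(21)]

pc1 = principal_direction(class_a + class_b)   # nearly the x-axis
ratio_pc = fisher_ratio(class_a, class_b, pc1)
ratio_disc = fisher_ratio(class_a, class_b, (0.0, 1.0))  # the y-axis

print("separability along first PC:     ", ratio_pc)
print("separability along discriminative axis:", ratio_disc)
```

In this toy setting the direction of largest variance carries almost no information about cluster membership, while the low-variance axis separates the clusters cleanly. The score here relies on known labels only to make the comparison visible; a clustering method does not have them, which is why the projection and the cluster assignment are estimated together in the approach described above.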

Citation

Fernando de la Torre and Takeo Kanade,
"Discriminative Cluster Analysis",
Theory and Novel Applications of Machine Learning. M. Er and Y. Zhou (Eds). January 2009.
[PDF] [Bibtex]

Acknowledgements and Funding

This research is supported by: MH R01 51435 from the National Institute of Mental Health, N000140010915 from the Naval Research Laboratory, the Department of the Interior National Business Center contract no. NBCHD030010, and SRI International subcontract no. 03-000211.

Copyright notice

Human Sensing Lab