Sparse boolean matrix multiplication, data mining, and hashing
Feb 9, 2011
ABSTRACT: Boolean matrix multiplication, i.e., multiplication over the (OR, AND) semiring, is known to exhibit deep connections to a number of fundamental computational problems. However, the sparse case where most input (and/or output) entries are 0 is less well-studied. In this talk I will argue that this special case is interesting because it lies at the heart of some fundamental problems in databases and data mining. I will then present the key theoretical ideas in three recent results (ICDT '09, RANDOM '10, IPDPS '11) that look into different aspects of this problem: New complexity upper bounds in RAM and I/O models, estimation of the sparseness of a matrix product, and parallel computation. The two last results exploit hashing techniques, including a variation of cuckoo hashing that allows efficient parallel set intersections.

Speaker: Rasmus Pagh, IT University of Copenhagen.
Joint work with Rasmus Resen Amossen and Andrea Campagna.