The $\ell_p$-Subspace Sketch Problem in Small Dimensions
March 1, 2023 (GHC 8102)

Abstract: In the $\ell_p$-subspace sketch problem, we are given an $n \times d$ matrix $A$ with $n > d$, and asked to build a small-memory data structure $Q(A,\varepsilon)$ so that, for any query vector $x \in \mathbb{R}^d$, we can output a number in $(1 \pm \varepsilon) \|Ax\|_p^p$ given only $Q(A,\varepsilon)$. This problem is known to require $\Omega(d \varepsilon^{-2})$ bits of memory for $d = \Omega(\log(1/\varepsilon))$. However, for $d = o(\log(1/\varepsilon))$, no data structure lower bounds were known.
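To make the guarantee concrete, the following minimal sketch (not the data structure from the talk, just an illustration of the problem statement) compares the exact value $\|Ax\|_p^p = \sum_i |\langle a_i, x\rangle|^p$ with a naive uniform row-sampling estimator; such an estimator is unbiased but does not achieve the space bounds discussed here. The matrix sizes and sampling parameter below are illustrative choices.

```python
import random

def lp_cost(A, x, p):
    """Exact ||Ax||_p^p = sum_i |<a_i, x>|^p over the rows a_i of A."""
    return sum(abs(sum(a_ij * x_j for a_ij, x_j in zip(row, x))) ** p
               for row in A)

def sampled_estimate(A, x, p, m, seed=0):
    """Unbiased estimate of ||Ax||_p^p from m uniformly sampled rows.

    This is only a toy baseline: a valid subspace sketch must answer
    *every* query x within (1 +/- eps), which is the hard part.
    """
    rng = random.Random(seed)
    n = len(A)
    rows = [A[rng.randrange(n)] for _ in range(m)]
    # Rescale by n/m so the expectation matches the exact cost.
    return (n / m) * lp_cost(rows, x, p)

# Tiny example: n = 1000 Gaussian rows in d = 3, p = 2.
rng = random.Random(1)
A = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(1000)]
x = [1.0, -2.0, 0.5]
exact = lp_cost(A, x, 2)
est = sampled_estimate(A, x, 2, m=200)
```

With these parameters the sampled estimate typically lands within a few percent of the exact cost, but only for the one query tested; a sketch must certify this for all $x \in \mathbb{R}^d$ simultaneously.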

In this talk, we resolve the memory required to solve the $\ell_p$-subspace sketch problem for any constant $d$ and integer $p$, showing that it is $\tilde{\Omega}(\varepsilon^{-2(d-1)/(d+2p)})$ bits and $\tilde{O}(\varepsilon^{-2(d-1)/(d+2p)})$ words, where the $\tilde{O}(\cdot)$ notation hides $\mathrm{poly}(\log(1/\varepsilon))$ factors. This shows that, for any constant $d$, one can beat the $\Omega(\varepsilon^{-2})$ lower bound that holds for $d = \Omega(\log(1/\varepsilon))$. Our bounds extend to loss functions other than the $\ell_p$-norm, and they apply to point queries for SVMs with additive error, where we show an optimal bound for every constant $d$. This is a near-quadratic improvement over the previous lower bound of Andoni et al.

Based on joint work with Yi Li and David Woodruff.