The Johnson-Lindenstrauss Transform Itself
Preserves Differential Privacy

Jeremiah Blocki, Avrim Blum, Anupam Datta and Or Sheffet
[Paper]

Abstract

This paper proves that an "old dog", namely the classical Johnson-Lindenstrauss transform, "performs new tricks": it gives a novel way of preserving differential privacy. We show that if we take two databases, D and D', such that (i) D' - D is a rank-1 matrix of bounded norm and (ii) all singular values of D and D' are sufficiently large, then multiplying either D or D' by a vector of i.i.d. normal Gaussians yields two distributions that are statistically close in the sense of differential privacy. Furthermore, a small, deterministic and public alteration of the input suffices to ensure that all singular values of D are large.
We apply the Johnson-Lindenstrauss transform to the task of approximating cut-queries: the number of edges crossing an (S, \bar S)-cut in a graph. We show that the JL transform allows us to publish a sanitized graph that preserves edge differential privacy (where two graphs are neighbors if they differ on a single edge) while adding only O(|S|/\epsilon) random noise to any given query (w.h.p.). Comparing the additive noise of our algorithm to that of existing algorithms for answering cut-queries in a differentially private manner, we outperform all of them on small cuts (|S| = o(n)).
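
As an illustration of the cut-query mechanism, here is a minimal numpy sketch: every vertex pair receives a small public weight w/n (one plausible reading of the paper's deterministic alteration), the reweighted edge-by-vertex matrix is multiplied by an i.i.d. Gaussian matrix, and cut-queries are answered from the published sketch alone. The parameter values, the exact reweighting formula, and the debiasing step are illustrative assumptions; the paper calibrates r and w to \epsilon and \delta.

    import numpy as np

    rng = np.random.default_rng(0)

    def sanitize_graph(adj, r, w):
        # adj: symmetric 0/1 adjacency matrix.  r (sketch dimension) and w
        # (public weight parameter) are placeholders; the paper derives
        # both from the privacy parameters epsilon and delta.
        n = adj.shape[0]
        rows = []
        for u in range(n):
            for v in range(u + 1, n):
                # Deterministic, public reweighting: every vertex pair gets
                # weight at least w/n, so all singular values of the edge
                # matrix are large (one plausible reading of the paper's
                # "small, deterministic and public alteration").
                weight = w / n + (1.0 - w / n) * adj[u, v]
                row = np.zeros(n)
                row[u], row[v] = np.sqrt(weight), -np.sqrt(weight)
                rows.append(row)
        E = np.stack(rows)                    # edge-by-vertex matrix
        M = rng.normal(size=(r, E.shape[0]))  # i.i.d. N(0,1) JL matrix
        return (M @ E) / np.sqrt(r)           # the published sketch

    def cut_query(sketch, S, n, w):
        # ||sketch @ 1_S||^2 concentrates around the reweighted cut value;
        # since the reweighting is public, we can subtract it back out.
        x = np.zeros(n)
        x[list(S)] = 1.0
        reweighted = np.linalg.norm(sketch @ x) ** 2
        crossing_pairs = len(S) * (n - len(S))
        return (reweighted - (w / n) * crossing_pairs) / (1.0 - w / n)

    # Toy usage: compare against the true cut on a random graph.
    n, w = 40, 4.0
    adj = np.triu((rng.random((n, n)) < 0.3).astype(float), 1)
    adj = adj + adj.T
    sketch = sanitize_graph(adj, r=800, w=w)
    S = set(range(n // 2))
    true_cut = sum(adj[u, v] for u in S for v in range(n) if v not in S)
    print(true_cut, cut_query(sketch, S, n, w))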
We also apply our technique to the task of estimating the variance of a given matrix in any given direction. The JL transform allows us to publish a sanitized covariance matrix that preserves differential privacy w.r.t. bounded changes (each row of the matrix may change by a vector of norm at most 1) while adding random noise whose magnitude is independent of the size of the matrix (w.h.p.). In contrast, existing algorithms introduce an error that depends on the matrix dimensions.
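
The directional-variance mechanism admits an equally short sketch. The code below is a hedged illustration rather than the paper's exact algorithm: it omits the singular-value precondition (or the public alteration that enforces it) and uses a placeholder sketch dimension r.

    import numpy as np

    rng = np.random.default_rng(1)

    def sanitize_covariance(A, r):
        # A has one row per record; r is a placeholder sketch dimension.
        # The paper additionally requires the singular values of A to be
        # large (or applies the public alteration first); that step is
        # omitted here.
        M = rng.normal(size=(r, A.shape[0])) / np.sqrt(r)
        return M @ A  # publish this; B.T @ B approximates A.T @ A

    def variance_query(B, x):
        # Variance of the data in direction x, answered from the sketch.
        return float(np.linalg.norm(B @ x) ** 2)

    # Toy usage.
    A = rng.normal(size=(1000, 20))
    B = sanitize_covariance(A, r=400)
    x = rng.normal(size=20)
    x /= np.linalg.norm(x)
    print(float(np.linalg.norm(A @ x) ** 2), variance_query(B, x))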


Errata

A few corrections to previous versions:
  1. \epsilon was unnecessarily squared in the introduction.
  2. Added a footnote to the introduction.
  3. The final step in the proof of Claim 4.3 (or Claim B.1) was revised and is now phrased more clearly.

Personal Comments

Though it is not stated explicitly in the paper, it is straightforward to see from the proof of Theorem 4.1 (and Claim 4.3) that our technique gives privacy guarantees under instance-dependent noise. Specifically, if the input matrix A has the property that all of its singular values are greater than some w = \tilde O(1/\epsilon_0), and you are willing to expose this fact to the world, you can simply apply the JL transform to A itself and release the output matrix. In this case, the JL theorem applies to ||MAx||^2, allowing us to answer the query x with only a multiplicative error. Furthermore, if all the singular values are greater than (say) 10w, you can release the least singular value plus Laplace noise without compromising privacy.
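
A minimal sketch of this instance-dependent variant is below. The threshold check, sketch dimension r, and Laplace scale 1/\epsilon_0 are assumed calibrations for illustration, not the paper's exact constants.

    import numpy as np

    rng = np.random.default_rng(2)

    def instance_dependent_release(A, w, r, eps0):
        # w ~ tilde-O(1/eps0) is the singular-value threshold discussed
        # above; r is a placeholder sketch dimension.  All constants here
        # are illustrative.
        sigma_min = np.linalg.svd(A, compute_uv=False).min()
        if sigma_min <= w:
            return None  # fall back to publicly altering the input first
        M = rng.normal(size=(r, A.shape[0])) / np.sqrt(r)
        released = {"sketch": M @ A}  # answer query x as ||sketch @ x||^2
        if sigma_min > 10 * w:
            # Safe to also publish the least singular value itself, with
            # Laplace noise (scale 1/eps0 is an assumed calibration).
            released["sigma_min"] = sigma_min + rng.laplace(scale=1.0 / eps0)
        return released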