Abstract:

In this talk, I present a new setting for testing properties of distributions in which samples come from several distributions, but only a few samples are available per distribution. The main motivation for considering this setting is that it captures how data is often collected in the real world. I briefly describe our testers for the following problems in this setting, given samples from s distributions p_1, p_2, ..., p_s:

(1) Uniformity Testing: Testing whether all the p_i’s are uniform or eps-far from being uniform in \ell_1-distance,

(2) Identity Testing: Testing whether all the p_i’s are equal to an explicitly given distribution q or eps-far from q in \ell_1-distance, and

(3) Closeness Testing: Testing whether all the p_i’s are equal to a distribution q to which we have sample access, or eps-far from q in \ell_1-distance.

Under an additional natural condition on the source distributions, we provide sample-optimal testers for all of these problems.

Joint work with Sandeep Silwal.