Joint Artificial Intelligence / Theory Seminar

  • Ph.D. Student
  • Department of Computer Science
  • The University of British Columbia

Near-optimal sample complexity bounds for learning mixtures of Gaussians

Estimating distributions from observed data is a fundamental task in statistics that has been studied for over a century. We consider such a problem where the distribution is a mixture of k Gaussians in R^d. The objective is density estimation: given i.i.d. samples from the (unknown) distribution, produce a distribution whose total variation distance from the unknown distribution is at most epsilon. We prove that Theta(kd^2/epsilon^2) samples are necessary and sufficient for this task, suppressing logarithmic factors. This improves both the known upper bound and lower bound for this problem.

The AI Seminar is generously by Apple.

For More Information, Please Contact: