Deciding high-dimensional sub-Gaussian-ness in polynomial time
December 3, 2025 (GHC 8102)

Given samples from a probability distribution, can efficient

algorithms tell whether the distribution has heavy or light tails?

This problem is at the core of algorithmic statistics, where

algorithms for deciding heavy-versus-light tailed-ness are key

subroutines for clustering, learning in the presence of adversarial

outliers, and more. It is easy in one dimension but challenging in

high dimensions, where a distribution can have light tails in some

directions and heavy ones in others -- detecting a single direction

with a heavy tail hiding in an otherwise light-tailed distribution can

seemingly require brute-force search. In this talk, I describe a

family of efficient algorithms for deciding whether a high-dimensional

probability distribution has sub-Gaussian tails, with applications to

a wide range of high-dimensional learning tasks using sub-Gaussian

data.

Based on joint work with Ilias Diakonikolas, Ankit Pensia, and Stefan

Tiegel, appearing in STOC 2025