In dictionary learning, also known as sparse coding, we are given samples y of the form y = Ax, where x is an unknown random sparse vector and A is an unknown n \times m dictionary matrix (usually m > n, the overcomplete case). The goal is to learn A and x. This problem has been studied in neuroscience, machine learning, computer vision, and image processing. In practice it is solved by heuristic algorithms, and provable algorithms seemed hard to find.
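The generative model above can be sketched in a few lines of NumPy. All concrete values (dimensions n, m, the sparsity k, Gaussian entries, unit-norm columns) are illustrative assumptions, not taken from the abstract:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: overcomplete means m > n; x is k-sparse.
n, m, k = 20, 50, 5

# Unknown dictionary A (n x m); unit-norm columns are a common convention.
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)

def sample():
    """Draw one sample y = A x, where x is a random k-sparse vector."""
    x = np.zeros(m)
    support = rng.choice(m, size=k, replace=False)  # random support of size k
    x[support] = rng.standard_normal(k)             # random values on the support
    return A @ x, x

y, x = sample()
```

The learner observes only samples like y; recovering both A and the sparse x from such samples is the dictionary learning problem.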

Recently, provable algorithms were found that work if the unknown feature vector x is \sqrt{n}-sparse or sparser. Spielman et al. did this for dictionaries with m at most n; Arora et al. gave an algorithm for overcomplete (m > n) and incoherent matrices A; and Agarwal et al. handled a similar case with slightly weaker guarantees. This raises the problem of designing provable algorithms that allow sparsity >> \sqrt{n} in the hidden vector x. The current paper designs algorithms that allow sparsity up to n/poly(log n). They work for a class of matrices whose features are "individually recoverable", a new notion we identify that may motivate further work. The algorithms run in quasi-polynomial time because we use limited enumeration.

(Joint work with Sanjeev Arora, Rong Ge and Tengyu Ma)