High Dimensional Sparse Regression and Structure Estimation
Shuheng Zhou, MLD, CMU

Abstract

Recent research has demonstrated that sparsity is a powerful tool in signal reconstruction and in statistical inference. In particular, $\ell_1$-regularized least squares regression can accurately estimate a sparse model from $n$ noisy samples in $p$ dimensions, even when $p$ is much larger than $n$. My talk focuses on the role of sparsity in high dimensional regression when the original noisy samples are compressed, and on structure estimation in Gaussian graphical models when the graphs evolve over time.

In high-dimensional regression, the sparse object is a vector $\beta$ in $Y = X\beta + \epsilon$, where $X$ is an $n \times p$ matrix with $n \ll p$, $\beta \in \mathbb{R}^p$, and $\epsilon \in \mathbb{R}^n$ consists of i.i.d. random noise. In the classic setting, this problem is ill-posed for $p > n$, even when $\epsilon = 0$. However, when $\beta$ is sparse, one can recover an estimate $\hat\beta$ whose support is consistent with that of the true $\beta$. In joint work with John Lafferty and Larry Wasserman, we studied the regression problem in the setting where the original $n$ samples are compressed by a random Gaussian ensemble to $m$ examples in $p$ dimensions, where $m \ll \min(n, p)$. A primary motivation for this compression procedure is to anonymize the data and preserve privacy by revealing little information about the original data. We established sufficient mutual incoherence conditions on $X$ under which a sparse linear model can be successfully recovered from the compressed data. We characterized the number of random projections required for $\ell_1$-regularized compressed regression to identify the nonzero coefficients in the true model with probability approaching one. In addition, we showed that $\ell_1$-regularized compressed regression asymptotically predicts as well as an oracle linear model, a property called ``persistence''. Finally, we established upper bounds on the mutual information between the compressed and uncompressed data that decay to zero.
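To make the setup concrete, here is a minimal Python sketch of $\ell_1$-regularized compressed regression on synthetic data: the $n$ samples are premultiplied by a random Gaussian ensemble $\Phi$ of size $m \times n$, and the Lasso is then run on the compressed pair $(\Phi X, \Phi Y)$. The dimensions, penalty level, and the use of scikit-learn's Lasso are illustrative assumptions, not the construction analyzed in the paper.

    # Illustrative sketch of l1-regularized compressed regression
    # (assumed setup, not the authors' implementation).
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, p, m, s = 200, 500, 80, 5   # samples, dimension, compressed samples, sparsity

    # Sparse linear model Y = X beta + eps
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:s] = 1.0
    Y = X @ beta + 0.1 * rng.standard_normal(n)

    # Compress the n samples to m with a random Gaussian ensemble Phi (m x n)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    X_tilde, Y_tilde = Phi @ X, Phi @ Y

    # Recover beta from the compressed data by l1-regularized least squares
    lasso = Lasso(alpha=0.05, fit_intercept=False).fit(X_tilde, Y_tilde)
    print("estimated support:", np.flatnonzero(np.abs(lasso.coef_) > 1e-3))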

Undirected graphs are often used to describe high dimensional distributions. Under sparsity conditions, the graph can be estimated using $\ell_1$-penalization methods. However, current methods assume that the data are independent and identically distributed. If the distribution, and hence the graph, evolves over time, then the data are no longer identically distributed. In the second part of the talk, I show how to estimate a sequence of graphs from non-identically distributed data, and I establish theoretical results on the rates of convergence of the predictive risk and of the estimated inverse covariance matrix in the Frobenius norm. This is joint work with John Lafferty and Larry Wasserman.
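One simple way to adapt $\ell_1$-penalized inverse covariance estimation to non-identically distributed data is sketched below, under assumptions of my own choosing (a Gaussian kernel over time, a fixed bandwidth, and scikit-learn's graphical_lasso as the solver): reweight the sample covariance toward a target time point before applying the penalized estimator. This is only an illustration of the idea, not the exact estimator analyzed in the talk.

    # Illustrative sketch: kernel-weighted covariance + l1-penalized precision estimate.
    import numpy as np
    from sklearn.covariance import graphical_lasso

    def weighted_precision(samples, times, t0, bandwidth=0.2, alpha=0.1):
        """Estimate the inverse covariance (graph) at time t0 from
        non-identically distributed, mean-zero observations."""
        w = np.exp(-0.5 * ((times - t0) / bandwidth) ** 2)   # Gaussian kernel weights
        w /= w.sum()
        S = (samples * w[:, None]).T @ samples               # weighted sample covariance
        _, precision = graphical_lasso(S, alpha=alpha)       # l1-penalized estimate
        return precision

    # Example on synthetic data whose covariance changes smoothly over time
    rng = np.random.default_rng(0)
    n, p = 400, 10
    times = np.linspace(0.0, 1.0, n)
    samples = np.array([rng.multivariate_normal(np.zeros(p), (1.0 + t) * np.eye(p))
                        for t in times])
    Theta_hat = weighted_precision(samples, times, t0=0.5)
    print(np.round(Theta_hat, 2))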

Bio

Venue, Date, and Time

Venue: NSH 1507

Date: Monday, February 25

Time: 12:00 noon

Slides