# Theory Lunch Seminar

- Remote Access Enabled - Zoom
- Virtual Presentation

- ARVIND V. MAHANKALI
- Computer Science Undergraduate Student
- School of Computer Science
- Carnegie Mellon University

## Optimal \ell_1 Column Subset Selection and a Faster PTAS for \ell_1 Low-Rank Approximation

In this talk we will discuss new algorithms for entrywise \ell_1-norm low-rank approximation, a robust variant of the well-studied Frobenius norm low-rank approximation problem. This version of the low-rank approximation problem is useful on a variety of machine learning/computer vision tasks. Unlike Frobenius norm low-rank approximation, \ell_1 low-rank approximation is known to be NP-hard due to Gillis and Vavasis, and obtaining a (1 + \eps)-approximation requires 2^{\Omega(1/\eps)} running time assuming the Exponential-Time Hypothesis, as shown by Song, Woodruff and Zhong. Formally, the problem can be written as follows: given a matrix A \in \R^{n \times d} and a positive integer k, the goal is to find a rank-k matrix A_k minimizing \|A - A_k\|_1 (the sum of the absolute values of the entries of A - A_k). Many existing algorithms for this problem are bicriteria algorithms, meaning they output A_k having rank slightly larger than k --- our algorithms are bicriteria as well.

First, we will discuss a new polynomial-time column subset selection-based algorithm for \ell_1 low-rank approximation. Column subset selection-based algorithms output a matrix A_k such that all of the columns of A_k are in the span of a small subset of the columns of A. Our column subset selection-based algorithm obtains an approximation factor of \widetilde{O}(k^{1/2}) and a bicriteria rank of \widetilde{O}(k), improving upon the previous best column subset selection-based algorithm which obtains an approximation factor of \widetilde{O}(k). Our algorithm matches a prior lower bound for \ell_1 column subset selection-based algorithms due to Song, Woodruff and Zhong, which applies to any \poly(k)-sized subset of columns. We will discuss how the same algorithm can be used to obtain tight upper and lower bounds for column subset selection-based algorithms for entrywise \ell_p-norm low-rank approximation for p \in (1, 2).

In the second half of the talk, we will discuss constant-factor and (1 + \eps)-approximation algorithms for \ell_1 and \ell_p low-rank approximation for p \in (1, 2). Previously known O(1)- and (1 + \eps)-approximation algorithms for \ell_1 and \ell_p low-rank approximation required n^{\poly(k)} and n^{\poly(k/\eps)} running time respectively, and outputted matrices of rank at most 3k. We build on these algorithms to obtain a (1 + \eps)-approximation algorithm which outputs a matrix of rank at most 3k, with running time at most 2^{\poly(k/\eps)} + \poly(nd) - our result implies that the problem of \ell_1 low-rank approximation is fixed-parameter tractable. We will also discuss some hardness results that suggest that it is difficult to significantly improve the running time further, if an O(1)- or (1 + \eps)-approximation algorithm is desired.

*Joint work with David P. Woodruff. *

*Zoom Participation Enabled. See announcement.*