I'm not actively updating this page. Please visit Publications for more recent activities. Thanks!

Adaptive Nonparametric Regression via Trend Filtering

Trend filtering is a new class of nonparametric regression methods with the property of being locally adaptive. In the univariate case, it behaves exactly the same as the locally adaptive smoothing splines [Mammen and Van de Geer, 1997], i.e., minimax optimal for the classes of functions with hetereogeneous amoothness; computationally it is way faster than its spline cousin.

Graph trend filtering (GTF) is designed to filter signals obsesrved on the vertices of a graph with hetereogeneous smoothness. The above illustrations are examples of running GTF with k=0, k=1 and k=2, on 2D lattice grids, which induces piecewise constant, piecewise linear and piecewise quadratic structures on the filtered signal over a graph.

The figure above shows an application of a variant of GTF to detecting events in Manhattan using Taxi data, and it produces a detection that is significantly sharper and much less false positives than Laplacian smoothing. Code is becoming available on GitHub project: glmgen.

Theoretical Foundation for Noisy Sparse Subspace Clustering

Left:Noiseless SSC illustration. Right:Noisy SSC illustration.

This is an geometric condition for the exact separation recovery of Lasso-SSC when data is noisy, or arbitrarily corrupted as long as the Frobenious norm of the additive error is bounded. We showed that the geometric gap of incoherence and inradius, which underpins the noiseless recovery guarantee in Soltanolkotabi and Candes's geometric analysis on SSC , is essentially proportional to the margin of error lassoSSC is able to tolerate. The results justified SSC's state-of-the-art performance in a number of real applications and reveal insight into what kind of problem can be robustly solved by Lasso-SSC.

In the JMLR extension of the work, we improved the bound and now we know that in the deterministic setting one can tolerate subgaussian noise of magnitude in the order of , where is the ambient dimension of the datapoint.

Matrix factorization with missing data

Matrix factorization is a fundamental building blocks of many popular machine learning algorithms for regression, dimension reduction, factor analysis, clustering and etc. While this approach is widely adopted and empirically sound, it is rather ungrounded in terms of theoretical analysis.


Dr. Xu and I produced a stability analysis to the formulation of factorization methods. The theory naturally explains the long observed robustness of factorization methods in collaborative recommendation system. To start from here, we hope to progress towards a comprehensive theory in the behaviors of factorization methods then design provably robust algorithms for collaborative filtering.

An actively updating blog of matrix factorization literature is here.

Practical matrix completion and matrix recovery

The recent theory of Emmanuel Candes, Terrence Tao, Ma Yi et. al about the tractable convex optimization to solve matrix completion and recovery are absolutely beautiful. Nevertheless, the real data matrix usually does not satisfy the assumptions which include incoherence, random support (of either observation mask or corruptions) and zero-noise. As an illustration, see the three feature trajectories of Oxford dinosaur sequence below:

Left: Input data with incomplete trajectories; Middle: Results of nuclear-norm based convex matrix completion; Right: Results of PARSuMi, our non-convex matrix completion algorithm.

Common practical challenges are:

  1. Diagonal-band-shaped data matrix
  2. Deterministic support of sampling or corruption
  3. Near degenerate subspace (diminishing singular values)
  4. Co-existence of noise, corruptions and missing data.

I hope to design algorithms (convex or non-convex) that works in practice by introducing additional constraints specific to the problem, such as exact rank, orthogonality, box constraint, non-negativity and smoothness.

Subspace clustering and motion segmentation

The intensive research on low-rank matrix recent years has come to a point that its subspace structure and behaviors are (almost) thoroughly understood. People have started to look at more complicated structure such as a union of multiple subspaces.

A motivating application is the segmentation of feature trajectories of multiple rigidly moving object. Such motion segmentation task is equivalent to subspace clustering because each motion is at most rank-4. The seminal SSC paper of Vidal et. al suggests that the key of segmentation is the so-called "self-expressiveness", namely, the vectors should be most easily represented by vectors from the same subspace group.

The theory of subspace clustering sees some great progress lately, featuring Soltanolkotabi and Candes' geometric analysis on SSC, Liu Guangcan's analysis of LRR and Elhamifar and Vidal's new PAMI version SSC.

However, there are still some persisting problems/open questions:

  1. General missing data, partial observations
  2. Random sparse corruptions
  3. Overlapping subspace
  4. Graph connectivity
  5. Stability results under noisy scenario

A recent paper titled "High-rank matrix completion" attempts to show that by assuming union-of-subspace model, the missing data of a matrix can be filled-in even if the matrix is full-rank! This is a head start towards a solution to Problem 1, but their local approach largely relies on assumption of the overwhelming number of column samples, which is by no means realistic.

Change detection for practical surveilance camera

Segmentation of meaningful foreground from background is a first step in many CV applications. When the camera is static, such detection is rather simple. In practice, however, there are graduate change of illumination, dynamic/stochastic background motion (tree branches, waves, rain, fog and etc), jittering of camera which causes severe motion blur. All these make the problem not that trivial at first glance.

We designed an algorithm that solves jittering, motion blur, and image inpainting simultaneously with a key step being 'matrix completion', the following is an illustration of its results on 'traffic' sequence:

From Left to Right: 1. aligned jittering/blurry input; 2. recovered background; 3. recovered foreground; 4. missing data used for matrix completion (detected by a run of RPCA on small images). Note that motion blur is removed and black band on top of the 3rd row (caused by image alignment) is recovered perfectly.