Compressive Sensing and Basis Pursuit

Since we've now seen both compressive sensing (Lecture 11) and linear programming (Lecture 17), let's do a cute little example of sparse signal recovery.

In [1]:
import numpy as np
from cvxopt import matrix, solvers
import matplotlib.pyplot as plt

%matplotlib notebook

Let's recall the $s$-sparse signal recovery problem that we saw in Lecture 11, and the algorithm for it. There is an unknown vector $x \in \mathbb{R}^n$ that we are trying to recover. All we can do is make linear measurements: given a sensing vector $a \in \mathbb{R}^n$, we get back the measurement, which is the inner product $\langle a,x\rangle$.

Ok, let's get started: here's an $s$-sparse vector $x$:

In [2]:
s = 15         # sparsity
n = 1000       # dimensions
k = 120        # number of measurements, i.e., rows of sensing matrix

# take the signal
z1 = np.ones(s)+np.random.random(s)
#z1 = np.random.random(s)

# pad it with noise/zeroes
#z2 = np.random.random(n-s)*0.10
z2 = np.zeros(n-s)

# and mix it all up to get the hidden vector
z = np.random.permutation(np.concatenate((z1,z2), axis=None))

And let's see what this looks like:

In [3]:
plt.subplot(2, 1, 1)

# and plot both the solution and original vector
y = np.linspace(1,n,n)
plt.plot(y,z, '+-')
plt.tight_layout()

We perform $k$ measurements, with vectors $a_1, a_2, \ldots, a_k$ (each in $\mathbb{R}^n$), and get back the measurements $b_1, \ldots, b_k$ (where $b_i = \langle a_i,x\rangle$). We want to figure out $x$. Clearly, we could set $a_i$ to be $1$ in the $i^{th}$ location and zero otherwise, which would tell us the $n$ coordinates of $x$ with $k=n$ measurements. How can we do fewer measurements, if we know that $x$ is $s$-sparse?
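To make the trivial scheme concrete, here is a tiny sketch (a made-up toy example, not one of the notebook's cells): if $a_i = e_i$, the $i^{th}$ standard basis vector, each measurement just reads off one coordinate, so $k = n$ measurements always suffice.

# toy example: sensing with the standard basis recovers x coordinate by coordinate
n_toy = 4
x_toy = np.array([0.0, 2.0, 0.0, 1.5])   # hypothetical hidden vector
A_toy = np.eye(n_toy)                    # rows are the standard basis vectors e_i
b_toy = A_toy.dot(x_toy)                 # b_i = <e_i, x> = x_i, so b_toy equals x_toy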

A little notation to make life easier: let $A$ be the $k \times n$ *sensing matrix* whose rows are the $a_i$s. Hence $b = Ax$ is the vector of $k$ measurements. We construct $A$, find out $b$ by doing the measurements, and want to infer $x$. (Clearly $x$ is a solution to $Ax=b$, but since $k < n$ this system of linear equations is under-constrained, so there are many solutions. How do we find a sparse one?)

The theorem mentioned in Lecture 11 was this one (due to Candes, Romberg and Tao, and Donoho): for $k = O(s \log (n/s))$ there exists a sensing matrix $A \in \mathbb{R}^{k \times n}$ so that for any $x$ that is $s$-sparse, we can find $x$ efficiently from $b = Ax$.

How to construct such a good sensing matrix $A$? They show that if we choose a Gaussian matrix (where each entry is an independent standard normal random variable, i.e., $\sim N(0,1)$) with these dimensions, then it satisfies the theorem with high probability!

In [4]:
# do the sensing
mu, sigma = 0, 1
A = np.random.normal(mu, sigma, (k, n))
b = A.dot(z)

Great. We've now got the measurements in $b$. How do we recover $x$? The algorithm is also simple: among all solutions to the linear system $Ax = b$, find the one whose $\ell_1$ norm is smallest, i.e., the solution with the smallest $\sum_i |x_i|$.

This is not an LP as stated (the objective function is not linear, because of the absolute values), but it can be converted into one using the fact that $z_i \geq |x_i|$ if and only if $z_i \geq x_i$ and $z_i \geq -x_i$.
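Written out, the LP we solve below is over the stacked variable $(x, z) \in \mathbb{R}^{2n}$:

$$\min_{x,z} \; \sum_{i=1}^n z_i \quad \text{subject to} \quad x_i - z_i \le 0, \;\; -x_i - z_i \le 0 \;\; \text{for all } i, \quad Ax = b.$$

At the optimum each $z_i$ equals $|x_i|$, so the objective is exactly the $\ell_1$ norm of $x$.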

In [5]:
# and convert the absolute values into linear constraints: solve for
# [  I -I ][x] <= [0]
# [ -I -I ][z]    [0]
Id = np.eye(n)
I1 = np.concatenate((Id,-Id), axis=1)
I2 = np.concatenate((-Id,-Id), axis=1)
Aprime = np.concatenate((I1,I2), axis=0)
bprime = np.zeros(2*n)

Aext = np.concatenate((A,np.zeros((k,n))),axis=1)
c = np.concatenate((np.zeros(n), np.ones(n)), axis=None)

# we've used numpy arrays, but now want to use cvxopt
# so convert to cvxopt format
mA = matrix(Aext)
mb = matrix(b)
mAprime = matrix(Aprime)
mbprime = matrix(bprime)
mc = matrix(c)

# solve the problem: min <c,x> s.t. Aprime x <= bprime, Ax = b
sol = solvers.lp(mc, mAprime, mbprime, mA, mb)
# and get the solution out
x = np.array(sol['x'])
     pcost       dcost       gap    pres   dres   k/t
 0:  0.0000e+00 -0.0000e+00  1e+03  6e+01  1e-16  1e+00
 1:  5.3988e+00  5.3973e+00  2e+02  8e+00  2e-16  1e-01
 2:  1.9155e+01  1.9152e+01  9e+01  4e+00  4e-16  7e-02
 3:  2.1856e+01  2.1851e+01  6e+01  3e+00  5e-16  4e-02
 4:  2.2451e+01  2.2443e+01  3e+01  1e+00  1e-15  1e-02
 5:  2.3739e+01  2.3739e+01  1e+00  6e-02  5e-16  7e-04
 6:  2.3772e+01  2.3772e+01  1e-02  6e-04  7e-16  7e-06
 7:  2.3772e+01  2.3772e+01  1e-04  6e-06  7e-16  7e-08
 8:  2.3772e+01  2.3772e+01  1e-06  6e-08  7e-16  7e-10
Optimal solution found.

Great! The LP solution is stored in $x$; let's see how it did.

In [6]:
plt.subplot(2, 1, 1)

# and plot both the solution and original vector
y = np.linspace(1,n,n)
plt.plot(y,x[:n], 'o-')
plt.plot(y,z, '+-')
plt.tight_layout()

Bingo! The LP solution (the circles) has perfectly recovered the original vector (the + signs) from these $k$ measurements!
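If the plot looks too good to be true, here is a quick numerical check (a small extra sketch, not one of the original cells): the recovery error should be down at the level of the LP solver's tolerance.

# sanity check: compare the LP solution against the hidden vector numerically
x_rec = x[:n].flatten()                      # first n entries of the LP solution
print("max abs error:", np.max(np.abs(x_rec - z)))
print("l1 error:     ", np.sum(np.abs(x_rec - z)))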


Adding a bit of noise

What if the vector were not perfectly $s$-sparse, but had most of its mass in $s$ coordinates? The Candes–Romberg–Tao and Donoho results also cover this noisy setting. (In fact, for a purely $s$-sparse signal one can make do with only $2s$ measurements, but we'll defer that discussion for another day.)

Let's see if this holds up, by running the sensing matrix on a noisy vector.

In [7]:
# take the signal
z1 = np.ones(s)+np.random.random(s)
#z1 = np.random.random(s)

# pad it with noise this time
z2 = np.random.random(n-s)*0.10

# and mix it all up
z = np.random.permutation(np.concatenate((z1,z2), axis=None))

# do the sensing
b = A.dot(z)

Aext = np.concatenate((A,np.zeros((k,n))),axis=1)
c = np.concatenate((np.zeros(n), np.ones(n)), axis=None)

# convert to cvxopt format
mA = matrix(Aext)
mb = matrix(b)
mAprime = matrix(Aprime)
mbprime = matrix(bprime)
mc = matrix(c)

# solve the problem: min <c,x> s.t. Aprime x <= bprime, Ax = b
sol = solvers.lp(mc, mAprime, mbprime, mA, mb)
# and get the solution out
x = np.array(sol['x'])

# eps = 1e-5
# x[np.abs(x) < eps] = 0

plt.subplot(2, 1, 1)

# and plot both the solution and original vector
y = np.linspace(1,n,n)
plt.plot(y,x[:n], 'o-')
plt.plot(y,z, '+-')
plt.tight_layout()
     pcost       dcost       gap    pres   dres   k/t
 0:  0.0000e+00 -0.0000e+00  1e+03  6e+01  1e-16  1e+00
 1:  5.4372e+00  5.4349e+00  2e+02  1e+01  2e-16  2e-01
 2:  1.5818e+01  1.5814e+01  1e+02  6e+00  5e-16  1e-01
 3:  1.9799e+01  1.9793e+01  8e+01  4e+00  7e-16  6e-02
 4:  2.2863e+01  2.2857e+01  4e+01  2e+00  5e-16  3e-02
 5:  2.4751e+01  2.4748e+01  2e+01  9e-01  6e-16  1e-02
 6:  2.5702e+01  2.5701e+01  8e+00  4e-01  4e-16  5e-03
 7:  2.6071e+01  2.6070e+01  3e+00  1e-01  6e-16  2e-03
 8:  2.6184e+01  2.6183e+01  1e+00  6e-02  1e-15  7e-04
 9:  2.6246e+01  2.6246e+01  5e-01  2e-02  1e-15  3e-04
10:  2.6269e+01  2.6269e+01  2e-01  8e-03  3e-15  9e-05
11:  2.6278e+01  2.6278e+01  4e-02  2e-03  2e-15  2e-05
12:  2.6279e+01  2.6279e+01  2e-02  7e-04  8e-15  8e-06
13:  2.6280e+01  2.6280e+01  4e-03  2e-04  9e-15  2e-06
14:  2.6280e+01  2.6280e+01  7e-04  3e-05  2e-14  4e-07
15:  2.6280e+01  2.6280e+01  9e-06  4e-07  3e-14  4e-09
Terminated (singular KKT matrix).

Pretty good, huh? The orange crosses are the original signal, and the blue dots are the LP solution we found. We've done pretty well at figuring out the large signal coordinates, though of course the small noise coordinates sit in unknown locations and are not recovered exactly.
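One way to quantify this (a small sketch, not one of the original cells; the 0.5 threshold for "large" is an assumption chosen to separate the planted spikes, which are all at least 1, from the background noise, which is at most 0.1):

# measure recovery quality on the planted spikes vs. the noise coordinates
x_rec = x[:n].flatten()
spikes = z > 0.5                            # locations of the planted spikes (assumed threshold)
print("max error on spikes:      ", np.max(np.abs(x_rec[spikes] - z[spikes])))
print("mass off spikes (LP):     ", np.sum(np.abs(x_rec[~spikes])))
print("mass off spikes (signal): ", np.sum(np.abs(z[~spikes])))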

We could try to trim away the small coordinates of the LP solution to get something that looks even better.

In [8]:
eps = 0.5
x[np.abs(x) < eps] = 0

plt.subplot(2, 1, 1)

# and plot both the solution and original vector
y = np.linspace(1,n,n)
plt.plot(y,x[:n], 'o-')
plt.plot(y,z, '+-')
plt.tight_layout()

Not perfect, but not bad. (And we can do better if we increase $k$.)

That's it for today.

If you play with the numbers (specifically, keeping $n$ and $s$ fixed and reducing $k$), you will see that the recovery remains pretty good down to some threshold value of $k$, and then its quality sharply plummets.
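Here is one way to run that experiment (a sketch, not one of the original cells: it reuses the LP pieces built above, silences the solver's progress output, and sweeps over a few arbitrary values of $k$ on a fresh, exactly $s$-sparse signal):

# sweep the number of measurements k, keeping n and s fixed, and watch the
# recovery error: it should stay tiny until k gets too small, then blow up
solvers.options['show_progress'] = False    # silence the iteration log

z_exact = np.random.permutation(
    np.concatenate((np.ones(s) + np.random.random(s), np.zeros(n - s)), axis=None))

for k_try in [30, 45, 60, 90, 120]:         # arbitrary values for illustration
    A_try = np.random.normal(0, 1, (k_try, n))
    b_try = A_try.dot(z_exact)
    Aext_try = np.concatenate((A_try, np.zeros((k_try, n))), axis=1)
    sol_try = solvers.lp(mc, mAprime, mbprime, matrix(Aext_try), matrix(b_try))
    x_try = np.array(sol_try['x'])[:n].flatten()
    print("k =", k_try, " max recovery error:", np.max(np.abs(x_try - z_exact)))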