Introduction

For the first checkpoint of project 4, we were supposed to stitch images into mosaics using manually selected correspondence points. Then, in the second part of the project, the whole process of feature selection and correspondence matching was done automatically. I have also implemented some extras, like panorama discovery and stitching on a cylindrical projection.

Picture Taking

The pictures were taken around campus using a Pentax K-x.

Homography Estimation

Given two sets of corresponding points, x1, x2, ..., xn and X1, X2, ..., Xn, we are interested in finding a homography that maps each xi to the corresponding Xi. Since the resulting system is usually overconstrained, we look for the least-squares solution instead. The set of n correspondences yields a system of 2n equations, which can be solved in the least-squares sense in MATLAB with the backslash operator (equivalent to applying the pseudoinverse).
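
A minimal sketch of this estimation (names are illustrative; the correspondences are assumed to be given as n-by-2 matrices p and q, with row i of p matching row i of q, and the bottom-right entry of H fixed to 1):

    % Estimate the homography H that maps points p to points q in the least-squares sense.
    function H = estimate_homography(p, q)
        n = size(p, 1);
        A = zeros(2*n, 8);
        b = zeros(2*n, 1);
        for i = 1:n
            x = p(i, 1);  y = p(i, 2);
            u = q(i, 1);  v = q(i, 2);
            A(2*i-1, :) = [x y 1 0 0 0 -u*x -u*y];
            A(2*i,   :) = [0 0 0 x y 1 -v*x -v*y];
            b(2*i-1) = u;
            b(2*i)   = v;
        end
        h = A \ b;                          % least-squares solution of the 2n-by-8 system
        H = [h(1:3)'; h(4:6)'; h(7:8)' 1];
    end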

Warping

I implemented inverse warping: we iterate over the pixels of the output image and map each one back through the inverse of the homography. The resulting source coordinates are fed to MATLAB's interp2, which interpolates the intensities from the original image.
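
A minimal sketch of the inverse warp (illustrative names; H is assumed to map source coordinates to output coordinates, and the result is assumed to fit in an outH-by-outW canvas):

    % Inverse warping: for every output pixel, find where it comes from in the source image.
    function warped = inverse_warp(im, H, outH, outW)
        [X, Y] = meshgrid(1:outW, 1:outH);                % output pixel grid
        pts = [X(:)'; Y(:)'; ones(1, numel(X))];          % homogeneous coordinates
        src = H \ pts;                                    % map back through the inverse homography
        src = bsxfun(@rdivide, src(1:2, :), src(3, :));   % dehomogenize
        Xs = reshape(src(1, :), outH, outW);
        Ys = reshape(src(2, :), outH, outW);
        warped = zeros(outH, outW, size(im, 3));
        for c = 1:size(im, 3)                             % interpolate each channel from the source
            warped(:, :, c) = interp2(double(im(:, :, c)), Xs, Ys, 'linear', 0);
        end
    end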

Image Rectification

Once we have the above algorithms implemented, we can perform image rectification. We pick four points that would form a square when viewed from the perspective we are interested in. Then we compute a homography mapping them to a square with known coordinates and simply warp.
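
Putting the two routines above together, rectification can be sketched like this (the file name, square size, and helper names are illustrative):

    im = im2double(imread('facade.jpg'));       % hypothetical input image
    imshow(im);
    [cx, cy] = ginput(4);                       % click the four corners in order
    s = 400;                                    % side of the target square, in pixels
    square = [1 1; s 1; s s; 1 s];              % corners of the axis-aligned square
    H = estimate_homography([cx cy], square);   % least-squares routine sketched above
    rectified = inverse_warp(im, H, s, s);      % warping routine sketched above
    imshow(rectified);

Here are some results with this technique: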

Original (scaled down to 500px width) Rectified
Original (scaled down to 500px width) Rectified
Original (scaled down to 500px width) Rectified

Creating Mosaics

The mosaics are created iteratively. If the images are 1, 2, ..., N, then at the i-th iteration we merge image i+1 into the mosaic built from images 1 through i. At each step we define a set of correspondence points from which we estimate the homography. Since the boundaries where the images meet are in general not going to be smooth, I have implemented feathering: for one of the images being merged we define a radial feathering mask that is 0 at the centre of the image and grows to 1 radially. This makes the transitions smoother.
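
A minimal sketch of the mask (illustrative names; new_im is the newly warped image and mosaic the existing mosaic, assumed here to be single-channel and already aligned on the same h-by-w canvas; in the actual mosaic only the overlap region is blended this way):

    [X, Y] = meshgrid(1:w, 1:h);
    cx = (w + 1) / 2;  cy = (h + 1) / 2;
    r = sqrt((X - cx).^2 + (Y - cy).^2);               % distance from the image centre
    mask = r / max(r(:));                              % 0 at the centre, growing to 1 radially
    blended = (1 - mask) .* new_im + mask .* mosaic;   % weighted blend of the two images

Here are some results.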

Original Images (click for larger version)
Stitched
Original Images (click for larger version)
Stitched
Original Images (click for larger version)
Stitched

Automatic Stitching

First, the features are selected using the Harris corner detector. I have used the implementation provided with the homework. This is an example of the output from the Harris detector.
Harris detector output.

As you can see, there are too many points selected in the picture. To make the correspondence process easier, we keep the strongest 500 points (or fewer), but we also take their spatial distribution into account - we want them spread all over the picture. For this, we use Adaptive Non-Maximal Suppression (ANMS), as described in the provided paper by Brown et al. This is the result of applying the algorithm to the picture above.
Output of ANMS.
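
For reference, a minimal sketch of the suppression step (illustrative names; xy is an n-by-2 list of corner locations, strength holds the corresponding Harris responses, and c_robust is the robustness constant from the paper, e.g. 0.9):

    % ANMS: keep the num_keep points with the largest suppression radius,
    % i.e. the distance to the nearest sufficiently stronger point.
    function keep = anms(xy, strength, num_keep, c_robust)
        n = size(xy, 1);
        radius = inf(n, 1);
        for i = 1:n
            stronger = strength > strength(i) / c_robust;   % points that suppress point i
            d2 = sum(bsxfun(@minus, xy(stronger, :), xy(i, :)).^2, 2);
            if ~isempty(d2)
                radius(i) = sqrt(min(d2));
            end
        end
        [~, order] = sort(radius, 'descend');
        keep = order(1:min(num_keep, n));
    end

The 500-point selection above then becomes keep = anms(xy, strength, 500, 0.9).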

Once we have these points, we are ready to extract the actual feature descriptors. At each point we centre a 40x40 window (or smaller, if it is near the borders of the picture). Then we resize this window to 8x8 and put all of its entries in a 64-dimensional vector. This vector is bias/gain normalized by subtracting the mean and dividing by the standard deviation. These are some pictures of example extracted windows (they are shown before the reduction to 8x8, so they are easier to identify).
Extracted features.
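
A minimal sketch of the descriptor extraction at one interior point (x, y) of a grayscale double image im (illustrative names; border handling omitted):

    half = 20;
    patch = im(y-half : y+half-1, x-half : x+half-1);   % 40x40 window around the point
    small = imresize(patch, [8 8]);                     % downsample to 8x8
    d = small(:);                                       % 64-dimensional descriptor
    d = (d - mean(d)) / std(d);                         % bias/gain normalization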

Now that we have the 64-dimensional features, we have to match the features in one picture with the features in the other. This is done as described in the paper: for each feature we greedily choose its nearest neighbour in the other image, and we accept the pair only if the ratio of the best distance to the second-best distance is below some threshold. Unfortunately, some of the accepted pairs will still be outliers. Simply using a least-squares approximation will not work, as the outliers have a large effect on the estimated parameters. Thus, we use RANSAC, which maximizes the number of inliers rather than minimizing an error over all points. In a loop of 10 000 iterations (a constant chosen empirically), we randomly choose 4 corresponding pairs, estimate a homography from them, and count how many points are inliers under that homography; the homography with the most inliers is kept. This turned out to be very robust and gave some really good results.
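
The ratio test can be sketched as follows (illustrative names; D1 and D2 are the descriptor matrices of the two images, one 64-dimensional descriptor per row, and the 0.5 threshold is an assumed value):

    matches = zeros(0, 2);                              % pairs of indices (i in image 1, j in image 2)
    for i = 1:size(D1, 1)
        d = sum(bsxfun(@minus, D2, D1(i, :)).^2, 2);    % SSD to every descriptor in image 2
        [vals, idx] = sort(d);
        if vals(1) / vals(2) < 0.5                      % best match clearly better than the runner-up
            matches(end+1, :) = [i idx(1)];
        end
    end

And a minimal sketch of the RANSAC loop, reusing the least-squares homography routine sketched earlier (p and q are the matched point sets, row for row, and tol is the inlier threshold in pixels):

    function [H_best, inliers_best] = ransac_homography(p, q, tol)
        n = size(p, 1);
        inliers_best = [];
        H_best = eye(3);
        for iter = 1:10000                              % empirically chosen iteration count
            sel = randperm(n, 4);                       % random minimal sample of 4 pairs
            H = estimate_homography(p(sel, :), q(sel, :));
            proj = H * [p'; ones(1, n)];                % map all points through the candidate H
            proj = bsxfun(@rdivide, proj(1:2, :), proj(3, :));
            err = sqrt(sum((proj' - q).^2, 2));         % reprojection error
            inliers = find(err < tol);
            if numel(inliers) > numel(inliers_best)
                inliers_best = inliers;
                H_best = H;
            end
        end
        % refit on all inliers for the final estimate (assumed final step)
        H_best = estimate_homography(p(inliers_best, :), q(inliers_best, :));
    end

Below some examples are shown.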
Run automatically on the same data as the manually stitched version.
Run automatically on the same data as the manually stitched version.
Run automatically on the same data as the manually stitched version.
Original Images (click for larger version)
Stitched

Bells & Whistles

Panorama Discovery

Sometimes, when we have a large dataset of pictures, we would like to automate the whole process even further, so that panoramas are discovered and stitched automatically and the user has to do zero work. I do this as follows. I create an undirected weighted graph where each node corresponds to a different picture, and the weight of each edge is the fraction of points that were inliers when running RANSAC on that pair of images. Weak edges are discarded, as those pictures are probably unrelated. Then the connected components are extracted; each connected component should (hopefully) correspond to a different panorama. For each component I do a breadth-first search and join the images with the current panorama as they are discovered. I put all the input files from the panoramas above into the same folder and got back the same panoramas.
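
A minimal sketch of the grouping step (illustrative names; inlier_frac is assumed to be an N-by-N matrix holding the fraction of RANSAC inliers for every image pair, and the 0.15 edge threshold is an assumed value):

    A = inlier_frac > 0.15;                 % keep only strong edges
    A = A | A';  A(1:N+1:end) = 0;          % symmetric adjacency matrix, no self-loops
    visited = false(1, N);
    for s = 1:N
        if visited(s), continue; end
        queue = s;  visited(s) = true;      % breadth-first search over one connected component
        pano = s;                           % image indices in discovery (stitching) order
        while ~isempty(queue)
            u = queue(1);  queue(1) = [];
            nb = find(A(u, :) & ~visited);
            visited(nb) = true;
            queue = [queue nb];
            pano = [pano nb];
        end
        if numel(pano) > 1
            % stitch images pano(2:end) onto pano(1) in discovery order (stitching code not shown)
        end
    end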

Blending & Compositing

We can also easily play with pictures, composing them in various ways. These are some results.
Original Images




Composed

Cylindrical Mapping

Instead of planar mapping, we can first map the images onto a cylinder and then estimate a translation between them. For the cylindrical projection I used the equations from Szeliski's book. The translation was estimated with a very simple RANSAC-like algorithm: we go over all matched feature points, compute the translation between each point and its 'image' (a point and its match define a translation), and choose the translation with the most inliers. I did not manage to get the applet to work - it either hangs on 'Loading Image' or cannot find the IVR file - I tried both relative and absolute addressing. Below I show a result of this procedure, but note that it does not cover the full 360 degrees and was not taken with a tripod (unfortunately, I don't have one :( ). Also, I picked a value for the focal length that made the result look good.
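
For reference, a minimal sketch of one common inverse-mapping form of the cylindrical warp (illustrative names; im is assumed grayscale and the focal length f is given in pixels):

    function cyl = cylindrical_project(im, f)
        [h, w] = size(im);
        xc = (w + 1) / 2;  yc = (h + 1) / 2;        % image centre
        [Xc, Yc] = meshgrid(1:w, 1:h);              % output (cylindrical) pixel grid
        theta = (Xc - xc) / f;                      % unrolled angle on the cylinder
        hh    = (Yc - yc) / f;                      % height on the cylinder
        Xs = f * tan(theta) + xc;                   % back-project onto the input image plane
        Ys = hh .* (f ./ cos(theta)) + yc;
        cyl = interp2(double(im), Xs, Ys, 'linear', 0);
    end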