Introduction

The main component of this project was the implementation (in MATLAB) of a set of functions for warping and blending images. Used alone, the warping functions allow an image to be warped into a new perspective creating interesting transforms such as "image rectification" (shown below). Used in conjunction with the blending functions, they allow the creation of photo mosaics such as panoramas.

Homography Recovery & Image Rectification

The type of transform used by the warping operation is known as a "homography" and is represented as a 3x3 matrix with eight degrees of freedom. Solving for the eight components of the transform thus requires a minimum of four (X,Y) points to construct a set of linear equations relating the pre-transform and post-transform states.

If the transform maps the corners of an object we know to be square onto an appropriately-sized square in the image plane, it is possible (to a certain extent) to simulate the effect of looking directly at the plane of the selected object. This technique is called "image rectification", and was employed by Criminisi, Kemp, and Zisserman (2005) to study floor and ceiling patterns in various pieces of artwork. Inspired by that paper, one example of image rectification completed by my project is a study of the floor pattern in Jan van Eyck's Annunciation (1434).

As a second example, a photograph I took of the Carnegie Music Hall is rectified such that the camera appears to be directly facing the side wall:

Photo Mosaics

Another application of image warping using homographies is the stitching of multiple images into photo mosaics. Below are three source image sets I used to generate photo mosaics for this project:

Feature Matching & Auto-Stitching

In the case of mosaic "stitching", the points used to solve for the correct transform are the locations of cooresponding features in a pair of images. It is advantageous in this application to specify more than four control points and to solve for the required transform parameters using a least-squares solution (easily accomplished through MATLAB's mldivide operator).

The selection of control points for stitching photo mosaics may be automated with the four-stage process illustrated below. In the first stage, potential control points are selected using the Harris Corner Detector.

In the second stage, the large number of points returned by the Harris algorithm is reduced to ~500 points. While the Harris algorithm returns a "strength" associated with each point, simply selecting the strongest points may return dense clusters of points in small areas of the image. This, in turn, may leave too few points in the overlap area to allow for an effective homography calculation between the image and its neighbor. Instead, Brown et. al. (2005) describe a technique called "Adaptive Non-Maximal Suppression" which selects points based on the radius in which they are the strongest point rather than their absolute strength. The spatial distribution of points selected in this manner is quite even, as shown below:

In the third stage of the algorithm, feature "descriptors" (in this case, 8x8 patches sampled from the area surrounding each interest point) are extracted from the image and matched with one another. As proposed by Lowe, matches are accepted or rejected based not on their error alone, but rather on the ratio of the errors of the best and second-best matches. The images below depict unmatched points from the ANMS stage in red; matched points are depicted in blue.

In the fourth and final stage, a technique known as "Random Sample Consensus" ("RANSAC") is used to reduce the number of erroneous feature matches (e.g. matching different locations in the sky) returned by the stage above. The RANSAC technique works by iteratively selecting four point pairs at random, using them to compute an estimated homography, and then determining how many of the remaining point pairs "agree" with the estimated homography. The points which comprise the largest "agreement" set found after many iterations are assumed to be correct and thus are used to calculate the final homography via the least-squares solution. The images below depict the matched points from the algorithm's third stage in blue; the points selected by RANSAC for the final homography calculation are depicted in green.

Once the images have been warped into a matching projection, it is necessary to composite them into the final mosaic. My implementation uses a two-band alpha blending technique as proposed by Brown & Lowe (2003), with center-weighting to blend low frequencies "smoothly" and a binary alpha channel to blend high frequencies.

Results

These are the completed mosaics produced by the algorithm described above:

"Bells & Whistles" - Cylindrical Projection

The perspective warping produced by a homography is not suitable for wide panoramas because of the increasing distortion it yields as the edges of the panorama are extended further away from the center. One solution to this problem is to use a cylindrical projection in which the photographs are warped onto a cylinder with radius equal to the camera's focal length prior to being stitched. In theory, this approach both solves the perspective distortion problem for wide panoramas and reduces the stitching transformation required from a perspective warp to a simple translation.

In practice, however, although the cylindrical projection itself is fairly straightforward to implement, it can be somewhat tricky to accurately determine the focal length of the camera in pixels. The images used in this project were taken with a Canon PowerShot G9, a camera model which is known to incorrectly report certain pieces of EXIF data which might be used for focal length estimation. Working instead from the physical dimensions of the G9's sensor (as well as those EXIF fields which are correctly reported), the focal length used for the images above was eventually estimated to be 7789.54 pixels.

Versions of the three mosaics produced using cylindrical projection are shown below: