Project 4: Auto Stitching Photo Mosaics

William Wedler
Computational Photography
15-463 Fall 2007

Overview

Part 1 of the stitching and photo mosaics project involves determining the Homography that maps one projection plane to another and stitching together panoramic scenes from manually selected correspondence points.

Part 2 involves automating the mosaic stitching process by computing correspondence points between overlapping images. These computed correspondence points are then used in place of the manually selected points from part 1 to stitch together panorama scenes.

Process

Part 1: Homography Calculation and Mosaic Stitching

The first part of the assignment involves the computation and application of a Homography image transformation matrix. I wrote a function computeH that solves for the Homography transformation between two sets of points. I then used computeH to manipulate some sample photos, shown in the results section. Next, the Homography was put to use in stitching together a panorama scene. I used it to compose three Pittsburgh scenes.

Image Projection

A projective transformation can give the impression that a photograph was taken from a different point of view. This has many applications; in this project, it is used to stitch photos together to form panoramas.

The Homography transformation matrix is used for projective transformations. It has 8 degrees of freedom, requiring at least 4 point correspondences. When the data is noisy, additional correspondences can be used to determine the Homography more robustly.

Least Squares Numerical Approximation

The Homography is found using numerical methods that minimize the error between the points projected by the Homography and the corresponding points in the other image.

The minimization is performed in MATLAB using the backslash command with an over-constrained matrix equation. The matrix equation solves for the eight unknown constants in the Homography. The ninth element in the Homography is assumed to be 1.
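The over-constrained system described above can be sketched as follows. This is a NumPy illustration, not the original code: the write-up used MATLAB's backslash operator, and `compute_h` here is a hypothetical name standing in for the author's computeH. Each correspondence contributes two rows, and the ninth entry of the Homography is fixed to 1.

```python
import numpy as np

def compute_h(src, dst):
    """Estimate the 3x3 homography mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding points, N >= 4.
    The ninth element h33 is fixed to 1, leaving 8 unknowns that are
    solved in the least-squares sense (the NumPy analogue of
    MATLAB's backslash on an over-constrained system).
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # u = (h1*x + h2*y + h3) / (h7*x + h8*y + 1), cleared of the denominator
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exact (noise-free) correspondences, the least-squares solution recovers the true Homography; with noisy points, it gives the best fit over all correspondences.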

Compositing a Mosaic to form a Panorama Scene

In the next part of the assignment, I used the Homography to stitch several images together to form panoramas.

The stitching process involves several steps.

  1. Manually selecting correspondences between the images.
  2. Computing the Homographies that will project each image onto the center image plane.
  3. Masking the images with an alpha filter that feathers adjacent images, so the transition between them is unnoticeable.
  4. Placing the masked images to form the final composite image.
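The feathering in step 3 can be sketched as an alpha ramp that fades each image out near its left and right edges, with overlapping images combined by an alpha-weighted average. This is a minimal grayscale sketch, not the original MATLAB masking code; the ramp width is an assumed parameter.

```python
import numpy as np

def feather_mask(width, height, ramp=50):
    """Alpha mask that is 1 in the interior and fades linearly to 0
    over `ramp` pixels at the left and right edges."""
    x = np.arange(width, dtype=float)
    profile = np.minimum(1.0, np.minimum(x, width - 1 - x) / ramp)
    return np.tile(profile, (height, 1))

def blend(img_a, img_b, alpha_a, alpha_b):
    """Alpha-weighted average of two registered grayscale images."""
    total = alpha_a + alpha_b
    total[total == 0] = 1.0  # avoid divide-by-zero where neither image covers
    return (img_a * alpha_a + img_b * alpha_b) / total
```

In the overlap region both alphas are nonzero, so the images cross-fade smoothly instead of showing a hard seam.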

Part 2: Automatic Correspondence Detection

Although part 2 was implemented after part 1, it produces the correspondence points between overlapping images that part 1 requires as input.

The automatic detection of correspondence points is performed in several steps, outlined below.

  1. Computing Harris Corners.

    The Harris Corners computation finds locations in each image that may correspond to a corner. Intuitively, corners are easy to correlate between images.

  2. Adaptive non-maximal suppression.

    The set of Harris Corners is very large and must be reduced. However, simply keeping the strongest corners tends to cluster points along strong edges in the image and does not yield a set of points that is evenly distributed throughout the image. To reduce the set of points while keeping a good spatial distribution, an adaptive non-maximal suppression approach is taken.

    For each Harris Corner point, the minimum suppression radius is the minimum distance from that point to a different point with a higher corner strength. The points are ordered based on their suppression radius, and the 500 points with the highest radius are kept, while the others are discarded.

    The images below show the original set of Harris Corners and then the set of 500 suppressed corners for two input images.



    Image 1 Harris Corners and ANMS Points Image 2 Harris Corners and ANMS Points
  3. Extract Feature Descriptors.

    At each of the remaining points, an 8x8 feature descriptor is extracted to characterize the point. The feature descriptors are taken from a 40x40 pixel window centered on the point, sampled every 5 pixels. The image data is blurred before sampling so that only low-frequency content is represented in the descriptor. The images below show feature descriptors for the first several points in two input images. The values in the descriptors are taken from the red channel of the image data and are normalized to zero mean and unit standard deviation.



    Image 1 Feature Descriptors Image 2 Feature Descriptors
  4. Predict matches between both images

    The feature descriptors are used to compare each feature point in one image with every feature point in the other. Comparison is performed with a sum of squared differences (SSD) calculation, and the points with the least error are considered for matching. To decide whether a point should be considered a match, the ratio between the lowest and second-lowest errors is computed and compared with a threshold.

    The point pairs that are within the threshold are considered candidates for matching points. The candidates for two input images are shown in the image below.



    Image 1 candidate points Image 2 candidate points
  5. Remove outliers with RANSAC

    A robust algorithm, 4-point RANSAC, is used to eliminate outliers from the candidate correspondence points. Four points are randomly selected and a Homography is computed based on those points. The remaining points are then transformed with the Homography and their transformed coordinates are compared with the matching point coordinates in the second image. Outliers are removed and the process is repeated, keeping track of the largest set of inliers. This is used as the final set of correspondence points.

    The images below show the initial candidate points in red and the final inliers in yellow. Red points that are not outlined by yellow are outliers removed through the RANSAC process.



    Image 1 RANSAC Results Image 2 RANSAC Results
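The adaptive non-maximal suppression in step 2 can be sketched as follows. This is a naive O(N^2) NumPy illustration, not the original MATLAB implementation: for each corner, the suppression radius is the distance to the nearest strictly stronger corner, and the 500 corners with the largest radii are kept.

```python
import numpy as np

def anms(points, strengths, keep=500):
    """Adaptive non-maximal suppression of Harris corners.

    points:    (N, 2) corner coordinates
    strengths: (N,) Harris corner strengths
    Keeps the `keep` points with the largest suppression radius,
    i.e. the largest distance to any stronger corner.
    """
    pts = np.asarray(points, dtype=float)
    s = np.asarray(strengths, dtype=float)
    n = len(pts)
    # pairwise squared distances between all corners
    d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
    radii = np.full(n, np.inf)  # the strongest corner keeps an infinite radius
    for i in range(n):
        stronger = s > s[i]
        if stronger.any():
            radii[i] = np.sqrt(d2[i, stronger].min())
    order = np.argsort(-radii)
    return pts[order[:keep]], radii[order[:keep]]
```

Because a corner is suppressed only by stronger corners near it, the surviving points spread out over the whole image rather than clustering where the response is strongest.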
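The SSD comparison and ratio test of step 4 can be sketched as below. This is an illustrative NumPy version under assumed conventions: descriptors are flattened into row vectors, and the ratio threshold is a hypothetical value, not the one used in the project.

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.5):
    """Match descriptors by SSD with a lowest/second-lowest ratio test.

    desc1: (N1, D) and desc2: (N2, D) flattened, normalized descriptors.
    Returns (i, j) index pairs where the best match in desc2 is
    sufficiently better than the second-best match.
    """
    matches = []
    for i, d in enumerate(desc1):
        ssd = np.sum((desc2 - d) ** 2, axis=1)
        j, k = np.argsort(ssd)[:2]          # best and second-best candidates
        if ssd[j] / ssd[k] < ratio_thresh:  # accept only confident matches
            matches.append((i, j))
    return matches
```

A low ratio means the best match is clearly better than the runner-up, which filters out ambiguous points that resemble many locations in the other image.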

Final Correspondence Points

The final correspondence points are taken as the inliers after the RANSAC process. The correspondence points between the two sample input images are shown below.

Image 1 final correspondence points Image 2 final correspondence points
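The RANSAC loop of step 5 can be sketched as follows. This is a NumPy illustration, not the project's MATLAB code: `estimate_h` stands in for a homography estimator such as the computeH function from part 1, and the iteration count and pixel tolerance are assumed values.

```python
import numpy as np

def ransac_homography(src, dst, estimate_h, n_iters=1000, tol=3.0, seed=0):
    """4-point RANSAC over candidate correspondences.

    src, dst:   (N, 2) candidate matching points, N >= 4
    estimate_h: callable returning a 3x3 homography from 4+ point pairs
    tol:        inlier reprojection distance in pixels
    Returns the index array of the largest inlier set found.
    """
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coords
    best_inliers = np.array([], dtype=int)
    for _ in range(n_iters):
        # fit a homography to 4 randomly chosen candidate pairs
        sample = rng.choice(len(src), size=4, replace=False)
        H = estimate_h(src[sample], dst[sample])
        # project all candidates and measure distance to their matches
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:]
        err = np.linalg.norm(proj - dst, axis=1)
        inliers = np.flatnonzero(err < tol)
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers
```

Because only 4 points are needed per trial, many trials are cheap, and the largest inlier set found is very likely to come from a sample containing no outliers.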

Part 1 Results

Homography

I took lots of outdoor photos for stitching, but did not have any single image that was interesting for testing my Homography function. So I found a nice sample image of a super car from http://www.shiotsu-used-car.com/blogpics/orochi.jpg, shown below.

Super Car

For the first trick, I decided to see what the car would look like from the side. The original correspondence points are shown in blue and the target projection points are in red.

The resulting image: a view of the car from the side

Next, in car show-off fashion, I decided to generate several more views of the car from different viewpoints. This is exactly what the Homography is good for.

A high-up view that is a little far away from the car.


A down-low and close-up view of the car.



I also transformed a straight-on view of a car wheel into an angled view. This is almost an inverse of the first Homography that I applied to the super car. The original wheel, the wheel forced to become straight-on, and a slanted view of the wheel are shown below.




Mosaic Stitching

Once I was able to verify that my Homography calculation was working, it was time to put it to use stitching together some photos.

The photo below was taken using a tripod on the top of my friend's apartment building near Craig and Center. It is an interesting view of Oakland that is not exactly typical. The full size image can be downloaded here

Panorama photo of Oakland in Pittsburgh.

The following scene is of a small pond in Schenley Park. The full size image can be downloaded here

Schenley Pond

The intersection at Forbes and Roberto Clemente Drive has many interesting things, including the Cathedral of Learning, a log cabin, a dinosaur sculpture, the Carnegie Library, and nice trees. The following panorama tries to capture all of these things. The full size image can be downloaded here

Oakland intersection

Part 2 Results

The Cathedral of Learning and Oakland from a roof-top. The full sized image is here.

A view from Flagstaff Hill in Schenley Park. Full sized image file. The Cathedral of Learning is in this photo too!

A second view from the roof. Full sized image.


William Wedler. 8, Nov. 2007