Girish Jattani's 15-463 Image Reconstruction Using RGB Channels


Basic (Non-Pyramid) Approach

The basic approach to combining the channels was to take the sum of squared differences (SSD) between them and normalize it by the number of overlapping pixels examined at each candidate translation.
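The normalized-SSD metric described above can be sketched as follows. This is a Python/NumPy illustration (the original implementation was in Matlab); the function and parameter names are my own.

```python
import numpy as np

def normalized_ssd(a, b, dy, dx):
    """SSD between channel `a` and channel `b` shifted by (dy, dx),
    normalized by the number of overlapping pixels.

    `a` and `b` are 2-D float arrays of the same shape.
    """
    h, w = a.shape
    # Crop both channels to the region where they overlap after the shift.
    ya0, ya1 = max(0, dy), min(h, h + dy)
    xa0, xa1 = max(0, dx), min(w, w + dx)
    yb0, yb1 = max(0, -dy), min(h, h - dy)
    xb0, xb1 = max(0, -dx), min(w, w - dx)
    diff = a[ya0:ya1, xa0:xa1] - b[yb0:yb1, xb0:xb1]
    # Normalizing by the overlap size keeps large shifts (small overlap)
    # from being unfairly favored or penalized.
    return np.sum(diff ** 2) / diff.size
```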

My implementation first focused on extracting correctly aligned images from the smaller pictures. To align the channels, I reasoned that the misalignment could not be more than 5% of the dimension in question, so the search only had to cover a 10% window (±5%) vertically and horizontally. This also reduced the amount of time Matlab took to process the images.
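The brute-force search over that ±5% window might look like the following. This is a Python/NumPy sketch (the original was in Matlab); `frac` is an assumed parameter name, and for brevity the score here rolls the moving channel cyclically and crops a fixed border rather than computing the exact overlap region.

```python
import numpy as np

def align_exhaustive(ref, mov, frac=0.05):
    """Find the (dy, dx) shift of `mov` that best matches `ref` by
    exhaustively scoring every translation within +/- frac of each
    dimension (5% each way, i.e. a 10% span)."""
    h, w = ref.shape
    ry, rx = int(h * frac), int(w * frac)
    best, best_shift = np.inf, (0, 0)
    for dy in range(-ry, ry + 1):
        for dx in range(-rx, rx + 1):
            shifted = np.roll(np.roll(mov, dy, axis=0), dx, axis=1)
            # Ignore the rolled-around border rows/columns when scoring,
            # and normalize the SSD by the number of pixels compared.
            diff = shifted[ry:h - ry, rx:w - rx] - ref[ry:h - ry, rx:w - rx]
            score = np.sum(diff ** 2) / diff.size
            if score < best:
                best, best_shift = score, (dy, dx)
    return best_shift
```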

This basic scheme proved simple enough, but failed to handle the sharp contrasts at the borders of the images. Some images in particular either had artifacts from fading of the photographic plates, or simply had borders from partial exposure. Nevertheless, image quality on the smaller photos was quite good overall:

Vase:

Painting against the window:

Bridge:

Czar:

Lady on the balcony:

Waterside town:



In the following two images, cropping the borders by fixed percentage widths could not fully eliminate their effect on the SSD algorithm. It appears as though the photos were taken slightly off-axis. Another potential issue is the high dynamic range of the pixel data in these pictures; errors in one area of a picture may be amplified, falsely keying an incorrect translation as the correct one.


Railroad cars:


Greenery:




Handling Larger Images - Image Pyramid Scheme

The second phase of this assignment was to handle very large images without having to manually process the entire image at full resolution. Instead, the image is down-sampled so that a quick, coarse estimate of the offset can be obtained. This offset is then scaled up and refined at the next finer level, repeating until a good match is found for the original image. I chose to implement this iteratively as a loop, carrying the result of each iteration into the next.

Due to the way I organized this code, a natural thing to investigate was how much downsampling one could get away with before the base levels of the pyramid produced unreliable results. There are two approaches to using the pyramid efficiently, each with its own drawbacks. One involves using a longer sequence of scales starting deeper in the pyramid (i.e. from a very small resolution up to a bigger intermediate one), and the other involves iterating over only slightly downsized images for a shorter depth in the pyramid of scales. I chose a factor of 2 between image pyramid slices.
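The iterative coarse-to-fine loop with a factor of 2 between slices can be sketched roughly as below. This is a Python/NumPy illustration of the scheme, not the original Matlab code; the function names, the `coarsest`/`finest`/`radius` parameters, and the simple 2x2 block-average downsampling are all my assumptions.

```python
import numpy as np

def downsample2(img):
    """Average 2x2 blocks: one factor-of-2 pyramid step."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(ref, mov, coarsest=5, finest=3, radius=2):
    """Coarse-to-fine alignment written as a loop.

    `coarsest` and `finest` are pyramid depths (powers of 2): e.g.
    coarsest=5, finest=3 corresponds to scaling from 1/32nd up to 1/8th.
    Returns the estimated (dy, dx) shift at full resolution.
    """
    # Build the pyramid levels, finest first: levels[d] is at scale 1/2^d.
    levels = [(ref, mov)]
    for _ in range(coarsest):
        r, m = levels[-1]
        levels.append((downsample2(r), downsample2(m)))

    dy, dx = 0, 0
    for depth in range(coarsest, finest - 1, -1):
        r, m = levels[depth]
        # Double the estimate from the coarser level, then refine it
        # with a small +/- radius search around that starting point.
        dy, dx = 2 * dy, 2 * dx
        best, step = np.inf, (dy, dx)
        for ddy in range(-radius, radius + 1):
            for ddx in range(-radius, radius + 1):
                shifted = np.roll(np.roll(m, dy + ddy, axis=0),
                                  dx + ddx, axis=1)
                diff = shifted - r
                score = np.sum(diff ** 2) / diff.size
                if score < best:
                    best, step = score, (dy + ddy, dx + ddx)
        dy, dx = step
    # Scale the estimate from the finest level searched back up to
    # full resolution.
    return dy * 2 ** finest, dx * 2 ** finest
```

Because each level only searches a small radius around the doubled coarse estimate, the total work is far less than an exhaustive search at full resolution.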

Scaled from 1/32nd to 1/8th


Scaled from 1/16th to 1/8th


Scaled only at 1/8th


Scaled from 1/8th to 1/4th






Railroad track and car:
Scaled from 1/32nd to 1/8th


Scaled from 1/16th to 1/8th


Scaled only at 1/8th


Scaled from 1/8th to 1/4th






Woman on sofa:
Scaled from 1/64th to 1/16th


Scaled from 1/32nd to 1/8th


Scaled from 1/16th to 1/8th


Scaled only at 1/8th






Tea-set:
Scaled from 1/32nd to 1/8th


Scaled from 1/16th to 1/8th


Scaled only at 1/8th


Scaled from 1/8th to 1/4th


Of the four options I tried on these images, the fastest was the 1/8th-only scaling, because it ran at a single level. More surprisingly, it performed almost as well as the 1/8th-to-1/4th image slices. As previously stated, the tradeoff favored greater speed, but it is also interesting to note that the algorithm produced slightly worse output when the largest image slices were used. My guess is that a lot of the artifacts in the image were shrunk away at coarse scales, but form a large enough part of the larger slices to throw off the SSD calculations. Another consideration is that borders belonging to the plates were also minimized at the lower resolutions.