Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Overview

In this project we are given three images that represent the red, green, and blue channels of a single color image. Our job is to align the three images with one another and combine them to restore the original color image.

Approach

To align the images, we want to find a translation of one image that makes it most similar to a second one. To judge similarity, we have to define a metric that measures how alike two images are. In this case, we simply used the sum of squared differences (SSD) over the values at each pixel. Squaring each difference keeps positive and negative differences from cancelling out. We assume that the shift between images is relatively small, so we only search within a 30x30 pixel window. As we shift, we keep track of the best SSD value seen so far. The best alignment is the one for which the two images have the smallest SSD across all their pixel values. Since images are represented as matrices of pixel values, we just have to subtract the two matrices, square the individual values, and sum them up.
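The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `ssd` and `align_ssd`, the `radius=15` default (offsets from -15 to 15 on each axis, approximating the 30x30 window), and the use of `np.roll` (which wraps pixels around the image edges) are all assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align_ssd(moving, reference, radius=15):
    """Return the (dy, dx) shift of `moving` that minimizes SSD against `reference`.

    radius=15 is an assumed value: offsets in [-15, 15] are tried on each axis.
    """
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps pixels around the edges; a simplification for this sketch.
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ssd(shifted, reference)
            if score < best_score:  # keep the best SSD seen so far
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Calling `align_ssd(green, blue)` would return the row and column shift to apply to the hypothetical `green` array so that it lines up with `blue`; the same call on the red channel gives the second shift.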

Cropping

After the first round of testing, we noticed that each of the R, G, and B images has a differently sized border around the actual image of interest. Since these borders were essentially zero-valued, they introduced significant noise into the SSD values. Therefore, before aligning the images, we cropped out the borders.
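The write-up does not say how the borders were located, so the following is only one simple possibility: trimming a fixed fraction from every side before computing the SSD. The helper name `crop_border` and the `frac=0.1` default are hypothetical.

```python
import numpy as np

def crop_border(img, frac=0.1):
    """Trim a fixed fraction of the height and width from every side.

    frac=0.1 is an illustrative assumption, not a value from the write-up.
    """
    h, w = img.shape[:2]
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]
```

A fixed fraction is crude because each channel has a differently sized border, but it keeps most of the near-zero border pixels out of the similarity score.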

Edges as Features

After implementing the simple solution, we realized that some of the images were not aligning well because their pixel intensities were drastically different across the three channels. Therefore, using just the brightness values was not the best choice. We decided to use edge features instead of brightness for our calculation. We used the Canny edge detector to generate the edge features and ran the SSD alignment approach described above on them. One small optimization was to use XOR instead of the strict SSD: since an edge map contains only 1s and 0s, the squared difference at a pixel is 1 exactly when the two values differ, which is what XOR computes.
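As a small sketch of the XOR scoring, assume the edge maps are already available as boolean NumPy arrays (the Canny step itself is omitted here, and the name `xor_score` is hypothetical):

```python
import numpy as np

def xor_score(edges_a, edges_b):
    """Count the pixels where two boolean edge maps disagree.

    For 0/1 data this equals the SSD, because (a - b)**2 is 1 exactly
    when a and b differ and 0 otherwise.
    """
    return int(np.count_nonzero(np.logical_xor(edges_a, edges_b)))
```

This score can replace the SSD inside the same shift search; a lower count means the edges overlap better.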

Image Pyramid

The above approach assumes that the images are small enough for the 30x30 pixel window to cover the best alignment. For a large image this does not work, since the best alignment could be much further away. Thus, we use a recursive image pyramid approach. At each recursive call, we Gaussian blur the image and shrink it to half its size. We then align the smaller image and use that result as the starting point for aligning the bigger image. Note that when each recursive call returns, the alignment vector needs to be doubled to account for the fact that the bigger image is twice as large.
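The recursion can be sketched as below. Several details are assumptions made for illustration rather than the project's actual choices: the function names, the `min_size=64` stopping size, the refinement radius of 2 pixels at the larger levels, and the use of 2x2 block averaging in place of a Gaussian blur followed by shrinking. As in a plain exhaustive search, `np.roll` wraps pixels around the edges.

```python
import numpy as np

def search(moving, reference, center=(0, 0), radius=15):
    """Exhaustive SSD search over shifts within `radius` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))  # wraps at edges
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks.

    This is a stand-in for the Gaussian blur and shrink step, not the same filter.
    """
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(moving, reference, min_size=64, radius=15):
    """Recursively estimate the (dy, dx) shift, coarse to fine."""
    moving = np.asarray(moving, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if min(reference.shape) <= min_size:
        # Smallest level: the full search window is cheap enough here.
        return search(moving, reference, radius=radius)
    coarse = pyramid_align(downsample(moving), downsample(reference),
                           min_size, radius)
    # Double the coarse estimate because this level is twice as large.
    start = (2 * coarse[0], 2 * coarse[1])
    # Refine with a small search around the doubled estimate (radius 2 is assumed).
    return search(moving, reference, center=start, radius=2)
```

Because each level only refines the doubled estimate from the level below, the total work stays small even when the true shift at full resolution is far larger than the original search window.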

Results

00029u 00087u 00128u 00458u 00737u 00822u 00892u 01043u 01047u