15-663 Project 1

Pratch Piyawongwisal (ppiyawon)

Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Overview

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. In order to do this, you will need to extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image.

Combining 3 color-channel images

To align the three color-channel images, I search for the displacement that gives the best similarity score between channels. I first tried the sum of squared differences (SSD) as the metric, but it did not yield good results, so I switched to normalized cross-correlation (NCC), which proved to be a better choice.

The next step is to make the algorithm scale to larger images. For small images, a brute-force search over a displacement window of [-20, 20] pixels gives decent results. For large images, brute-force search is too slow, so I build a Gaussian pyramid of the image and estimate the displacement vector in a coarse-to-fine fashion: the best displacement is first found at the coarsest scale, then the estimate is refined as we move down the pyramid to finer scales. Each time the image doubles in size, the search window is halved (down to a minimum displacement of 2 pixels), since computing the metric on large images is much more expensive. In this project I use a pyramid of height 4. At the coarsest level, the displacement window is set to 16% of the image's smaller dimension, under the assumption that the true displacement does not exceed that fraction of the image. This yields visually correct alignments at a reasonable runtime. The coarse-to-fine procedure is sketched in the code below.

Finally, I ran into a problem where the black borders of the plates interfered with the similarity metric. To fix this, I crop the borders and compute the metric only on the central (i/2) × (j/2) region of an i × j image.
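Below is a minimal sketch of this coarse-to-fine search, assuming NumPy and scikit-image are available. The function names (ncc, best_shift, align_pyramid) and the exact window schedule are illustrative, not the project's actual code.

import numpy as np
from skimage.transform import rescale

def crop_center(img):
    """Keep only the central (h/2) x (w/2) region so dark borders don't skew the metric."""
    h, w = img.shape
    return img[h // 4: h // 4 + h // 2, w // 4: w // 4 + w // 2]

def ncc(a, b):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return (a * b).sum() / denom if denom > 0 else 0.0

def best_shift(moving, fixed, window):
    """Brute-force search over displacements in [-window, window]^2, maximizing NCC."""
    best, best_dy, best_dx = -np.inf, 0, 0
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(crop_center(np.roll(moving, (dy, dx), axis=(0, 1))),
                        crop_center(fixed))
            if score > best:
                best, best_dy, best_dx = score, dy, dx
    return best_dy, best_dx

def align_pyramid(moving, fixed, levels=4, window_frac=0.16, min_window=2):
    """Coarse-to-fine alignment: estimate the shift at the coarsest level,
    then refine it as the image doubles in size, halving the search window."""
    coarse_scale = 2 ** (levels - 1)
    window = max(min_window, int(window_frac * min(moving.shape) / coarse_scale))
    dy, dx = 0, 0
    for level in range(levels - 1, -1, -1):
        s = 1.0 / (2 ** level)
        m = rescale(moving, s, anti_aliasing=True)
        f = rescale(fixed, s, anti_aliasing=True)
        # Apply the displacement carried over from the coarser level, then refine it.
        m = np.roll(m, (dy, dx), axis=(0, 1))
        ddy, ddx = best_shift(m, f, window)
        dy, dx = dy + ddy, dx + ddx
        if level > 0:
            dy, dx = 2 * dy, 2 * dx
            window = max(min_window, window // 2)
    return dy, dx

Assuming the blue plate is used as the reference channel, calling align_pyramid(g, b) and align_pyramid(r, b) would produce per-channel offsets like those reported in the results below.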

Results

Small Images (GR alignment)


G:(5, 1) R:(12, 1)

G:(6, 2) R:(14, 4)

G:(2, 3) R:(5, 5)

G:(1, 1) R:(4, 2)

G:(6, 1) R:(12, 0)

G:(1, 2) R:(4, 3)

G:(1, -1) R:(13, -1)

G:(4, 1) R:(9, -1)

G:(2, 0) R:(6, 0)

Large Images (GR alignment)


G:(42, 6) R:(87, 32)

G:(24, 20) R:(71, 33)

G:(48, 39) R:(108, 56)

G:(57, 25) R:(125, 33)

G:(38, 17) R:(90, 35)

G:(14, 6) R:(49, 14)

G:(16, 2) R:(12, 4)

G:(-15, 10) R:(11, 18)

G:(35, 25) R:(52, 38)

Additional Images from Collection (GR alignment)


G:(52, -17) R:(120, -35)

G:(44, -19) R:(103, -35)