15463 - Project 1 - Paul Miller

I started with a basic implementation based on the built-in normxcorr2 function. This worked pretty well, even on the large images. (It finished 00029u.tif, the coastal scene, in 34 seconds.) However, there is no way to instruct normxcorr2 to search only a certain range, so (as far as I know) it cannot be used with the pyramid optimization, in which aligning the images at a lower resolution narrows the search area at the higher resolution.
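
To illustrate what a range-limited search looks like, here is a minimal sketch in plain Python (not the MATLAB code used for this project). The names ncc, overlap, and best_offset are hypothetical, the two images are assumed to be equal-sized 2-D lists of numbers, and the search simply scores every offset within `radius` of `center`.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-sized 2-D lists."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    if not fa:
        return 0.0  # no overlap at this offset
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) * (x - ma) for x in fa) *
                    sum((y - mb) * (y - mb) for y in fb))
    return num / den if den > 0 else 0.0  # flat patches score 0

def overlap(base, moving, dr, dc):
    """Overlapping regions when moving's upper-left corner sits at (dr, dc) in base."""
    h, w = len(base), len(base[0])
    r0, r1 = max(0, dr), min(h, h + dr)
    c0, c1 = max(0, dc), min(w, w + dc)
    a = [base[r][c0:c1] for r in range(r0, r1)]
    b = [moving[r - dr][c0 - dc:c1 - dc] for r in range(r0, r1)]
    return a, b

def best_offset(base, moving, center=(0, 0), radius=2):
    """Score only the offsets within `radius` of `center`; return (offset, score)."""
    best_score, best = -2.0, center
    for dr in range(center[0] - radius, center[0] + radius + 1):
        for dc in range(center[1] - radius, center[1] + radius + 1):
            a, b = overlap(base, moving, dr, dc)
            score = ncc(a, b)
            if score > best_score:
                best_score, best = score, (dr, dc)
    return best, best_score
```

In a pyramid scheme, `center` would come from the offset found at the coarser level, so only a handful of offsets need to be scored at full resolution.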

So I created another, much lengthier implementation, which recursively searches smaller and smaller images. I experimented with different parameters for the recursion, and found that scaling the images to 1/5 size until the smallest image was no more than 200 px wide was a good balance between speed and reliability. I found that making the images too small resulted in bad alignments, possibly because there isn't enough data left to match on. I also found that letting (sigma) = (filter size) / 4 worked pretty well in my image scaling function.
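
The parameter choices above can be sketched roughly as follows. This is an illustrative Python sketch, not my MATLAB code: the function names are hypothetical, the 3700 px width in the usage note is a made-up example, and the Gaussian is shown in 1-D only.

```python
import math

def pyramid_widths(width, factor=5, max_coarsest=200):
    """Image widths per pyramid level, finest first.

    Keeps shrinking by `factor` until the coarsest level is no wider
    than `max_coarsest` pixels.
    """
    widths = [width]
    while widths[-1] > max_coarsest:
        widths.append(max(1, widths[-1] // factor))
    return widths

def gaussian_kernel(size):
    """1-D Gaussian with sigma = size / 4, normalized to sum to 1."""
    sigma = size / 4
    mid = (size - 1) / 2
    k = [math.exp(-((i - mid) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(k)
    return [v / total for v in k]

def upscale_offset(offset, factor=5):
    """Map an offset found at a coarse level to the next finer level."""
    return (offset[0] * factor, offset[1] * factor)
```

For a hypothetical 3700 px wide image, `pyramid_widths(3700)` gives `[3700, 740, 148]`, so the search would start on the 148 px image and each result would be scaled by 5 to seed the next level.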

Unfortunately, after tweaking these parameters as well as I could, the second implementation was still much slower than the first; the second finished 00029u.tif in 56 seconds. This may be because the built-in normxcorr2 is faster than my own equivalent function, and because the first implementation avoided using any loops. My second implementation, using the prescribed techniques, works fine. But since both generate practically the same output, I used my first implementation to generate these images.

My program failed on one image, 00757v.jpg. This image has an unusually large blue sky, which creates a large region in the blue channel that is very different from the same region in the other channels. This idea is supported by the fact that the green and red channels line up correctly, but the blue channel is way off in the vertical direction. However, since I was using normalized cross-correlation (rather than SSD or zero-mean), I would expect the relatively flat, featureless sky not to be a problem. 00757v also has relatively severe noise on the left of the image, but since I crop out an eighth-width area from all sides before matching, I would expect this not to be a problem either.
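
The border crop mentioned above can be sketched as follows. This is an illustrative Python version with a hypothetical function name, and it assumes that "eighth-width" means a margin of one eighth of the image width removed from each of the four sides.

```python
def crop_border(img, fraction=8):
    """Drop a margin of (width // fraction) pixels from all four sides.

    `img` is a 2-D list of pixel values (rows of equal length).
    """
    h, w = len(img), len(img[0])
    m = w // fraction  # assumed margin: one eighth of the width
    if 2 * m >= h or 2 * m >= w:
        raise ValueError("image too small for this margin")
    return [row[m:w - m] for row in img[m:h - m]]
```

Only the cropped interior is then used for matching, so noise near the plate edges does not contribute to the correlation score.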

For each composition, I treated the red channel as the "bottom" and positioned the green and blue channels over it. The green and blue channels each have a row and column offset indicating their position, where (0,0) means the upper left corner of the green/blue channel is aligned with the upper left corner of the red channel.

image   green row offset   green column offset   blue row offset   blue column offset
        -53                -20                   -91               -35
        -60                -16                   -108              -56
        -6                 1                     -10               0
        -18                -14                   -52               -39
        -44                -28                   -88               -33
        -35                -9                    -50               -15
(failed)
        -4                 -3                    207               -14
        -69                -9                    -126              -34
        -8                 -1                    -13               -2
        -4                 -2                    -5                -4
        -27                -2                    -43               -5
        -4                 0                     -7                -1
        -13                0                     -14               0
        -4                 -1                    -5                -3
        -27                -9                    -12               -18
        -48                -15                   -72               -34
        -7                 -1                    -13               -2
        -9                 -3                    -15               -5
        -70                4                     -136              -14
        -55                -16                   -93               -42
        -63                -17                   -111              -40
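
As a sketch of how offsets in this convention could be applied when building a composite (illustrative Python with hypothetical function names, not the code I actually ran): each channel is a 2-D list, an offset (dr, dc) places the channel's upper left corner at row dr, column dc of the red channel, and pixels with no source data are filled with 0.

```python
def shifted(channel, dr, dc, fill=0):
    """Place `channel` with its upper-left corner at (dr, dc); pad with `fill`."""
    h, w = len(channel), len(channel[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(max(0, dr), min(h, h + dr)):
        for c in range(max(0, dc), min(w, w + dc)):
            out[r][c] = channel[r - dr][c - dc]
    return out

def compose(red, green, blue, g_off, b_off):
    """Stack the channels into (R, G, B) tuples, with red as the fixed base."""
    g = shifted(green, *g_off)
    b = shifted(blue, *b_off)
    return [[(red[r][c], g[r][c], b[r][c]) for c in range(len(red[0]))]
            for r in range(len(red))]
```

With the first row of the table, for example, the call would be `compose(red, green, blue, (-53, -20), (-91, -35))`.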