Images of the Russian Empire
Computational Photography: Project 1 (and Project 1g)
Matt Mukerjee
(mukerjee at cs)
Overview:
In this project I build a program that automatically approximates
color images from the original Prokudin-Gorskii plates while
minimizing visual artifacts. For background, each original plate
contains three separate images (corresponding to the B, G, and R color
channels) of a scene from the Russian Empire in the very early 1900s.
Here is a small example:
To produce a color image, the three channel images must first be
aligned with one another. I search for the alignment that best
satisfies a correctness metric (a measure of how well two channels
match) across the color channels.
Initially, the sum of squared differences (SSD) was used, but it did not
give good results. Normalized cross-correlation (NCC) was implemented
instead and gave much better results.

Once a clear correctness metric was defined, the question of performance
came into play. On smaller images, a brute-force search over a small
window performs decently. In this project, a window of [-15, 15] was
searched in both the x and y directions (we assume a translation model
is sufficient to model the image discrepancies). This method does not
scale well as image size (and, proportionally, window size) increases:
the number of candidate displacements grows with the square of the
window size, and each candidate requires comparing more pixels. A
smarter technique is therefore needed to reduce the asymptotic runtime
and improve performance.

Instead of brute-force searching for the displacement vector at full
resolution, we build a Gaussian pyramid and estimate the displacement
vector across its scales. We first find the best displacement vector at
the coarsest (smallest) scale, then successively correct this estimate
at each larger scale in the pyramid. This rests on a useful
observation: a window of size [-k, k] over an image at scale 'n' covers
exactly the same part of the image as a [-2k, 2k] window over that same
image at scale '2n'.
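To make this observation concrete, here is a small sketch in plain Python. The helper names are hypothetical and it assumes each pyramid level halves the image in both dimensions. It shows the extent, in pixels of the larger image, spanned by a window searched on the smaller image, and how many candidate displacements each window contains.

```python
def covered_range(k, downsample_factor):
    """Extent, in pixels of the larger image, spanned by a [-k, k]
    window searched on an image shrunk by `downsample_factor`."""
    return (-k * downsample_factor, k * downsample_factor)


def candidates(k):
    """Number of (dx, dy) candidates in a [-k, k] x [-k, k] window."""
    return (2 * k + 1) ** 2


k = 15
print(covered_range(k, 2))  # (-30, 30): same extent as a [-2k, 2k] window
print(candidates(k))        # 961 candidates at the smaller scale
print(candidates(2 * k))    # 3721 candidates for the same extent at the larger scale
```

Each candidate at the smaller scale also compares only a quarter as many pixels, so the saving per level is larger than the candidate counts alone suggest.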
This implies that moving towards smaller scales within the Gaussian
pyramid allows us to limit the window we need to search over. In this
vein, I chose to go up the pyramid 4 times, searching over a window of
size [-6, 6] in both x and y. This led to seemingly correct alignment
while taking a reasonable amount of time.

Finally, one problem I ran into was that the borders of the channel
images tend to distort the similarity metric, as they generally don't
have a matching pixel in the destination image. A simple solution is to
compute the metric on a centered i/2 x j/2 patch sampled from each
i x j image. This excludes the border entirely, eliminating the problem.
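
The pieces described above (NCC, window search, coarse-to-fine pyramid, and center crop) can be put together roughly as follows. This is a minimal sketch rather than the original implementation, and it rests on several assumptions: it uses NumPy, all function names are hypothetical, a separable [1, 4, 6, 4, 1]/16 binomial kernel stands in for the Gaussian blur, shifts are applied with a wrap-around roll (harmless here because the wrapped pixels fall outside the centered patch for small shifts), and offsets are returned as (row, column), which may not match the ordering used in the results below. The defaults levels=4 and k=6 are one reading of "go up the pyramid 4 times" with a [-6, 6] window.

```python
import numpy as np


def center_crop(img):
    """Keep the centered (i/2 x j/2) patch of an (i x j) image."""
    i, j = img.shape
    return img[i // 4 : i // 4 + i // 2, j // 4 : j // 4 + j // 2]


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def search(ref, img, center, k):
    """Brute-force search over offsets within [-k, k] of `center`,
    scoring each candidate by NCC on the centered patches."""
    ref_patch = center_crop(ref)
    best_score, best = -np.inf, center
    for dy in range(center[0] - k, center[0] + k + 1):
        for dx in range(center[1] - k, center[1] + k + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))  # wrap-around shift
            score = ncc(ref_patch, center_crop(shifted))
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best


def downsample(img):
    """Blur with a binomial kernel (assumed stand-in for a Gaussian),
    then keep every other row and column."""
    kernel = np.array([1, 4, 6, 4, 1], dtype=float) / 16

    def blur(v):
        return np.convolve(np.pad(v, 2, mode="edge"), kernel, mode="valid")

    img = np.apply_along_axis(blur, 0, img)
    img = np.apply_along_axis(blur, 1, img)
    return img[::2, ::2]


def align(ref, img, levels=4, k=6):
    """Coarse-to-fine estimate of the (dy, dx) shift aligning img to ref."""
    if levels == 0:
        return search(ref, img, (0, 0), k)
    # Estimate at half resolution, then double it and refine locally.
    dy, dx = align(downsample(ref), downsample(img), levels - 1, k)
    return search(ref, img, (2 * dy, 2 * dx), k)
```

For example, `align(blue, green)` and `align(blue, red)` would give the shifts to apply (with `np.roll`) to the green and red channels before stacking them with the blue channel, assuming blue is used as the reference.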
Results:
Small Images: (Green alignment, Red alignment)
(3,3) (4,5)
(4,1) (9,-1)
(6,6) (11,8)
(2,3) (5,5)
(6,1) (12,1)
(1,2) (4,3)
(2,0) (6,0)
(1,-1) (13,-1)
(1,1) (4,2)
(6,2) (14,4)
Large Images: (Green alignment, Red alignment)
(38,19) (91,37)
(48,40) (108,57)
(35,25) (52,37)
(15,6) (49,14)
(57,25) (125,34)
(16,3) (45,5)
(48,13) (113,19)
(-15,10) (11,19)
(42,32) (111,59)
(48,13) (113,19)
Additional Images: (Green alignment, Red alignment)
(72,-3) (143,-18)
(75,-7) (148,-28)