About    Ricardo Cabral (rcabral).
   15-862 Computational Photography, Fall 2010.
   Project 1.

› Project description

This project consisted of colorizing a series of negative plates from the Library of Congress collection. These photographs were taken by Prokudin-Gorskii in a fashion quite peculiar for the time: each picture was taken three times, each through a different color filter (red, green and blue). The goal of this project was to automatically align the three layers and produce a colorized image.


› What worked

We assume, as the assignment stated, that the registration only has to handle translation displacements.
The final algorithm proceeds as follows:
1) Split the image in 3.
2a) JPEG images: Register using a single grid search.
2b) TIFF images: Register using grid search and a scale pyramid.
3) Concatenate the channels into a single picture.
BW1) Perform (Photoshop-like) auto levels.
BW2) Perform (Photoshop-like) auto color.
BW3) Crop the image.
For Step 1) we first tried splitting the image row indices into equal thirds, but this often left one channel containing a bit of the next/previous one, because the scanning process makes the top and bottom white bars asymmetric. To avoid this, we run an edge detection algorithm ('Sobel') and crop based on the innermost first or second maxima for the top and bottom halves of the image. This works because the edges caused by the picture limits are at around the same position in the 3 frames, so an edge should be detected in the majority of elements of the column of the image and hence have a bigger average value than other columns.
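The idea of averaging an edge response along a line of pixels can be sketched as below. This is a simplified illustration, not the code used for the project: it replaces the Sobel operator with a plain difference between consecutive rows, assumes the bars to detect are horizontal, and simply picks the strongest edge in each half rather than the "innermost first or second maximum". The names `row_edge_profile` and `content_rows` are made up for this example.

```python
def row_edge_profile(img):
    """Mean absolute difference between each row and the next one.

    img is a list of rows of grey levels. Entry y of the result measures
    the edge lying between row y and row y + 1.
    """
    h, w = len(img), len(img[0])
    return [sum(abs(img[y + 1][x] - img[y][x]) for x in range(w)) / w
            for y in range(h - 1)]


def content_rows(img):
    """Return (first, end) so that rows first..end-1 lie between the bars.

    Assumption for this sketch: the bar/picture transition is the
    strongest edge in the top half and in the bottom half of the image.
    """
    profile = row_edge_profile(img)
    mid = len(profile) // 2
    top = max(range(0, mid), key=lambda y: profile[y])
    bottom = max(range(mid, len(profile)), key=lambda y: profile[y])
    return top + 1, bottom + 1


# Usage: 3 white rows, 5 dark "picture" rows, 2 white rows.
white, dark = [1.0] * 4, [0.2] * 4
img = [white] * 3 + [dark] * 5 + [white] * 2
print(content_rows(img))  # prints (3, 8)
```

Because the bars are asymmetric (3 rows on top, 2 at the bottom in this toy image), an equal split would be off, while the edge profile finds each transition independently.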

For Step 2) we use a [-15,15] window for the JPEG images and a [-5,5] window along 6 pyramid levels for the high-resolution TIFFs. It is worth noting that we don't use a circular shift but instead register only the minimum subset of the image (i.e., the image minus a margin defined by the maximum translation possible at each level), thus avoiding spurious pixel values that might interfere with the cost function.
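A minimal sketch of the single-level grid search follows, assuming an SSD cost (the cost function mentioned in the "What didn't" section). It is written in plain Python for readability and is not the project's code. The names `ssd` and `grid_search` are hypothetical, and a shift (dy, dx) is defined here so that `ref[y][x]` is compared with `mov[y - dy][x - dx]`. The margin equals the search radius, so no index ever wraps around or leaves the image.

```python
import random


def ssd(ref, mov, dy, dx, margin):
    """Sum of squared differences between ref and mov shifted by (dy, dx).

    Only pixels at least `margin` away from every border are compared,
    so with margin >= max(|dy|, |dx|) every index stays inside the image.
    """
    h, w = len(ref), len(ref[0])
    total = 0.0
    for y in range(margin, h - margin):
        for x in range(margin, w - margin):
            diff = ref[y][x] - mov[y - dy][x - dx]
            total += diff * diff
    return total


def grid_search(ref, mov, radius):
    """Exhaustively try every shift in [-radius, radius]^2, keep the best."""
    best_cost, best_shift = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cost = ssd(ref, mov, dy, dx, radius)
            if best_cost is None or cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift


# Usage: a random reference channel and a copy displaced by (2, -1).
rng = random.Random(0)
n = 12
ref = [[rng.random() for _ in range(n)] for _ in range(n)]
mov = [[ref[y + 2][x - 1] if 0 <= y + 2 < n and 0 <= x - 1 < n else 0.0
        for x in range(n)] for y in range(n)]
print(grid_search(ref, mov, 3))  # prints (2, -1)
```

For the TIFF images, the same search with a smaller radius would be repeated at each pyramid level, starting from the coarsest one and doubling the estimate when moving to the next finer level.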

In Step 3), we recalculate the minimum margin possible, since the actual translation is possibly smaller than the upper bound used in the previous step. We do this so that information present in all the channels is not cropped away by an overly aggressive algorithm (even though such an algorithm would produce fewer artifacts around the edges).
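As an illustration of this margin computation, the region covered by every channel can be derived directly from the estimated shifts. The sketch below assumes that a channel shifted by (dy, dx) covers rows dy to h + dy and columns dx to w + dx of the reference frame. The function name `common_region` is made up for this example.

```python
def common_region(h, w, shifts):
    """Rows/columns covered by every channel after shifting.

    shifts holds one (dy, dx) per channel, with (0, 0) for the reference.
    Returns (top, bottom, left, right) with exclusive bottom/right.
    """
    top = max(dy for dy, _ in shifts)
    bottom = h + min(dy for dy, _ in shifts)
    left = max(dx for _, dx in shifts)
    right = w + min(dx for _, dx in shifts)
    return top, bottom, left, right


# Usage: reference channel plus two channels shifted by (3, -2) and (-5, 4).
print(common_region(100, 80, [(0, 0), (3, -2), (-5, 4)]))
# prints (3, 95, 4, 78)
```

With a search window of [-15,15], cropping by the worst-case bound would remove 15 pixels on every side, whereas here only 3 to 5 rows and 2 to 4 columns are lost.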

Bells and whistles were applied using the following rationales. Auto levels is just a basic histogram stretching (cf. [1]), but applied in HSV space and only to the saturation distribution. We do this to preserve color information (Hue) and because RGB histogram stretching was too aggressive (see next section).

Auto color (code taken from [2]) is done by transforming the image to the NTSC space (where we have access to the luminance and chrominance components of the image) and applying another histogram transformation. Both of these processes try to mimic what the corresponding features in commercial software such as Adobe Photoshop do.

Finally, auto-cropping works by detecting the portion of the image edges that is dominated by a single color (red, yellow, blue, green). We check how the average saturation (in HSV space) evolves, along columns for the left and right boundaries and along rows for the top and bottom borders, and crop at the indices where the saturation drops to normal levels.
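The saturation-only stretching used for auto levels can be sketched with Python's standard `colorsys` module. This is an illustrative simplification: it stretches linearly between the minimum and maximum saturation, whereas a Photoshop-like auto levels would probably clip a small percentile at each end. The function name `stretch_saturation` is hypothetical.

```python
import colorsys


def stretch_saturation(pixels):
    """Linearly stretch HSV saturation to [0, 1].

    pixels is a list of (r, g, b) tuples in [0, 1]. Hue and value of
    each pixel are passed through unchanged.
    """
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    s_min = min(s for _, s, _ in hsv)
    s_max = max(s for _, s, _ in hsv)
    if s_max == s_min:  # flat saturation: nothing to stretch
        return list(pixels)
    return [colorsys.hsv_to_rgb(h, (s - s_min) / (s_max - s_min), v)
            for h, s, v in hsv]


# Usage: three washed-out reds with saturations of about 0.2, 0.4 and 0.5.
faded = [(0.5, 0.4, 0.4), (0.5, 0.3, 0.3), (0.5, 0.25, 0.25)]
for rgb in stretch_saturation(faded):
    print(rgb)
```

Since only the S channel is remapped, the brightness (V) of each pixel stays the same, which is the property that motivated working in HSV rather than stretching R, G and B separately.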


› What didn't

The proposed optimization only guarantees convergence to a local minimum, even though the pyramid scheme alleviates this problem. Despite this added robustness, nothing prevents the cost function from having its global minimum in an undesired place (which is what happens for the alignment of the Red channel in image 00153v, later confirmed in its high-resolution counterpart; see LoC results). We theorize a number of possible reasons for this: 1) a slight movement of the subject/camera between frame captures; 2) the smudge at the bottom of the red channel yields high values of the cost function when it is aligned with the correct pixels, which are brighter than the smudge, so the search prefers aligning it with the bottom edge; 3) the SSD is simply not good enough for comparing colors between channels (notice the significant intensity disparity in the man's garment).

Some images (e.g., 00911u and 00564v) show a failure case for the auto-cropping. This is due to the existence of two local minima in the mean of the saturation values, caused by the black border that is common to all channels. This could have been solved by thresholding the saturation derivative instead of picking the first minimum/maximum, but doing so would add a parameter to an otherwise fully automated algorithm. Since the result in these images is under-cropping rather than over-cropping, one might argue that it can always be adjusted manually afterwards in the small number of cases where it happens.
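For reference, the saturation profile that the auto-cropping relies on might be computed as in the sketch below. It is only one possible reading of the method: it looks for the steepest drop in mean column saturation within a search window, and the window size `search` is itself an assumption of this example rather than part of the original algorithm. The function names are hypothetical.

```python
import colorsys


def column_saturation(img):
    """Mean HSV saturation of each column.

    img is a list of rows, each row a list of (r, g, b) tuples in [0, 1].
    """
    h, w = len(img), len(img[0])
    return [sum(colorsys.rgb_to_hsv(*img[y][x])[1] for y in range(h)) / h
            for x in range(w)]


def left_crop_index(profile, search):
    """Column just after the steepest saturation drop near the left edge.

    Only the first `search` columns are inspected. Returns 0 when the
    saturation never drops there.
    """
    best_x, best_drop = 0, 0.0
    for x in range(min(search, len(profile) - 1)):
        drop = profile[x] - profile[x + 1]
        if drop > best_drop:
            best_x, best_drop = x + 1, drop
    return best_x


# Usage: two pure-red border columns followed by six low-saturation columns.
row = [(1.0, 0.0, 0.0)] * 2 + [(0.5, 0.45, 0.4)] * 6
img = [row] * 4
print(left_crop_index(column_saturation(img), 4))  # prints 2
```

The right boundary can be found by running the same function on the reversed profile, and the top and bottom borders by using row means instead of column means. A black border shared by all channels has low saturation, so it produces a second transition in this profile, which is consistent with the under-cropping described above.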

Besides the methods described in the previous section, we also tried using a median filter to remove the speckles in the image, performing histogram equalization [1] separately on the R, G, B channels, and using the Gray World assumption to adjust the levels of the R, G, B channels.

For the median filter, no noticeable change was present in the image, despite its considerable computational cost. At first, I thought this was due to the use of a small 3x3 window, but the same thing happened on the JPEG images, which have lower resolution. I didn't want to use a larger window, since some images (e.g., 01047u) have a lot of detail that might have been lost.

Both histogram equalization (stretching) and the Gray World assumption resulted in poor-quality images, with grain and artificial greenish coloring.


› JPEG Results

Click the pictures to see results individually.
  The running time for the whole process for JPEG images was around 1.2s on a 2.67 GHz Xeon QuadCore.

00163v
00564v
00149v
00351v
00398v
01112v
31421v (CMU???)
00125v
00153v


› TIFF Results

Click the pictures to see results in higher resolution.
  The running time for the whole process for TIFF images was around 60s on a 2.67 GHz Xeon QuadCore.

result-00911u
result-01043u
result-00458u
result-01047u
result-01194u
result-01657u
result-01861a


› LoC Additional Results

Here are additional results for images personally picked from the Library of Congress collection. Click the pictures to see results in higher resolution.
  The running time for the whole process for these images was around 60s on a 2.67 GHz Xeon QuadCore.

00132u
00388u
01262u
01346u
00369u
01375u
00221u
01266u
01366u
00153u
00403u


› Credits

Credit where credit is due:
[1] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2001.
[2] Code for MATLAB FOTOSHOP
The template was taken from OSWD and its author is Derrick Koenig - liquid.knights (at) gmail dot com
