Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Nick Vandal

Overview

Sergei Mikhailovich Prokudin-Gorskii (1863-1944) travelled across the vast Russian Empire and took color photographs of everything he saw, by recording three exposures of every scene onto a glass plate using a red, a green, and a blue filter.

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. To do this, I extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image.

Alignment Technique

I perform Normalized Cross Correlation (NCC) in the spatial domain over a limited range of shifts. My NCC code is implemented as a mex function for maximum performance, and implements standard spatial correlation using integral images (summed-area tables) to compute the normalization coefficient, as suggested in "Fast Normalized Cross-Correlation". Creating mex files is relatively straightforward -- the only potential pitfall is remembering that MATLAB (like Fortran and many numerical libraries such as FFTW and BLAS) stores matrices in column-major order rather than the standard C/C++ row-major order. For maximum performance, I have multithreaded the mex function using OpenMP, a shared-memory multiprocessing API that is widely supported by gcc, Visual Studio, and the Intel compilers. Parallelizing the four nested for loops required for spatial correlation took only a single compiler directive:
#pragma omp parallel for private(...) schedule(static,1)

where private(...) lists all variables private to the code block (i.e., nested loop indices, intermediate sums, etc.).
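The integral-image trick mentioned above can be sketched in a few lines of Python (illustrative only; the actual implementation is the C mex function). Once the summed-area table is built in one pass, the sum over any window, and hence the per-window mean and variance needed for the NCC denominator, comes out in constant time:

```python
import numpy as np

def window_sums(img, h, w):
    """Sum of every h-by-w window of img, O(1) per window after one pass.
    Applied to img and img**2, this yields the per-window means and variances
    that normalize the correlation coefficient without rescanning each window."""
    # Integral image with a leading zero row/column for uniform indexing.
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    sat[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    # Inclusion-exclusion over the four corners of each window.
    return sat[h:, w:] - sat[:-h, w:] - sat[h:, :-w] + sat[:-h, :-w]
```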

Usage for my NCC mex file is [correlationPlane] = spatial_subsearch(data,template,ROI), where ROI is an optional vector of the form [yMin xMin yMax xMax] that limits the search area (where correlation is actually performed). For this project, I found that a range of ±25 (vertically and horizontally) from the center provides good results for the single-scale implementation with acceptable runtimes on the small scale images. For the multiscale pyramid version of the code, ±5 gives good alignment and acceptable runtimes even for the large scale images.

This NCC mex file takes care of the most compute-intensive portion of the project; all further modules are written in MATLAB. For all implementations, I crop 10% from the template images (the green and red channels) to de-emphasize the mismatched border regions of the negatives. For the single-scale implementation, I find the peak magnitude of the correlation plane and subtract the size of the template to extract the {x,y} offsets of the green and red channels relative to the blue reference. For the multiscale pyramid implementation, I make recursive calls to my getOffset(template,A,center,SA,level) function, passing half-scaled versions of template and A and decrementing the level variable with each recursive call. The base case is reached when level has been decremented all the way to zero, or when size(A)/4 < size(SA) (i.e., the search area is greater than one fourth of the total reference image; this serves as a good heuristic to prevent the template from drifting too far from center after the loss of resolution). As the function returns from each level of recursion, it updates its estimate (scaled by 2) of where the offset should be, and uses this as the center from which to perform NCC over a limited range at the next finer level.
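The coarse-to-fine recursion can be sketched as follows (a hedged Python analogue of getOffset, not the MATLAB code itself: simple decimation stands in for proper half-scaling, circular np.roll shifts stand in for windowed correlation, and the size check is my own stand-in for the size(A)/4 < size(SA) heuristic):

```python
import numpy as np

def ncc(a, b):
    """Global normalized cross-correlation between two equal-size images."""
    a = a - a.mean()
    b = b - b.mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / d if d else 0.0

def search(ref, ch, center, radius):
    """Exhaustive NCC search for the best (dy, dx) within +/-radius of center."""
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = ncc(ref, np.roll(np.roll(ch, dy, 0), dx, 1))
            if s > best_score:
                best, best_score = (dy, dx), s
    return best

def get_offset(ref, ch, level, radius=5):
    """Coarse-to-fine offset estimate; each recursion halves the resolution."""
    # Base case: out of levels, or image too small relative to the search
    # radius (a crude stand-in for the size(A)/4 < size(SA) heuristic).
    if level == 0 or min(ref.shape) < 8 * radius:
        return search(ref, ch, (0, 0), radius)
    # Recurse on decimated images (2x subsampling in lieu of proper
    # half-scaling), then refine around the doubled coarse estimate.
    dy, dx = get_offset(ref[::2, ::2], ch[::2, ::2], level - 1, radius)
    return search(ref, ch, (2 * dy, 2 * dx), radius)
```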

Automatic Cropping

In an attempt to automatically crop the border of the aligned images, I noted the abrupt color changes that normally align with linear boundaries. To exploit this, I first convert the color image to a variety of color spaces (LAB, HSV, CMYK) and perform Canny edge detection in the channels where the boundary appears more distinct than in RGB (especially the A and B channels of LAB, and the S channel of HSV). For each channel, the binary edge image is fed into a Hough transform to extract lines. From the Hough peaks corresponding to lines at angles of 90° ± 3° or 0° ± 3°, I obtain approximately vertical and horizontal lines (color discontinuities) and select the one at the minimum position within the upper quintile (spatially) and the one at the maximum position within the lower quintile (spatially) of the image. These [xmin,xmax,ymin,ymax] values from each color channel are then combined into a bounding box used to crop the image.
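The "strongest discontinuity near each border" idea can be sketched in much-simplified form. In this Python sketch I deliberately replace Canny edge detection and the Hough transform with raw row/column gradient energy, so it only illustrates the bound-selection step, not the actual line-detection pipeline:

```python
import numpy as np

def crop_bounds(channel, quintile=0.2):
    """First/last row and column to keep, found as the strongest intensity
    discontinuity inside each border quintile of a single channel.  A
    simplified stand-in for the Canny + Hough line detection: gradient
    energy summed along rows/columns replaces explicit line fitting."""
    h, w = channel.shape
    row_energy = np.abs(np.diff(channel, axis=0)).sum(axis=1)  # horizontal lines
    col_energy = np.abs(np.diff(channel, axis=1)).sum(axis=0)  # vertical lines
    top = int(np.argmax(row_energy[: int(h * quintile)])) + 1
    bot = int(np.argmax(row_energy[int(h * (1 - quintile)):])) + int(h * (1 - quintile))
    left = int(np.argmax(col_energy[: int(w * quintile)])) + 1
    right = int(np.argmax(col_energy[int(w * (1 - quintile)):])) + int(w * (1 - quintile))
    return top, bot, left, right
```

Per-channel bounds from a sketch like this would then be combined (e.g., by taking the tightest box) before cropping.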

Small Scale Images

[Image gallery: Original / Aligned Result / Cropped Result]

Large Scale Images

[Image gallery: Cropped Result]