Sergei Mikhailovich Prokudin-Gorskii (1863 - 1944) was a Russian photographer who won the Tsar's special permission to travel across the vast Russian Empire and take color photographs of the various facets of daily life. However, since there was no ready-made equipment for displaying color photographs, he recorded three exposures of every scene through three filters: red, green, and blue. These exposures were recorded on glass plates, which eventually found their way to the Library of Congress. This project takes the separate exposures and attempts to combine them into the color photographs they were intended to be.
The main problem to overcome was aligning the exposures correctly. This could be done by hand, but we want to automate the process. Two approaches were considered for measuring the "closeness" of the separate channels: the sum of squared differences (SSD) and normalized cross-correlation (NCC). In theory, NCC should perform better than SSD because the color channels have different lighting levels. After trial and error, however, there was little discernible difference between the two; SSD ran faster than NCC, so SSD was the algorithm of choice. Essentially, using the blue channel as the base, the red and green exposures were shifted within a window of some width and height and scored with SSD. The shift giving the smallest score was kept, and the three exposures were stacked together using the blue and the now-shifted red and green channels.
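The exhaustive SSD search described above can be sketched as follows. This is a minimal Python/NumPy analogue of the original (which appears to use MATLAB's circshift); np.roll plays the role of circshift, and the window radius here is an illustrative placeholder, not the value used in the project.

```python
import numpy as np

def ssd_align(base, channel, window=15):
    """Find the (dy, dx) shift of `channel` that minimizes the sum of
    squared differences (SSD) against `base`.

    `window` is a hypothetical search radius; the write-up uses a window
    whose size depends on the image.
    """
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Circular shift, analogous to MATLAB's circshift.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((base - shifted) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned shift is then applied to the red and green channels before stacking them with the blue base.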
The above was the naive solution for small images. It took too long to run on larger images when the search window size was proportional to the image size, as the run time of circshift is proportional to the total size of the matrix. Reducing the search space by even a single pixel on a high-resolution picture therefore gave a large boost in performance, but it also limited how far the channels could be shifted. Thus, an image pyramid was used: pictures were scaled by half recursively until an image smaller than 64-by-64 was obtained. At every stage, a Gaussian filter was applied to the image to reduce aliasing. A coarse-to-fine alignment algorithm was then used: starting from the smallest image, find the displacement with the lowest score, then move up the image pyramid. The displacement is scaled by two, and the red and green exposures are shifted by this initial displacement. The search is then run again from this starting point, and the new displacement is propagated and scaled up the pyramid until the original image has been aligned. This algorithm removes the need for a large search space at the higher levels of the pyramid, which speeds up the process tremendously.
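The coarse-to-fine scheme can be sketched as a recursive function. This is a hedged Python/NumPy sketch, not the project's MATLAB code: the blur sigma, base-case window, and refinement window are illustrative choices, and mode='wrap' is used only so the blur stays consistent with np.roll's circular shifting.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ssd_align(base, channel, window):
    # Exhaustive SSD search over shifts in [-window, window]^2.
    best, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = np.sum((base - np.roll(channel, (dy, dx), axis=(0, 1))) ** 2)
            if score < best:
                best, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(base, channel, min_size=64):
    # Base case: the image is small enough for an exhaustive search.
    if max(base.shape) <= min_size:
        return ssd_align(base, channel, window=8)
    # Blur before halving to reduce aliasing, as described above.
    b = gaussian_filter(base, 1, mode='wrap')[::2, ::2]
    c = gaussian_filter(channel, 1, mode='wrap')[::2, ::2]
    # Align the coarse level and scale its displacement by two...
    dy, dx = pyramid_align(b, c, min_size)
    dy, dx = 2 * dy, 2 * dx
    # ...pre-shift by that displacement, then refine with a small window.
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    rdy, rdx = ssd_align(base, shifted, window=2)
    return dy + rdy, dx + rdx
```

Because each finer level only refines a coarse estimate, the search window at the expensive high-resolution levels stays tiny.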
At first the algorithm worked fine with the smaller images, but it did not work as well for the larger ones. Median filtering was tried first, since some spots and blemishes were observed in the pictures and a median filter can remove such outliers. However, there was no visible improvement in quality after applying it. Another approach was then taken: since SSD scores "closeness", was there a feature in the pictures that was artificially raising the closeness value, i.e. a feature common to all three channels but not part of the subject? It turned out that the borders on the edges of the plates are common to all the channels but are not part of the subject; they encouraged the channels to align according to the borders, which have no bearing on the alignment of the subjects themselves.
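The median-filtering attempt mentioned above can be illustrated as follows, with SciPy's median_filter standing in for whatever function the original used; the 3-by-3 neighborhood size is an assumption.

```python
import numpy as np
from scipy.ndimage import median_filter

# A 3x3 median filter removes isolated bright specks (salt noise)
# while preserving edges -- the kind of blemish removal tried above.
img = np.zeros((9, 9))
img[4, 4] = 1.0                    # a single bright speck
clean = median_filter(img, size=3)  # the speck's neighborhood median is 0
```

As the write-up notes, this cleaned up blemishes but did not improve the alignment, which pointed to the borders as the real culprit.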
Two approaches were considered for removing the borders. One was to compute some threshold values and crop out the border areas that did not meet the threshold. The other was to crop some arbitrary fraction of the picture. The first approach turned out to be ineffective because in some pictures the borders had intensities similar to parts of the photos, which made it impossible to set a threshold that removed the borders without also removing areas of the picture: either the borders were ignored, or parts of the picture were removed along with them. It was then noticed that the size of the vertical borders was roughly constant across all the images, so the second approach was used. After testing values from 5% to 25%, removing 10% of the picture from every edge returned the best results.
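The fixed-fraction crop is straightforward; a minimal sketch, assuming the channels are 2D arrays and the crop is applied before scoring:

```python
import numpy as np

def crop_borders(img, frac=0.10):
    """Crop `frac` of the image from every edge so the plate borders
    don't dominate the SSD score (10% worked best per the write-up)."""
    h, w = img.shape[:2]
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]
```

Scoring on crop_borders(base) against crop_borders(shifted) keeps the alignment driven by the subject rather than the borders.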
The picture on the left is the initial result before the removal of the borders; the picture on the right is the result after. The displacements of the red and green channels are shown below.
According to the algorithm, we apply a low-pass filter to one image and a high-pass filter to the other, then combine the two linearly. The low-pass filter was a Gaussian filter of size 17 with a sigma of 20, and the high-pass image was the original minus its low-pass version. The first two images on each row are the source images, and the next two are the possible permutations of high- and low-pass images, tagged with their respective 2D Fourier transforms.
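The combination step can be sketched as below, assuming grayscale float images. Note that SciPy's gaussian_filter is parameterized by sigma alone, so the kernel size of 17 mentioned above has no direct counterpart here, and a plain sum stands in for the linear combination.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_low, im_high, sigma=20):
    # Low-pass one image with a Gaussian...
    low = gaussian_filter(im_low, sigma)
    # ...high-pass the other by subtracting its own low-pass version...
    high = im_high - gaussian_filter(im_high, sigma)
    # ...and combine the two linearly (here, a simple sum).
    return low + high
```

Viewed up close, the high-pass content dominates; from a distance, only the low-pass image remains visible.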