Project 1: Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

15-463 (15-862): Computational Photography

Jun-Yan Zhu

In this project, we take the digitized glass plate images from the Prokudin-Gorskii collection, produce an RGB color image via standard image matching techniques, and automatically crop image borders to further improve image quality. The image matching method combines template matching based on the Sum of Squared Differences (SSD) metric with a Gaussian-pyramid speed-up. For image cropping, we first detect edges with a Sobel filter, then locate long horizontal and vertical edges, and finally cut away the discolored border regions.

To align the R, G, and B images well, the first important step is to identify what property the different channels actually share. There are several assumptions we could build on; we list three here:
1. R, G, B channels share the same/similar (coarse) contour maps (shape).
2. R, G, B channels share the same/similar texture maps (gray images).
3. R, G, B channels share the same/similar keypoints (or highlighted patches/parts).

In this project, we use the second assumption and perform image matching based on the L2 distance, also called the Sum of Squared Differences (SSD). We should be aware, however, that the second assumption does not always hold. For example, for a T-shirt with blue and red stripes, the gray images of the B and R channels are totally different from the gray image of the G channel. In such cases, we argue contour-map matching or keypoint matching might be a better choice. An edge map captures the rough shape of objects and scenes, which will probably be preserved across all channels. For keypoint matching, the highlighted parts (e.g., corners, edges) of the three channels might be similar. But extracting feature keypoints and detecting contours and edges are relatively time-consuming, especially for large images. Besides, in most cases, gray-image matching works well enough.
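To make the SSD matching concrete, here is a minimal Python/NumPy sketch of the exhaustive-search alignment (our implementation is in MATLAB; the function and variable names here are our own illustration, not from the report):

```python
import numpy as np

def ssd_align(ref, img, search=5):
    """Exhaustively try integer shifts in [-search, search] on both axes
    and return the (dy, dx) that minimizes the Sum of Squared Differences."""
    best_ssd, best_offset = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Circularly shift the candidate image and compare to the reference
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            ssd = np.sum((ref.astype(np.float64) - shifted) ** 2)
            if ssd < best_ssd:
                best_ssd, best_offset = ssd, (dy, dx)
    return best_offset
```

In practice the circular wrap-around from `np.roll` is harmless when, as described below, only central image regions are compared.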

To handle large images, a Gaussian pyramid is a classic technique for reducing computational cost. The general idea is to first perform image matching at the coarsest scale. The offset obtained at a coarser scale then serves as an initial guess at the next finer scale, where only a small range needs to be searched for fine adjustment. The MATLAB code for the recursive calls is presented as follows:

function [offset] = pyramidAlign(img1, img2)
    if size(img1, 1) > 128
        % Downsample both images and align at the coarser scale first
        I1 = impyramid(img1, 'reduce');
        I2 = impyramid(img2, 'reduce');
        coarse_offset = pyramidAlign(I1, I2);
        % Refine the doubled coarse offset with a small local search
        offset = simpleAlign(img1, img2, coarse_offset * 2);
    else
        % Base case: exhaustive search at the coarsest scale
        offset = simpleAlign(img1, img2, [0, 0]);
    end
end

Since the height of the coarsest pyramid image is less than 128 pixels, the search range of the image matching algorithm at the coarsest scale can be set to [-5, 5] pixels to reduce running time. At each finer scale, we allow an adjustment range of [-1, 1] pixel, so the overall computational cost is greatly reduced. We use only central image regions (green rectangles in the image below), discarding the border regions. You can see bad artifacts (black regions and white regions) in the image borders (red rectangles). There are two reasons to keep only the central regions as input to the matching process:
1. If a corresponding region of two images matches well, the whole images should match well too. By matching only part of the images, we further reduce computational cost.
2. Image border regions always suffer from severe artifacts (including edges between the color image and black regions, and edges between black and white regions). We do not want to introduce these noisy regions into the matching process.
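The coarse-to-fine procedure can be sketched in Python/NumPy as follows. This is a hypothetical re-implementation of the MATLAB pyramidAlign/simpleAlign pair: the 2x2 box-filter downsampling stands in for impyramid, while the [-5, 5] base search and [-1, 1] refinement ranges follow the text.

```python
import numpy as np

def downsample(img):
    # 2x2 box-filter downsampling; a simple stand-in for MATLAB's impyramid
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    i = img[:h, :w]
    return (i[0::2, 0::2] + i[1::2, 0::2] + i[0::2, 1::2] + i[1::2, 1::2]) / 4.0

def local_ssd_align(ref, img, guess, radius):
    # Search integer shifts within `radius` of `guess`, minimizing SSD
    gy, gx = guess
    best_ssd, best_offset = np.inf, guess
    for dy in range(gy - radius, gy + radius + 1):
        for dx in range(gx - radius, gx + radius + 1):
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            ssd = np.sum((ref - shifted) ** 2)
            if ssd < best_ssd:
                best_ssd, best_offset = ssd, (dy, dx)
    return best_offset

def pyramid_align(ref, img, min_size=128):
    # Coarse-to-fine: recurse on downsampled images, then refine the
    # doubled coarse offset with a [-1, 1] pixel local search
    if ref.shape[0] > min_size:
        gy, gx = pyramid_align(downsample(ref), downsample(img), min_size)
        return local_ssd_align(ref, img, (gy * 2, gx * 2), 1)
    # Coarsest scale: [-5, 5] pixel exhaustive search
    return local_ssd_align(ref, img, (0, 0), 5)
```

Each pyramid level only refines within one pixel of the doubled coarse estimate, so the total number of SSD evaluations stays small even for large plates.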

Bells and Whistles:

To automatically crop images, we observe that strong, long edges between good regions and artifacts (for example, the yellow lines in the right image below) are a good hint. So we design a simple algorithm to detect long edges and remove artifacts based on them. Since long edges can also appear in the natural image itself, we focus on the image border regions (red rectangles in the image above) to avoid false detections. Our long-edge detection algorithm is quite simple. First, we use a Sobel filter to get an edge map (represented as a logical matrix whose elements belong to {0, 1}). General long line/edge detection is not a trivial task. Fortunately, in our case, most edges between good image regions and bad artifact regions are horizontal or vertical, so the task reduces to horizontal and vertical long-edge detection. We only need to count the white pixels (ones) in each row or column: if the count is larger than 1/3 of the image width/height, we consider that row/column a long edge. We find all the long edges inside the image borders, compute a bounding box from them, and crop the image.
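A minimal Python/NumPy sketch of this long-edge detector follows. The Sobel kernels are the standard 3x3 ones; the gradient-magnitude threshold is an assumption for illustration, not a value from the report.

```python
import numpy as np

def sobel_edge_map(gray, thresh=0.5):
    """Binary edge map from Sobel gradient magnitude.
    `thresh` is an illustrative value, not tuned for real plates."""
    p = np.pad(gray, 1, mode='edge')
    # Standard Sobel kernels, written out as shifted-slice sums
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy) > thresh

def long_edges(edge_map):
    """Flag rows/columns whose white-pixel count exceeds one third of the
    image width/height as long horizontal/vertical edges."""
    h, w = edge_map.shape
    rows = np.flatnonzero(edge_map.sum(axis=1) > w / 3)
    cols = np.flatnonzero(edge_map.sum(axis=0) > h / 3)
    return rows, cols
```

The detected row/column indices within the border regions then directly give the bounding box used for cropping.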


We show image results for small images, large images, and images we chose from the Prokudin-Gorskii collection. Each result contains the aligned color image, the cropped color image, the estimated offsets, and the running time.

Small Image Results

Green [-3 -1] Red [-6 0] Time: 0.095136

Green [-5 -1] Red [-12 -1] Time: 0.036419

Green [-4 -1] Red [-9 1] Time: 0.112313

Green [-1 1] Red [-13 1] Time: 0.059040

Green [-6 -1] Red [-12 0] Time: 0.071549

Green [-2 -2] Red [-4 -3] Time: 0.107645

Green [-1 -1] Red [-4 -2] Time: 0.045582

Green [-2 -3] Red [-5 -5] Time: 0.071948

Green [-6 -2] Red [-14 -4] Time: 0.093572

Large Image Results

Green [-35 -25] Red [-51 -38] Time: 3.425422

Green [16 -10] Red [-11 -17] Time: 3.455696

Green [-24 -20] Red [-71 -33] Time: 3.359867

Green [-39 -16] Red [-91 -34] Time: 3.434431

Green [-15 -7] Red [-47 -15] Time: 3.535858

Green [-16 -2] Red [-43 -4] Time: 3.574786

Green [-48 -38] Red [-108 -55] Time: 3.468998

Green [-42 -6] Red [-87 -32] Time: 3.472751

Green [-57 -25] Red [-125 -33] Time: 3.482575

Prokudin-Gorskii Collection Results

Green [-5 -13] Red [-59 -25] Time: 3.323220

Green [-51 -7] Red [-125 6] Time: 3.243400

Green [-5 0] Red [-11 2] Time: 0.044838

Green [2 -1] Red [1 -3] Time: 0.043925

Green [-3 -2] Red [-8 -3] Time: 0.044204