Overview

In this project, we create color images by compositing sets of three monochrome images. Each set shows the same object from the same viewpoint, with each image taken through a different colored glass filter: R, G, and B respectively. The result is comparable to the output of a modern digital camera, whose image is likewise separated into three color channels. Given these monochrome images, we calculate the (x,y) offsets needed to align them, stack them into a 3-channel image, and save the result as an image file.

Approach

For low-resolution images (<1 MB), I searched over a 15-by-15 window of offsets to find the correct (x,y) displacement. The distance between two images is calculated as the L2 norm over all pixels, and the offset with the minimum distance is picked as the best match. To accomplish this, 1) I set up nested ‘for’ loops, and in each iteration 2) one of the images is shifted using Matlab’s circshift function, and 3) the current L2 norm is calculated, recorded, and compared with the minimum distance found so far.
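
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual align.m: the function name align_exhaustive and the radius parameter are hypothetical, and I assume the 15-by-15 window means offsets from -7 to 7 in each direction. It minimizes the squared L2 distance, which has the same minimizer as the L2 norm itself.

```matlab
function [I_aligned, x_off, y_off] = align_exhaustive(I1, I_ref, radius)
% Exhaustive search for the circular shift of I1 that best matches I_ref.
% radius = 7 gives a 15-by-15 window of candidate offsets (assumed reading).
  A = double(I1);
  B = double(I_ref);
  best = inf;  x_off = 0;  y_off = 0;
  for dy = -radius:radius
    for dx = -radius:radius
      S = circshift(A, [dy, dx]);       % rows move by dy, columns by dx
      d = sum((S(:) - B(:)).^2);        % squared L2 distance over all pixels
      if d < best                       % keep the smallest distance so far
        best = d;  x_off = dx;  y_off = dy;
      end
    end
  end
  I_aligned = circshift(I1, [y_off, x_off]);
end
```

For example, calling [G_aligned, x, y] = align_exhaustive(G, B, 7) would align a green plate G to a blue plate B, where G and B are placeholder names for two same-sized mono images.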

For high-resolution images (>70 MB), the distance calculation becomes too costly, so smaller (4-by-4) offset searches are run at several levels of downscaled images. The nested loops do the same job as before. The following pseudocode describes conceptually how the recursive function call works.

function [ I_result, x_off, y_off ] = align_pyramid( I1, I_ref, step, x_off, y_off)
  %% check whether the current call is the base case of the recursion
  if (step ~= 1)
    % not the base case: proceed to another recursive call
    % scale both images down to half size and pass them as arguments to the next call
    % also pass the initial displacement (0,0) as an argument

    [~, x_off, y_off] = align_pyramid(I1_scaled, I_ref_scaled, step-1, x_off, y_off);
  end

  %% shift the image by the displacement estimate returned from the coarser scale level
  %% calculate and compare L2 distances (same as align.m)

  % cut off the edge area (20% of the image size) so the L2 norm is calculated correctly
  % update the displacement estimate for the next scale level: x and y are multiplied by 2
end
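
For readers who want something they can run, here is one possible sketch of the same coarse-to-fine idea. It is not the actual align_pyramid.m and differs from it in several assumed details: the name align_pyramid_sketch and the helper halve are hypothetical, the offsets are returned rather than passed in, the doubling happens right after the recursive call, the search window is -2 to 2 (close to, but not exactly, the 4-by-4 search mentioned above), downscaling uses plain 2-by-2 block averaging instead of a toolbox resize, and the edge cut-off is left out for brevity.

```matlab
function [I_result, x_off, y_off] = align_pyramid_sketch(I1, I_ref, step)
% Coarse-to-fine alignment of I1 to I_ref; step is the number of pyramid levels.
  x_off = 0;  y_off = 0;
  if step > 1
    % Estimate the offset on half-size images first.
    [~, x_c, y_c] = align_pyramid_sketch(halve(I1), halve(I_ref), step - 1);
    % One pixel at the coarser level covers two pixels here.
    x_off = 2 * x_c;  y_off = 2 * y_c;
  end
  A = circshift(double(I1), [y_off, x_off]);   % apply the coarse estimate
  B = double(I_ref);
  best = inf;  bx = 0;  by = 0;
  for dy = -2:2                                % small refinement window
    for dx = -2:2
      S = circshift(A, [dy, dx]);
      d = sum((S(:) - B(:)).^2);               % squared L2 distance
      if d < best
        best = d;  bx = dx;  by = dy;
      end
    end
  end
  x_off = x_off + bx;  y_off = y_off + by;
  I_result = circshift(I1, [y_off, x_off]);
end

function H = halve(I)
% Downscale by 2 using 2-by-2 block averaging (drops an odd last row/column).
  I = double(I);
  r = 2 * floor(size(I, 1) / 2);
  c = 2 * floor(size(I, 2) / 2);
  I = I(1:r, 1:c);
  H = (I(1:2:end, 1:2:end) + I(2:2:end, 1:2:end) + ...
       I(1:2:end, 2:2:end) + I(2:2:end, 2:2:end)) / 4;
end
```

With step levels and a window of -2 to 2 at every level, the largest offset this sketch can reach is 2 * (2^step - 1) pixels per axis, so the number of levels has to be chosen to cover the expected displacement.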

Result

Here are result images for low-res files.

Offsets between two channels (a pair selected from R, G, B) calculated for the low-res images are:

Image    (G-B)     (R-B)
00106v:  (4, 1)    (9, -1)
00757v:  (2, 3)    (5, 5)
00888v:  (6, 1)    (12, 0)
00889v:  (1, 2)    (4, 3)
00907v:  (2, 0)    (6, 0)
00911v:  (1, -1)   (13, -1)
01031v:  (1, 1)    (4, 2)
01657v:  (5, 1)    (11, 1)
01880v:  (6, 2)    (14, 4)


Here are resized result images for the high-res files. Click an image to download the full-size .tif file.

Offsets calculated for the high-res images are:

Image    (G-B)      (R-B)
00029u:  (38, 18)   (90, 36)
00087u:  (48, 39)   (108, 56)
00128u:  (35, 25)   (52, 38)
00458u:  (42, 6)    (86, 32)
00737u:  (15, 6)    (49, 14)
00822u:  (56, 25)   (124, 33)
00892u:  (16, 3)    (42, 4)
01043u:  (-15, 10)  (11, 18)
01047u:  (24, 19)   (71, 33)


Here are result images of my own choosing from the Prokudin-Gorskii collection. Click an image to download the full-size .tif file.

Failure case: 00911v.jpg

As the following figure (left) shows, 00911v.jpg was initially misaligned. Without any preprocessing of the images before the L2 distance calculation, its channels remain misaligned even after the minimum-distance search. This is due to incorrect distance estimates in the boundary regions, where the channels usually do not agree with each other.
To resolve this issue, I added boundary removal before calculating the L2 norm, which improves the alignment of low-res images, as shown in the right image. The same trick was applied to high-res images (align_pyramid.m). The result would be more precise if the removal were done AFTER circshift, but the difference is negligible for our examples.

Figure # Figure #
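
The boundary removal described above could look roughly like this. It is only a sketch: the function name interior_ssd and the margin parameter are hypothetical, and I assume the margin is a fraction of each dimension dropped on every side, which may not match how the 20% cut-off is applied in the actual code.

```matlab
function d = interior_ssd(A, B, margin)
% Squared L2 distance between A and B over the interior region only.
% margin: fraction of each dimension discarded on every side (assumed meaning).
  [h, w] = size(A);
  r = floor(h * margin);                  % rows to drop at top and bottom
  c = floor(w * margin);                  % columns to drop at left and right
  Ai = double(A(r+1:h-r, c+1:w-c));
  Bi = double(B(r+1:h-r, c+1:w-c));
  d = sum((Ai(:) - Bi(:)).^2);
end
```

Calling it on the already shifted image, as in interior_ssd(circshift(I1, [dy, dx]), I_ref, 0.1), corresponds to cropping AFTER circshift, so the pixels that wrap around the edge are also excluded from the distance.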

Bells & Whistles

Additionally, I implemented code (please see hw1_boundary.m) that detects the boundary region and paints it white using the pixel values of the 3 channels. To do this, I calculated the difference (simple subtraction) between each pair of channels (r&g, g&b, b&r) and checked how well they agree with each other by applying a threshold. In Sergei Prokudin-Gorskii's collection, inconsistency between channels usually occurs in the boundary region, where brightness differs because of irregular exposure. Consequently, the more two channels agree with each other, the smaller the sum of their pixel differences. To reduce noise, several filtering steps were applied (opening, dilation, etc.).
The following figures show the result of automatic detection and marking of boundary regions (left: before, right: after). The rest of the result files are stored in the ./jpg/ folder.

Figure # Figure #

Figure # Figure #
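
As a rough illustration of the boundary detection idea, the sketch below thresholds the per-pixel disagreement between channels and cleans the mask with an opening followed by a dilation. It is not hw1_boundary.m: the function name mark_boundary, the thresh parameter, the use of absolute differences, and the 3-by-3 neighborhood are all assumptions, and imopen and imdilate require the Image Processing Toolbox.

```matlab
function [mask, painted] = mark_boundary(rgb, thresh)
% rgb: h-by-w-by-3 image with values in [0, 1].
% thresh: hypothetical tolerance on the summed channel disagreement.
  rgb = double(rgb);
  r = rgb(:, :, 1);  g = rgb(:, :, 2);  b = rgb(:, :, 3);
  % Pairwise differences (r&g, g&b, b&r); small where the channels agree.
  disagreement = abs(r - g) + abs(g - b) + abs(b - r);
  mask = disagreement > thresh;
  % Noise reduction: opening removes small specks, dilation grows what is left.
  nhood = ones(3);
  mask = imdilate(imopen(mask, nhood), nhood);
  % Paint the detected region white in all three channels.
  painted = rgb;
  painted(repmat(mask, [1 1 3])) = 1;
end
```

A suitable threshold depends on the image and would need tuning per plate; the sketch does not attempt to restrict the mask to the image border, so strongly colored interior regions could be marked as well.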