CS 180: Intro to Computer Vision and Computational Photography

Images of the Russian Empire: Colorizing the Prokudin-Gorskii Photo Collection

Natalie Wei (3037990373)



Overview

In the 1900s, Sergei Mikhailovich Prokudin-Horskii traveled around the Russian Empire performing groundbreaking work in color photography. Prokudin-Horskii recorded various scenes three times onto red, green, and blue filtered glass plates. Given digitized versions of these glass plate negatives, my goal for this project was to extract the r, g, and b channels from their respective plates and align them together, forming a single RGB image to recreate a fully-colored photograph. First, I implemented an exhaustive search algorithm that aligns small images; I also implemented pyramid search to align large images efficiently. Finally, I polished my results with some bells and whistles.

Section I: Exhaustive Search

Overview

To align small .jpg files, I split each image into its r, g, and b channels, treating the b channel as the template. Then, I performed naive grid search by exhaustively searching through a window of displacements [-15, 15] pixels large. I scored each (x, y) displacement with an image matching metric, keeping track of the one with the best metric and returning it as the most optimal displacement. Finally, I used circular translate to displace the r and g channels, stacking all of the channels on top of each other to produce an aligned and fully-colored photograph. After comparing the results of minimum Euclidean distance with those of the maximum normalized cross-correlation, I settled on NCC as my image metric, as it seemed to produce better alignments.

Some border artifacts were affecting the calculations, so I hardcoded cropping by slicing a 15% margin from the sides of each channel before alignment. I realized that I should align the uncropped images for my final result, as the images looked zoomed in otherwise.

Results

cathedral.jpg

g: (2, 5), r: (3, 12)

monastery.jpg

g: (2, -3), r: (2, 3)

tobolsk.jpg

g: (3, 3), r: (3, 6)

Section II: Pyramid Search

Overview

Performing exhaustive search on large images becomes prohibitively expensive, as the window size must scale with the image size. To efficiently align even large .tif files, I implemented a recursive pyramid search method that aligns .tif files in about 10 seconds.

Conceptually, I start with the original image, then scale down until the image is small enough for naive exhaustive search. As I travel back down the pyramid, I scale each image and displacement from the previous level until I arrive back at the original image.

For images smaller than 256 x 256 pixels, I performed naive exhaustive search with a window size of [-15, 15]—this served as my base case. Otherwise, I scaled both images down by 2 and recursively calculated their best displacement. I scaled this displacement up by 2 so it would accurately align an image with twice the height and width, then rolled the original image by the scaled displacement. Finally, I performed exhaustive search to find the displacement between the rolled image and the template. I incremented these displacements for my final displacement.

To optimize further, I also updated the window size for exhaustive search at each level. The smallest image uses [-15, 15] window, which is decreased by 2 at each subsequent level until reaching a minimum range of [-4, 4]. This way, I'm not searching across a wider range than necessary at any level.

The pyramid search method worked on all of the images except for emir.tif, likely due to the difference in brightness between its red and blue channels. I address this in Bells & Whistles.

Results

church.tif

g: (4, 25), r: (-4, 58)

three_generations.tif

g: (13, 52), r: (11, 111)

melons.tif

g: (10, 81), r: (13, 178)

onion_church.tif

g: (27, 51), r: (36, 108)

train.tif

g: (6, 42), r: (32, 87)

icon.tif

g: (17, 41), r: (23, 89)

self_portrait.tif

g: (29, 78), r: (37, 176)

harvesters.tif

g: (17, 59), r: (13, 123)

sculpture.tif

g: (-11, 33), r: (-27, 140)

lady.tif

g: (8, 52), r: (11, 113)

emir.tif

g: (2, -3), r: (2, 3)

Section III: Bells & Whistles

Edge Detection

I explored two different ways of correcting emir.tif: color mapping and edge detection.

Simply aligning the b and r channels to the g channel instead of the original color mapping produced an improved result.

I also looked into edge detection with Sobel filters, which can locate edges by finding pixels with a larger relative magnitude. Instead of comparing brightness values between pixels, I aligned images based on their edges, producing improved results as well.

Results

Original

g: (2, -3), r: (2, 3)

Color mapping

b: (-49, -24), r: (57, 17)

Edge detection

g: (23, 49), r: (40, 107)

Auto-Contrast

To keep my results from looking washed out, I applied adaptive histogram equalization on the final images. Histogram equalization improves contrast by redistributing an image's luminance values. Adaptive histogram equalization does this locally for multiple parts across an image, yielding more natural-looking results.

Results

train.tif

Without contrast

lady.tif

Without contrast

three_generations.tif

Without contrast

With contrast

With contrast

With contrast

Auto-Crop

Most of the images have black and white borders that cause artifacts when aligned. To reduce the artifacts, I created an auto-cropping method that removes borders before alignment.

I created functions that crop the left and right sides as well as the top and bottom. First, I look through a “strip” of pixels and calculate how many “black” and “white” pixels there are based on how many pixels fall under certain thresholds. I divide the number of black and white pixels by the size of the strip, then crop the strip if it meets a predetermined ratio. This method proved to be more effective at cropping thick and wonky edges than taking the average pixel brightness or going line by line.

I cropped the left and right borders from the original image before splitting it into its color channels, then cropped the top and bottom borders from each channel. I aligned the cropped images to achieve a final result with less artifacting.

Results

self_portrait.tif

g: (29, 78), r: (37, 176)

onion_church.tif

g: (27, 51), r: (36, 108)

g: (29, 39), r: (36, 16)

g: (26, 51), r: (36, -42)

Section IV: Additional Images

siren.tif

g: (-7, 48), r: (-21, 96)

caravan.tif

g: (23, 68), r: (40, 147)

bridge.tif

g: (-7, 13), r: (4, 68)