Final Research Results:
Generating Images with Arbitrary Focus Regions


11/29/99

Thomas Drake Jr

drakejr@andrew.cmu.edu

Research Paper in PostScript format

Current panoramic images allow the user to move and zoom freely within a given texture environment (QuickTimeVR, Ipix, and iMove). However, the user is limited to the depth of field of the images as they were acquired. This can greatly reduce the information presented in a panorama when objects are very close to the camera; in that case, the depth of field of a real camera is slim [1]. If real cameras could approximate the depth of field of a computer-simulated pin-hole camera, this topic would be moot. It would be useful if panoramas could be constructed such that everything in the scene is in focus. There has also been work in which a single camera with an offset CCD captures the entire range of depth of focus in a scene [4].

I was not able to get the image registration working properly, so I only generated images from a single viewpoint.

Image Acquisition

I acquired images from a single viewpoint, focused at various depths. I sampled images from a dense woodland scene and an art gallery, but my camera's depth of field became too deep once the closest objects came into focus. Because of the capturing software that was available to me, I was only able to capture images at 320x240 resolution.

Algorithm

When first attempting to add two focus regions from separate images, I noticed this: I had been visited by the Focus Ghost. You see, simple averaging of two images taken from the same viewpoint with different depths of field results in a halo surrounding the detailed parts of both images. This result was not acceptable.

But then I realized that there was a novel (or not-so-novel) solution: for each mosaic or, in this case, image, generate a focus mask that represents the detailed regions of its image. I searched for an existing algorithm for my problem, but found none. It's all Depth from Defocus [2] nowadays.

Also, Paul Haeberli presented some results in A Multifocus Method for Controlling Depth of Field [5], where he used the difference of two images to create masks. This technique did not work for me: I found that steep gradients in defocused areas registered as regions in focus.

The problem is a little more slippery than edge detection, which was my first solution. After some experimentation, I was able to generate a decent mask as follows (a code sketch of these steps appears after the list):



The original images: a completely defocused image, an image focused on the closest object, and an image focused on the furthest object.

Perform a single-pass point derivative on the images that contain focus information.

Threshold each image at the top five percent of its histogram. I need to play with that percentage a little.

Blow up (dilate) each resulting pixel in the image (I need to play with this value as well). I eyeballed the pixel scaling value here to more than adequately mask over the circles of confusion, the blurring that results from defocused points in an image [3].
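Here is a minimal sketch of these mask-generation steps in Python. It assumes grayscale floating-point images and NumPy/SciPy; the name focus_mask, the 5% threshold, and the dilation count are illustrative stand-ins for the values I tuned by hand, not the actual code I ran.

    import numpy as np
    from scipy import ndimage

    def focus_mask(img, top_fraction=0.05, dilate_iterations=4):
        """Binary mask of the in-focus regions of one image (sketch)."""
        # Single-pass point derivative: per-pixel gradient magnitude.
        gy, gx = np.gradient(img.astype(float))
        grad = np.hypot(gx, gy)
        # Keep only the top ~5% of the gradient histogram.
        threshold = np.percentile(grad, 100.0 * (1.0 - top_fraction))
        mask = grad > threshold
        # "Blow up" each surviving pixel so the mask more than covers the
        # circles of confusion around defocused points [3].
        return ndimage.binary_dilation(mask, iterations=dilate_iterations)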

Here we contrast the image created from simple pixel averaging with the new image, where points that fall within the masked region are weighted much more heavily (there is no feathering here). You can probably also see that the mask for this image picked up extra information.

Here we demonstrate images generated near the depths of focus at which the source images were acquired. For the pixel averaging, I used exponential dropoff at the boundaries of the mask.
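A sketch of that blend, reusing focus_mask() from above: pixels inside a mask pull heavily from their focused image, and the weight decays exponentially with distance from the mask boundary. The weighting constants here are assumptions; I eyeballed my own values.

    import numpy as np
    from scipy import ndimage

    def masked_blend(defocused, focused_images, masks, heavy=10.0, falloff=0.5):
        """Blend focused detail into the defocused base image (sketch)."""
        accum = defocused.astype(float).copy()   # base image, weight 1
        total = np.ones_like(accum)
        for img, mask in zip(focused_images, masks):
            # Distance from the mask: 0 inside it, growing outside.
            dist = ndimage.distance_transform_edt(~mask)
            # Heavy weight on masked pixels, exponential dropoff beyond.
            w = heavy * np.exp(-falloff * dist)
            accum += w * img.astype(float)
            total += w
        return accum / total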

Here are images generated between the two images with focused data.
[Images 1 and 2 (top row); images 3 and 4 (bottom row)]
One point: There was an artifact generated by one of the masks. It's the black square in the middle of the shoe.

The interface I created. It works in near real-time -- it gets a little choked up when you move the slider too fast. "B" is toward the back of the image, "F" is the front. After all the masks are created, input is taken from the slider and used to create an arbitrary view between the two focused images and the defocused image.
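The slider logic might look something like the following. This mapping (a linear ramp between the front and back focus planes, using the same weighting scheme as the blend above) is my reconstruction for illustration, not the original interface code.

    import numpy as np
    from scipy import ndimage

    def view_from_slider(t, defocused, front, back, front_mask, back_mask):
        """t in [0, 1]: 0 favors the front plane ('F'), 1 the back ('B')."""
        accum = defocused.astype(float).copy()
        total = np.ones_like(accum)
        # Scale each focused image's contribution by its closeness to the slider.
        for img, mask, w in ((front, front_mask, 1.0 - t), (back, back_mask, t)):
            dist = ndimage.distance_transform_edt(~mask)
            k = 10.0 * w * np.exp(-0.5 * dist)
            accum += k * img.astype(float)
            total += k
        return accum / total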


Conclusion

My novel method for determining focused regions in an image approximates the blurred region that results from a defocused point in the scene projected onto the image plane. By replacing those blurred regions with detailed focus information, a deeper depth of field can be created in a single image, thereby better approximating an entire scene (or allowing images to be rendered at arbitrary depths of field). Generating new images does not simply mean copying the detailed information from the source images: the blurred areas of an image must also be replaced when the detail being copied in is what produced the blur.

I was not able to get the registration working. I am not pleased with that. I am also not pleased with the blending of the masked information. You can see in image 3, above, how detailed information tries to blend with completely blurred information. The contrast is visible because the blurred regions contain color information from the wall and desk. This contrast is not visible in images without color information [5].



Bibliography:

[1] R. Szeliski and H. Shum. Creating Full View Panoramic Image Mosaics and Environment Maps. SIGGRAPH, 1997.
[2] M. Watanabe and S. K. Nayar. Rational Filters for Passive Depth from Defocus. IJCV, 1997.
[3] M. Potmesil and I. Chakravarty. A Lens and Aperture Camera Model for Synthetic Image Generation. SIGGRAPH (Computer Graphics, Vol. 15, No. 3), August 1981.
[4] A. Krishnan and N. Ahuja. Range Estimation from Focus Using a Non-frontal Imaging Camera. IJCV, 20(3), pp. 165-185.
[5] P. Haeberli. A Multifocus Method for Controlling Depth of Field. http://www.sgi.com/grafica/depth/index.html, October 1996.