Project Progress Report

Kiran Bhat

3D Mosaicing from Stereo Video

Project Goal:

I will develop an algorithm that registers range images obtained from a
stereo pair using image-based techniques. Stereo data of a scene is captured
from different viewpoints by moving a calibrated camera pair around it. From
the stereo and image data, the system solves for the rigid-body motion
between the various camera-pair locations. This information can be used
to register all the range images with respect to a common reference frame.
The result is a planar mosaic of the scene with a disparity value at each
pixel, which can be used to render views from new viewpoints.
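
As a rough illustration of the registration step, the sketch below chains
pairwise rigid-body motions so that points from any camera-pair location can
be expressed in the first (reference) frame. The representation is an
assumption made for this example only: each motion is a pair (R, t) with R a
3x3 rotation given as nested lists and t a 3-vector, and pairwise[k] is taken
to map coordinates of frame k+1 into frame k. The function names are
hypothetical and are not part of the actual system.

```python
def apply_rigid(R, t, p):
    """Map a 3D point p through the rigid motion x -> R x + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def compose(Ra, ta, Rb, tb):
    """Return the motion equivalent to applying (Rb, tb) first, then (Ra, ta)."""
    R = [[sum(Ra[i][k] * Rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = apply_rigid(Ra, ta, tb)  # Ra * tb + ta
    return R, t


def to_reference(pairwise):
    """Given pairwise[k] mapping frame k+1 into frame k (an assumed
    convention), return a list whose entry k maps frame k into frame 0."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    chain = [(identity, [0.0, 0.0, 0.0])]
    for R, t in pairwise:
        R_prev, t_prev = chain[-1]
        chain.append(compose(R_prev, t_prev, R, t))
    return chain
```

With the chained motions in hand, every range image can be mapped into frame 0
by calling apply_rigid on its 3D points with the corresponding entry of the
list returned by to_reference.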

Previous Work:

Szeliski [CVPR 99] uses a similar multi-view formulation to obtain dense
stereo depth maps. In his image mosaicing report [DEC 94], Szeliski presents
a formulation to recover projective depth and register range images from an
image sequence. In that work, he represents the depth map using a
tensor-product spline and recovers depth estimates only at the spline
control vertices. Related work on the rendering side includes Layered Depth
Images [SIGGRAPH 98] by Shade et al., View Interpolation [SIGGRAPH 93] by
Chen & Williams, and Plenoptic Modeling [SIGGRAPH 95] by McMillan and Bishop.
The stereo algorithm that I am currently using was developed by Zitnick and
Kanade [CMU-RI-TR-99-35].

Details of the algorithm:

I model the transformation between two images taken from different
viewpoints using 11 parameters (8 parameters for the 2D homography and 3
parameters for the translation along the epipole direction). Note that
the projective depth at every pixel is already known from the stereo data,
so these 11 parameters are the only unknowns. I use the Levenberg-Marquardt
algorithm to compute them by minimising the sum of squared differences in
the intensity values of corresponding pixels.
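
To make the 11-parameter model concrete, here is a minimal sketch of one
common way to write such a warp (a homography plus a depth-scaled parallax
term) together with the sum-of-squared-differences cost. The parameter layout
is an assumption for illustration: the first 8 values fill the homography row
by row with its last entry fixed to 1, and the last 3 values are the
translation scaled by the per-pixel projective depth d. Images and the depth
map are assumed to be nested lists of floats of at least 2x2 pixels, and all
function names are hypothetical. The optimiser itself is not shown.

```python
def warp_pixel(params, x, y, d):
    """Map pixel (x, y) with projective depth d into the second image.

    params[0:8]  -> homography entries h00..h21 (h22 is fixed to 1, assumed)
    params[8:11] -> translation along the epipole direction
    """
    h = list(params[:8]) + [1.0]
    t = params[8:11]
    X = h[0] * x + h[1] * y + h[2] + d * t[0]
    Y = h[3] * x + h[4] * y + h[5] + d * t[1]
    W = h[6] * x + h[7] * y + h[8] + d * t[2]
    return X / W, Y / W


def bilinear(img, x, y):
    """Bilinearly sample img at (x, y); return None outside the image."""
    rows, cols = len(img), len(img[0])
    if not (0 <= x <= cols - 1 and 0 <= y <= rows - 1):
        return None
    x0 = min(int(x), cols - 2)
    y0 = min(int(y), rows - 2)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x0 + 1]
    bot = (1 - ax) * img[y0 + 1][x0] + ax * img[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bot


def ssd(params, img0, img1, depth0):
    """Sum of squared intensity differences over pixels that warp inside img1."""
    total = 0.0
    for y, row in enumerate(img0):
        for x, value in enumerate(row):
            xw, yw = warp_pixel(params, x, y, depth0[y][x])
            sample = bilinear(img1, xw, yw)
            if sample is not None:
                total += (sample - value) ** 2
    return total
```

A Levenberg-Marquardt routine would then adjust params to reduce this cost,
typically working on the individual per-pixel differences rather than their
sum. Bilinear sampling is used here so that the cost varies smoothly with the
parameters.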

Novelty of the algorithm:

A new way to register range images using image-based techniques.

Possibly a different approach to solving structure from motion?

Testing the algorithm:

I have captured a few stereo sequences of the NSH common area, and I will
try to register the images from these sequences into a composite mosaic with
range information at each pixel.