Programming Assignment 1: IMAGE MOSAICING
15-869, Image-Based Modeling and Rendering
Due: Sunday, September 19, at midnight
Revision 2, Sept. 13, 1999
In this assignment you will take two or more photographs and
create an image mosaic by registering,
projectively warping, resampling, and compositing them.
As background for this assignment, read
Projective Mappings for Image Warping,
Paul Heckbert, and
Image Mosaicing for Tele-Reality Applications,
Richard Szeliski, Digital CRL 94/2, 1994
(revised version appeared in
IEEE Computer Graphics & Appls, Mar. 1996).
Typically, I'd prefer that you work individually, but
if you would like to work in a team of two, speak to me.
The steps of the assignment are:
- shoot and digitize pictures
- register them
- warp and mosaic them
- submit your results
Shoot and Digitize Pictures
Shoot two or more photographs so that the transforms between them are
projective (a.k.a. perspective).
One way to do this is to shoot from the same point of view but with
different view directions, and with overlapping fields of view.
Another way to do this is to shoot pictures of a planar surface
(e.g. a wall) from different points of view.
The options for digitization may dictate your choice of camera:
We're not particular about how you take your pictures or get them
into the computer, but we recommend:
Digital camera.
Use your own, or borrow one from us or a friend.
It may be harder for you to get one of these cameras,
and typically their spatial resolution isn't as high as you'd
get with photographic film,
but once you've got the camera,
you get instant feedback on the pictures you've taken,
and getting them on-line
is easy, so this is the recommended method.
We recommend you use a resolution of at least 640x480 pixels,
and capture and save the pictures in the highest quality mode available.
For the cameras we can loan, a machine running Windows 9x/NT is required to
transfer the pictures after shooting.
Photo CD from slide or print film.
You can shoot on standard photographic film and most photo labs can
have your images digitized onto a
Photo CD or Picture CD.
These are two different CD ROM formats for color images.
Either is quite easy to manipulate with
image processing programs such as Photoshop.
Call a photo lab to get prices.
This is a good route if you want top quality and you're going to be
shooting more than 10 images, say, but it's more expensive than
the options below.
Scan color prints.
You could shoot standard color print film, get it developed,
and then scan the images using the scanners in
For CS grad students, your NC 58 key will get you in.
When I last used the scanner there, the procedure was:
Put photo in Silverscan flatbed scanner;
on Mac, run Photoshop;
on menu: Acquire, Silverscan;
scan at 100-200 pixels per inch (recommended for 4"x6" prints;
any more and the files will be huge);
crop if desired;
write picture to file (TIFF format is good).
It is important that you transfer all of your picture files to
another machine (e.g. by Appleshare)
and clean up after yourself, since these are public
Scan color slides.
This is probably less desirable.
There is a slide scanner in the Robotics Graphics Deli in Smith Hall.
Ask Debra Tobin if you can use it.
Avoid fisheye lenses or lenses with significant barrel distortion
(do straight lines come out straight?).
Any focal length is ok in principle, but wide angle lenses often make
more interesting mosaics.
Shoot as close together in time as possible, so your subjects don't
move on you, and lighting doesn't change too much.
Use identical aperture & exposure time, if possible.
On most "idiot cameras" you don't have control of this, unfortunately.
It's nice to use identical exposures so that the images will have identical
brightness in the overlap region.
Overlap the fields of view significantly.
20% to 50% overlap is recommended.
Too little overlap makes registration harder.
It's OK to vary the zoom between pictures.
You can use pictures that you shot specially for this class,
or pictures you shot previously.
If you can't use your own pictures, talk to Paul.
If you're shooting a non-planar scene, then
shoot pictures from the same position (turn camera, but don't translate it).
A tripod can help in this, particularly if objects are close.
Good scenes are: building interiors with lots of detail, inside a
canyon or forest, tall waterfalls, panoramas.
The mosaic can extend horizontally, vertically, or can tile a
You might want to shoot several such image sets and choose the best.
Shoot & digitize your pictures early - leave time to re-shoot in case they
don't come out!
Print and lay out your photos on a table to see approximately what the
mosaic will look like.
To register the images,
for each pair of overlapping images,
find four or more corresponding points,
get their coordinates, and compute the transformation from one image
to the other.
We recommend that you use the starter code provided
in the class's
Currently, this program loads a single image, lets you click
on points, and
prints their pixel coordinates.
It's written in C++, it uses OpenGL for graphics, and FLTK for the
We suggest that you modify it and
create a program that loads multiple pictures (side-by-side or sequentially),
permits you to tap out four or more corresponding points for each pair,
and solves the linear system of equations for the coefficients
of the projective
Tips: If you permit the image to be zoomed up, then users will be able to specify
points with fractional pixel precision, thereby improving the registration.
Placing the digitization points as far apart as possible will improve accuracy.
If you digitize n points, you'll be solving a system
of 2n equations in 8 unknowns.
See the Projective Mappings paper mentioned earlier.
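As a reminder of where the 2n equations come from, here is a sketch of the projective mapping and its linearized form. The coefficient names a through h, and the choice to normalize the ninth coefficient to 1, are assumptions made for illustration; check them against the notation in the paper.

```latex
% Projective mapping from source point (x, y) to destination point (u, v),
% with the ninth coefficient normalized to 1 (an assumption).
\[
u = \frac{a x + b y + c}{g x + h y + 1}, \qquad
v = \frac{d x + e y + f}{g x + h y + 1}
\]
% Multiplying through by the denominator gives two equations per
% correspondence that are linear in the eight unknowns a, ..., h:
\[
\begin{aligned}
a x + b y + c - g\,x u - h\,y u &= u \\
d x + e y + f - g\,x v - h\,y v &= v
\end{aligned}
\]
```

Each digitized correspondence contributes one such pair of rows, which is why n points give 2n equations.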
If n=4 then you can turn it into an 8x8 linear system of equations.
If n>4 then the system is overdetermined, and requires more work to solve.
(In the overdetermined case,
depending on whether you work with the rational
or linearized versions of the equations, given in the above paper,
you get either a nonlinear or linear overdetermined system of
equations, which can be solved by least squares methods.
The nonlinear approach probably would give more accurate registration,
but the linear approach is far easier to implement using the
``normal equations'' described in any linear algebra textbook.)
You can find code to solve a linear system in the VL library
(discussed in the
course software web page),
in the Numerical Recipes book, or numerous other sources.
Any reasonably accurate method should suffice here.
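To make the linear approach concrete, here is a minimal sketch in Python (standard library only) rather than the C++ of the starter code. The function names solve_linear, fit_homography, and apply_homography are hypothetical, not part of the starter code or the VL library, and the ninth coefficient is assumed to be normalized to 1. For n=4 it solves the 8x8 system directly; for n>4 it uses the normal equations.

```python
def solve_linear(A, rhs):
    """Solve the square system A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [float(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; are the points degenerate?")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_homography(src, dst):
    """Fit the 3x3 projective matrix mapping src[i] -> dst[i], n >= 4 points."""
    if len(src) != len(dst) or len(src) < 4:
        raise ValueError("need at least four correspondences")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linearized equations per correspondence.
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    if len(src) == 4:
        p = solve_linear(rows, rhs)  # exactly determined 8x8 system
    else:
        # Overdetermined: normal equations (A^T A) p = A^T b.
        AtA = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
        Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
        p = solve_linear(AtA, Atb)
    a, b, c, d, e, f, g, h = p
    return [[a, b, c], [d, e, f], [g, h, 1.0]]


def apply_homography(H, x, y):
    """Map (x, y) through H, dividing by the homogeneous coordinate."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

The normal equations square the condition number of the system, so with raw pixel coordinates in the hundreds it may help to shift and scale the points to a range near 1 before fitting.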
The math for projective image warps will be covered in lecture.
Warp and Mosaic the Pictures
Warp the images so they're registered and create an image mosaic.
Instead of having one picture overwrite the other, which would lead
to strong mosaic artifacts,
use weighted averaging.
You can leave one image unwarped and warp the other image(s) into its
projection, or you can warp all images into a new projection; it's up to you.
You'll probably need to transform some picture corners and
find a bounding box to determine how big your output image will be.
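One way to size the output image is sketched below in Python. The helper names project, warped_bounds, and union_bounds are hypothetical, and H is assumed to be a 3x3 matrix (nested lists) that maps input pixel coordinates to output mosaic coordinates.

```python
import math


def project(H, x, y):
    """Return the mapped point and its homogeneous coordinate w."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v, w


def warped_bounds(H, width, height):
    """Integer bounding box (xmin, ymin, xmax, ymax) of a warped picture."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    mapped = [project(H, x, y) for x, y in corners]
    # If w changes sign, the picture crosses the horizon line of the
    # mapping and the corners no longer bound the warped picture.
    if not (all(w > 0 for _, _, w in mapped) or all(w < 0 for _, _, w in mapped)):
        raise ValueError("picture crosses the horizon of this projection")
    xs = [u for u, _, _ in mapped]
    ys = [v for _, v, _ in mapped]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


def union_bounds(boxes):
    """Smallest box containing every box in the list."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))
```

The output image then has (xmax - xmin + 1) by (ymax - ymin + 1) pixels, and output pixel (0, 0) corresponds to mosaic coordinate (xmin, ymin).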
From there, the recommended algorithm is to scan out those pixels
in scanline order.
For each output pixel,
loop over all input pictures and
- projectively transform that point into the coordinate system of that
picture
- compute a weight for that picture (zero if outside the picture or at edge,
peak weight at center)
- if the weight is positive, interpolate a color from the input picture
using bilinear interpolation (if you use point sampling instead of
some form of interpolation, your warped pictures will look jaggy)
From these weights and colors,
compute the weighted average at each output pixel
and write it out (if the registration is perfect,
the image exposures are identical,
there is no vignetting, and the scene was static, the colors being averaged
here will be equal, but life is rarely that simple).
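The per-pixel loop described above could be sketched as follows, in Python. Everything here is illustrative: the names tent_weight and composite_pixel, the dictionary layout of each picture, and the particular tent-shaped weight are assumptions, not a required design. Each picture supplies H_inv (a 3x3 matrix mapping output coordinates to that picture's coordinates) and a sample callable that returns an interpolated color.

```python
def tent_weight(x, y, width, height):
    """Zero on or outside the picture border, rising to 1 at the center."""
    if not (0 < x < width - 1 and 0 < y < height - 1):
        return 0.0
    wx = 1.0 - abs(2.0 * x / (width - 1) - 1.0)
    wy = 1.0 - abs(2.0 * y / (height - 1) - 1.0)
    return wx * wy


def project(H, x, y):
    """Apply a 3x3 projective matrix to the point (x, y)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def composite_pixel(ox, oy, pictures):
    """Weighted average color at output pixel (ox, oy), or None if uncovered.

    pictures: list of dicts with keys 'H_inv', 'width', 'height', and
    'sample' (a callable (x, y) -> color tuple). This layout is hypothetical.
    """
    total_weight = 0.0
    accum = None
    for pic in pictures:
        x, y = project(pic["H_inv"], ox, oy)
        weight = tent_weight(x, y, pic["width"], pic["height"])
        if weight <= 0.0:
            continue  # outside this picture, or on its edge
        color = pic["sample"](x, y)
        if accum is None:
            accum = [0.0] * len(color)
        for k, channel in enumerate(color):
            accum[k] += weight * channel
        total_weight += weight
    if accum is None:
        return None
    return tuple(value / total_weight for value in accum)
```

Because the weight falls to zero at each picture's edge, a picture's contribution fades out smoothly where it ends, which is what hides the seams.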
Bugs in the bilinear interpolation code are common
because they're easily overlooked.
(It's recommended that you test this code by doing a special test
where you zoom up a picture by a factor of 10.
The resulting picture should have no step discontinuities in intensity.)
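A minimal sketch of bilinear interpolation and the zoom test, in Python, is given below. It assumes a grayscale picture stored as a list of rows of numbers with at least two rows and two columns; the names bilinear, zoom, and max_step are hypothetical.

```python
def bilinear(img, x, y):
    """Interpolate img at the real-valued position (x, y), clamped to the picture."""
    height, width = len(img), len(img[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    # Top-left pixel of the 2x2 cell; clamp so x0 + 1 and y0 + 1 stay in range.
    x0 = min(int(x), width - 2)
    y0 = min(int(y), height - 2)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
    bottom = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


def zoom(img, factor):
    """Enlarge img by an integer factor using bilinear interpolation."""
    height, width = len(img), len(img[0])
    out_h = (height - 1) * factor + 1
    out_w = (width - 1) * factor + 1
    return [[bilinear(img, ox / factor, oy / factor) for ox in range(out_w)]
            for oy in range(out_h)]


def max_step(img):
    """Largest absolute difference between horizontal or vertical neighbors."""
    steps = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    steps += [abs(img[j + 1][i] - img[j][i])
              for j in range(len(img) - 1) for i in range(len(img[0]))]
    return max(steps)
```

With a correct implementation, max_step of a picture zoomed by 10 should be no more than one tenth of max_step of the original. A common bug, such as swapping fx and fy or rounding instead of truncating, usually shows up as a much larger step at cell boundaries.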
Try to get the registration and compositing working well enough that the
seams become invisible.
If the exposures or processing of your pictures differ markedly, you may
find it helpful to do some color correction (use Photoshop, say).
Vignetting (darkening on edges of image due to the lens optics)
can also be a slight problem.
Although you could get results close to this using OpenGL texture mapping,
we want you to write code to do the warping and compositing yourself
for this assignment.
If your mosaic spans more than 180 degrees, you'll need to break your
mosaic into pieces (as in cubical environment maps), or else use non-projective
mappings, e.g. spherical or cylindrical projection.
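For the cylindrical option, one common formulation is sketched below in Python. It assumes coordinates measured from the image center and a focal length f expressed in pixels, which you would have to estimate for your camera; the function names are hypothetical.

```python
import math


def to_cylinder(x, y, f):
    """Map image-plane point (x, y) to unrolled cylinder coordinates.

    x and y are measured from the image center; f is the focal length in
    pixels (an assumed, camera-specific value).
    """
    theta = math.atan2(x, f)         # angle around the cylinder axis
    height = y / math.hypot(x, f)    # height on a unit-radius cylinder
    return f * theta, f * height


def from_cylinder(xc, yc, f):
    """Inverse mapping, valid for |xc / f| < pi / 2."""
    theta = xc / f
    return f * math.tan(theta), yc / math.cos(theta)
```

When scanning out a cylindrical mosaic, from_cylinder is the direction you need: it takes each output pixel back to a point in an input picture's image plane, where you can then interpolate a color.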
Submit Your Results
Put your code and executable in
and put your best pictures and a web page explaining and displaying them in a
subdirectory asst1/www .
(as of 9/15, the student directories have been created).
If you didn't get a directory, but need one, send email to ph@cs.
The asst1 directory will be private, while the asst1/www directory
will be public
(to permit students to view each others' results), so the latter should not
contain code, and it should not contain links to private files.
In asst1/www/index.html, put a web page (HTML) containing
your original images, your final, best mosaic, and some explanation.
If you don't know HTML, examine the source to
this web page,
for example, to see how to do simple formatting and
High output resolution is desirable, but
for the web page, please zoom down your images to a width and height of no more
Converting your picture files from TIFF to JPEG will permit
Netscape or Explorer
to display them directly.
We recommend Photoshop or the UNIX program convert for this.
On this web page,
mark the exact correspondence points in the original images somehow
(e.g. bright red pixels, or little X's).
Include a few paragraphs explaining what you did:
what the pictured object is,
briefly how you shot the pictures and digitized them,
a sentence or two on what the user does to specify
how long the program took to run,
the best and worst aspects of your results,
and any unusual aspects of your solution
(e.g. "My program is written in matlab" or
"My output projection is cylindrical.")
Implement semi-automatic registration.
A good balance of interaction and automation would be to have the user
indicate approximate correspondence points (within a few pixels) and then
have the system do a local search within a small window
(at sub-pixel precision) for the point that minimizes
the sum of squared differences within a small window.
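A sketch of the local search is given below in Python, for grayscale pictures stored as lists of rows. It only searches whole-pixel positions; reaching sub-pixel precision would need an extra step, such as interpolating the picture or fitting a curve to the scores around the minimum. The names ssd and refine_match, and the default window sizes, are hypothetical.

```python
def ssd(img_a, ax, ay, img_b, bx, by, half):
    """Sum of squared differences between two (2*half+1)-square windows."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = img_a[ay + dy][ax + dx] - img_b[by + dy][bx + dx]
            total += diff * diff
    return total


def refine_match(img_a, ax, ay, img_b, bx, by, half=3, radius=4):
    """Refine the user's guess (bx, by) in img_b for the point (ax, ay) in img_a.

    Tries every integer position within `radius` pixels of the guess and
    returns the one with the smallest SSD. The window around (ax, ay) is
    assumed to lie entirely inside img_a.
    """
    height, width = len(img_b), len(img_b[0])
    best = None
    for cy in range(by - radius, by + radius + 1):
        for cx in range(bx - radius, bx + radius + 1):
            # Skip candidates whose window would fall off img_b.
            if cx - half < 0 or cy - half < 0 or cx + half >= width or cy + half >= height:
                continue
            score = ssd(img_a, ax, ay, img_b, cx, cy, half)
            if best is None or score < best[0]:
                best = (score, cx, cy)
    if best is None:
        raise ValueError("no candidate window fits inside the picture")
    return best[1], best[2]
```

SSD assumes the two pictures have similar brightness near the point, so it works best when the exposures match.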
Or go further and try the method from section 4.1 of
If your image warping code is fast enough, you could redisplay as the user
"drags" and stretches the images.
Generate a panorama and put it in
Quicktime VR file format
so it can be displayed
using one of the optimized
Change log: 9/14: corrected section on overdetermined systems,
added link to my Projective Mappings paper.