15-463 Project 2: Seam Carving

With all of the devices in the market today, we have a considerable number of screen sizes with widely varying resolutions. Simply resizing content to fit a screen may be effective for some scenarios, but for many pictures scaling or cropping does not suffice. Shai Avidan and Ariel Shamir introduced a novel approach for retargeting images, seam carving, in their SIGGRAPH 2007 paper. Seam carving is an attempt to resize an image's height or width by removing contiguous horizontal or vertical strips of pixels with the lowest energy. Choice of how energy is defined over a given image affects the energy of the seams within an image and thus the order of pixels removed.

My approach to writing a seam carving application was to create an interactive application with the multi-scale solution found in the paper. I chose to write my application in Python for the ability to easily create a resizable window that can display an image with resize events. When loading my application with an image, it will compute the lowest cost seam in either the horizontal or vertical direction, remove the seam and recurse over the resulting image until all seams have been removed. Once these operations compelte, the appliation has generated a horizontal or vertical multi-scale map and restores the image. The window is then able to be resized by removing the seams quickly with one pass over the multi-scale map. For convienience, the multi-scale map is saved as a lossless PNG, which may be loaded with the original image instead of recomputing when the application is loaded.

The algorithm for carving a single seam is a very straightforward dynamic programming solution which calculates the minimal energy required to remove a single seam ending at any given pixel. Once the minimal energies are calculated, the algorithm looks at the minimal energy seam ending with the pixels at the bottom of the image and follows the path of the seam backwards through the dynamic programming table. The energy function used by my application simply calculates the gradient over an image, which works surprisingly well in certain scenarios.

My choice of result pictures determined how well my application worked. I decided to look for images with very high-frequency foregrounds or subjects and low-fequency backgrounds. For example, the Maltese dogs, the dog toy, the smoker and the landscape pictures I chose all fit this profile. The following are the original images and their multi-scape maps.

Maltese Puppies





Source

Dog Toy





Source

Smoker





Source

Grass and Sky





Source

Landscape




Source

Panorama




Source

Failures

As previously stated, the images above were chosen because they have very high frequency foregrounds and low frequency backgrounds. This choice plays nicely with the chosen gradient energy function. I also chose a football image, which I would hope by rescaling would push the players closer together. However, the high frequency of the grass and the low frequency of the player uniforms caused seam carving with the gradient energy function to have an adverse effect. Instead of removing the grass, certain parts of the image (such as the players' legs and the lines on the field) were removed instead.



Source

Another image which did not work very well was that of a widescreen monitor. My expectation when scaling such an image would be for the algorithm to remove parts of the screen (and not the stand) until a new aspect ratio is achieved. The gradient across the monitor caused the energy function to be very high and, as a consequence other pixels were removed before those in the middle of the screen. The straight lines are also not preserved since seams are 8-way continuous and not straight lines themselves.



Source

Some parts of my implementation did not work particularly well. By the nature of Python, accessing and modifying large multidimensional arrays is slow. For the smaller images, computing the maps took between 5 and 10 minutes. For the landscape image, which was purposefully chosen to be large, the horizontal and vertical multi-scale maps took a little over an hour each to compute. Migrating my code base to a compiled language, such as C or any of its derivatives, this application could have run in real-time without need to precompute the multi-scale maps. Additionally, it takes a significant amount of time to create an image in PIL and then display it with the Tk GUI so that even with the precomputation, the application is still rather sluggish.

Starry Night

Finally, I decided to try an image with very high frequency in the foreground and the background. Van Gogh's Starry Night was my first choice when searching for a picture with such properties. I expected the seams to be removed somewhat at random when seam carving, but the houses and the stars stayed in tact throughout most of the scaling.


;
{ }