Appearance Clustering for Scene Analysis

We propose a new approach called “appearance clustering” for scene analysis. The key idea in this approach is that the scene points can be clustered according to their surface normals, even when the geometry, material and lighting are all unknown. We achieve this by analyzing a continuous image sequence of a scene as it is illuminated by a smoothly moving distant source. Each pixel thus gives rise to a “continuous appearance profile” that yields information about derivatives of the BRDF with respect to source direction. This information is directly related to the surface normal of the scene point when the source path follows an unstructured trajectory (obtained, say, by “hand-waving”). Based on this observation, we transform the appearance profiles and propose a metric that can be used with any unsupervised clustering algorithm to obtain iso-normal clusters. We successfully demonstrate appearance clustering for complex indoor and outdoor scenes. In addition, iso-normal clusters serve as excellent priors for scene geometry and can strongly impact any vision algorithm that attempts to estimate material, geometry and/or lighting properties in a scene from images. We demonstrate this impact for applications such as diffuse and specular separation, both calibrated and uncalibrated photometric stereo of non-Lambertian scenes, light source estimation and texture transfer.
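As a rough illustration of grouping pixels by their appearance profiles, the sketch below clusters synthetic Lambertian profiles using a plain normalized-correlation distance. This is a stand-in, not the transformation or metric proposed in the papers: it only cancels per-pixel gain (albedo) and offset, whereas the actual method targets unknown materials. The function names, the Lissajous light path, the threshold and the greedy clustering step are all assumptions made for the example.

```python
import math

def normalize(profile):
    """Return a zero-mean, unit-norm copy of one pixel's brightness profile."""
    mean = sum(profile) / len(profile)
    centered = [v - mean for v in profile]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered

def profile_distance(p, q):
    """1 - correlation: near 0 when profiles differ only by gain and offset."""
    return 1.0 - sum(a * b for a, b in zip(normalize(p), normalize(q)))

def cluster_profiles(profiles, threshold=0.05):
    """Greedy clustering (hypothetical helper): a profile joins the first
    cluster whose representative lies within `threshold`, else starts a new one."""
    representatives, labels = [], []
    for p in profiles:
        for k, rep in enumerate(representatives):
            if profile_distance(p, rep) < threshold:
                labels.append(k)
                break
        else:
            representatives.append(p)
            labels.append(len(representatives) - 1)
    return labels

def lambertian_profile(normal, albedo, lights):
    """Synthetic profile: albedo * max(0, n . l) for each light direction."""
    return [albedo * max(0.0, sum(n * l for n, l in zip(normal, light)))
            for light in lights]

# Assumed stand-in for a hand-waved source: a Lissajous path on the upper hemisphere.
lights = []
for i in range(64):
    u = -math.pi + 2.0 * math.pi * i / 63
    x, y = 0.6 * math.sin(u), 0.5 * math.sin(2.0 * u)
    lights.append((x, y, math.sqrt(1.0 - x * x - y * y)))

normals = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
# Six pixels as (normal index, albedo); pixels sharing a normal differ in albedo.
pixels = [(0, 0.9), (1, 0.3), (0, 0.2), (2, 0.7), (1, 0.8), (2, 0.4)]
profiles = [lambertian_profile(normals[n], a, lights) for n, a in pixels]
labels = cluster_profiles(profiles)
print(labels)  # pixels with the same normal receive the same label
```

Any unsupervised clustering algorithm could replace the greedy step here; the point of the sketch is only that a per-pixel distance on profiles is enough to drive the grouping.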


"Clustering Appearance for Scene Analysis"
S.J. Koppal and S.G. Narasimhan,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
June 2006.

"Appearance Derivatives for Iso-normal Clustering of Scenes"
S.J. Koppal and S.G. Narasimhan,
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI),
August 2009.


"Clustering Appearance for Scene Analysis"
Oral Presentation at CVPR 2006: [PPT]


Experimental Setup:
This picture shows the apparatus used in our appearance clustering experiments. Our acquisition setup consists of a Canon XL2 video camera viewing a static scene and a 60-watt incandescent light source attached to a wooden wand. Note that in the actual experiments the camera and light source are placed much farther away to satisfy the orthographic projection and distant light source assumptions.
Clustering materials in the CUReT Database:
We acquired image sequences of the real sample materials by waving a light source (we did not use the still images distributed by Columbia University). Notice the top row, which contains materials such as artificial grass and straw, and the middle row, with examples of real wool and steel wool. Despite significant appearance differences, these samples cluster together accurately because they share the same surface normal.
Indoor Scenes:
Here are some clustering results for indoor scenes. Note that these are all non-Lambertian scenes that contain materials such as wood, metal and reflective tile.
Photometric Stereo of Non-Lambertian Objects:
We extract the Lambertian terms from a scene and apply photometric stereo. Integrating the normals gives us the 3D shape. We show two views of the structure of the cup. Our clustering and optimization allow algorithms that assume a diffuse model to work with non-Lambertian objects. Here we use Hayakawa’s method ([9]) to get the 3D structure of the books and the corresponding lighting. Note that we obtain only the normals from photometric stereo, and we have to compute the book planes in an extra step.
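For readers unfamiliar with the diffuse model that these algorithms assume, the following is a minimal sketch of classical calibrated Lambertian photometric stereo at a single pixel. It is not Hayakawa's uncalibrated method and not the clustering-based optimization described above: it assumes the light directions are known and that the intensities are purely diffuse and unshadowed. The function names and test values are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3x3 system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions do not span 3D space")
    solution = []
    for col in range(3):
        m = [[b[r] if c == col else a[r][c] for c in range(3)] for r in range(3)]
        solution.append(det3(m) / d)
    return solution

def photometric_stereo_pixel(lights, intensities):
    """Least-squares fit of I_k = albedo * (n . l_k) over all lights.

    Solves the normal equations (L^T L) g = L^T I for g = albedo * n,
    then splits g into a unit normal and an albedo.
    """
    ltl = [[sum(l[i] * l[j] for l in lights) for j in range(3)] for i in range(3)]
    lti = [sum(l[i] * v for l, v in zip(lights, intensities)) for i in range(3)]
    g = solve3(ltl, lti)
    albedo = math.sqrt(sum(c * c for c in g))
    normal = [c / albedo for c in g] if albedo > 0 else [0.0, 0.0, 0.0]
    return normal, albedo

# Synthetic check: render one diffuse pixel under four known lights, then recover it.
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (-0.6, 0.0, 0.8)]
true_normal, true_albedo = (0.36, 0.48, 0.8), 0.7
intensities = [true_albedo * sum(n * l for n, l in zip(true_normal, light))
               for light in lights]
normal, albedo = photometric_stereo_pixel(lights, intensities)
print(normal, albedo)
```

Specular highlights violate the linear model in this fit, which is why the Lambertian terms have to be extracted before a step like this can be applied to non-Lambertian objects.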
Texture transfer:
Complex materials (such as satin and velvet) are transferred between similar surface normals in a scene. A patch of the original scene is chosen by the user and a simple repetitive texture synthesis method is used to transfer this patch onto other areas of the scene with the same surface normal. Note the consistency in geometry and lighting in the transferred regions.
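A toy version of the transfer step might look like the sketch below, which tiles a user-chosen patch over every pixel carrying a given cluster label. Plain tiling is only a stand-in for the "simple repetitive texture synthesis method" mentioned above, and the label map is assumed to come from the iso-normal clustering; `transfer_texture` and its arguments are hypothetical names.

```python
def transfer_texture(image, labels, patch, target_label):
    """Tile `patch` over all pixels whose cluster label equals `target_label`.

    `image`, `labels` and `patch` are row-major nested lists; `image` and
    `labels` must have the same shape. Pixels with other labels are unchanged.
    """
    patch_h, patch_w = len(patch), len(patch[0])
    out = [row[:] for row in image]  # copy so the input image is not modified
    for y, label_row in enumerate(labels):
        for x, label in enumerate(label_row):
            if label == target_label:
                out[y][x] = patch[y % patch_h][x % patch_w]
    return out

# Tiny example: a 3x4 image, a label map with three clusters, and a 2x2 patch.
image = [[0] * 4 for _ in range(3)]
labels = [[0, 1, 1, 0],
          [0, 1, 1, 1],
          [2, 2, 1, 1]]
patch = [[5, 6],
         [7, 8]]
result = transfer_texture(image, labels, patch, target_label=1)
for row in result:
    print(row)
```

Because the source patch and the target pixels share a surface normal, copying pixel values directly keeps the shading plausible, which is what makes such a simple scheme usable at all.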


(Video Result Playlist)
CVPR 2006 Video (use Apple QuickTime 6.0):
This video is a compilation of the main results of this project (30 MB).
Clustering Examples (use Apple QuickTime 6.0):
This video is a compilation of the clustering results on different scenes.
Applications (use Apple QuickTime 6.0):
We show different applications of our method including uncalibrated photometric stereo, texture transfer and separation of specular and diffuse video components.