The appearance of an outdoor scene is determined to a great extent by the prevailing illumination conditions. A photographer’s quest for the perfect light is testament to the wide variety of beautiful effects that are created by natural light. But while it enables stunning photography, illumination creates challenging conditions for computer vision and graphics algorithms. How can we make computers understand and synthesize images when lighting variations change the actual pixel values so much? Because of this challenge, the vast majority of previous work has focused on developing invariants: representations that stay constant even in the presence of lighting changes.
In this talk, I will argue that we should move beyond invariance and adopt a different philosophy: we should embrace illumination! We should actively try to understand and exploit its effects, even in the challenging, uncontrolled world of consumer photographs.
In the first part of the talk, I will present a method for estimating the likely illumination conditions of the scene given a single outdoor image. In particular, we estimate the probability distribution over the relative sun position with respect to the camera. The method relies on a combination of weak cues that can be extracted from different portions of the image: the sky, the vertical surfaces, the ground, and the convex objects in the image. While no single cue can reliably estimate illumination by itself, each one can reinforce the others to yield a more robust estimate. We present both quantitative and qualitative results obtained on consumer-grade photographs downloaded online. Using the estimated illumination conditions, we can realistically insert appropriately lit synthetic 3-D objects into the scene. In addition to single images, this method also works on time-lapse sequences. This work was done jointly with Alyosha Efros and Srinivasa Narasimhan at Carnegie Mellon.
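To make the idea of combining weak cues concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method presented in the talk: it considers only the sun azimuth relative to the camera on a 10-degree grid, treats the cues as conditionally independent, and uses made-up cue likelihoods (the peak angles, the concentration values, and the function names `von_mises_likelihood` and `combine_cues` are all hypothetical).

```python
import math

def von_mises_likelihood(angles_deg, peak_deg, kappa):
    """Unnormalized likelihood over azimuth bins, peaked at peak_deg.

    A small kappa gives a broad (weak) cue; a large kappa gives a sharp one.
    """
    return [math.exp(kappa * math.cos(math.radians(a - peak_deg)))
            for a in angles_deg]

def combine_cues(cue_likelihoods):
    """Multiply per-cue likelihoods (assumed conditionally independent)
    under a uniform prior and normalize to a probability distribution."""
    n = len(cue_likelihoods[0])
    log_post = [0.0] * n
    for lik in cue_likelihoods:
        for i, value in enumerate(lik):
            log_post[i] += math.log(value)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Candidate sun azimuths relative to the camera, in degrees.
azimuths = list(range(0, 360, 10))

# Hypothetical cue likelihoods (illustrative numbers only).
sky = von_mises_likelihood(azimuths, 60, 1.0)
# A shadow direction on the ground is assumed ambiguous by 180 degrees,
# so this cue is modeled with two equally likely modes.
shadows = [a + b for a, b in zip(von_mises_likelihood(azimuths, 70, 2.0),
                                 von_mises_likelihood(azimuths, 250, 2.0))]
vertical = von_mises_likelihood(azimuths, 50, 0.5)

posterior = combine_cues([sky, shadows, vertical])
best_azimuth = azimuths[posterior.index(max(posterior))]
sky_alone = combine_cues([sky])
```

In this toy setup, the shadow cue cannot distinguish between two opposite directions on its own, but multiplying it with the broad sky and vertical-surface cues suppresses the spurious mode, and the combined distribution is more peaked than any single cue.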
In the second part of the talk, I will show how we can go even further and extract the entire illumination map from a single image by using Lightbrush™, a user-guided intrinsic imaging system that I helped develop during my stay at Tandent, Inc. I will also show an extension of that technology which enabled us to create the first (ever?) intrinsic videos. This work was done jointly with Andrew Stein while at Tandent, during a 1.5-year hiatus I spent in industry after graduation. I will also take the opportunity to share some of what I have learned about industrial research.
Jean-François Lalonde is currently a Post-Doctoral Associate at Disney Research, Pittsburgh. Previously, he was a Computer Vision Scientist at Tandent, Inc., where he researched computer vision technologies and helped develop Lightbrush™, the first commercial intrinsic imaging application. At SIGGRAPH 2012, he also introduced intrinsic videos, a new technology he developed during his tenure at Tandent. He received a B.Eng. degree in Computer Engineering with honors from Laval University, Canada, in 2004. He earned his M.S. at the Robotics Institute at Carnegie Mellon University in 2006 under Prof. Martial Hebert and received his Ph.D., also from Carnegie Mellon, in 2011 under the supervision of Profs. Alexei A. Efros and Srinivasa G. Narasimhan. His thesis, titled "Understanding and Recreating Appearance under Natural Illumination," won the 2010-11 CMU School of Computer Science Distinguished Dissertation Award, and was partly supported by a Microsoft Research Graduate Fellowship. His work focuses on lighting-aware image understanding and synthesis by leveraging large amounts of data.
Catherine Copetas, copetas [atsymbol] cs.cmu.edu