We present two separate projects in this talk. The first takes a new look at edge detection using a data-driven learning approach. Edges arise from a wide variety of phenomena, which makes hand-designing detectors difficult. Using structured random forests, we obtain state-of-the-art edge detection results while also achieving real-time performance. This is multiple orders of magnitude faster than competing approaches of similar quality. We also demonstrate that our learned edge models generalize well across datasets.
In the second part of our talk, we discuss some recent work on semantic scene understanding using abstract scenes. Abstract scenes, or cartoons, allow us to study high-level concepts in computer vision without depending on often-noisy object, attribute and scene detectors. For instance, using a large database of scenes and sentences, we can automatically infer the mapping between semantics and their visual interpretation. Another application is inferring object dynamics in scenes, i.e., understanding how objects interact temporally. We demonstrate how these interactions depend on a variety of factors, including object positions, motions and attributes.
C. Lawrence Zitnick is a senior researcher in the Interactive Visual Media group at Microsoft Research, Redmond. He is interested in a broad range of topics related to visual object recognition. His current interests include object detection, semantically interpreting visual scenes, and the use of human debugging to identify promising areas for future research in object recognition and detection. He developed the PhotoDNA technology used by Microsoft, Facebook and various law enforcement agencies to combat illegal imagery on the web. Before joining MSR, he received the PhD degree in robotics from Carnegie Mellon University in 2003. In 1996, he co-invented one of the first commercial portable depth cameras.
Faculty Host: Kris Kitani