It seems attractive to reduce visual understanding to a classification problem that is solved by "big data". For example, one may label an image as a car versus a person by comparing it to a large training set.
However, understanding involves more than just classification. In this talk, I will focus on recognition systems that generate descriptive reports: How many people/cars are present? What are their 3D shapes? Where will they be in the near future?
One challenge in designing such analytic systems is the poverty of the stimulus: there exists a "long tail" of rare scenes that have never before been seen. Classic approaches address this issue through three-dimensional geometric models that explicitly synthesize novel object shapes and scenes. Contemporary methods learn statistical classifiers from large collections of training data. In this talk, I will discuss an approach that combines these schools of thought through latent-variable statistical models that synthesize new objects/scenes during recognition. The resulting systems produce state-of-the-art performance on a variety of established benchmark tasks, including object detection, human analysis, and spatiotemporal tracking.
Deva Ramanan is an associate professor of Computer Science at the University of California at Irvine. Prior to joining UCI, he was a Research Assistant Professor at the Toyota Technological Institute at Chicago. He received his B.S. in computer engineering from the University of Delaware in 2000, graduating summa cum laude. He received his Ph.D. in Electrical Engineering and Computer Science from UC Berkeley in 2005 under the supervision of David Forsyth. His research interests span computer vision, machine learning, and computer graphics, with a focus on visual recognition.
Ramanan was awarded the David Marr Prize in 2009, the PASCAL VOC Lifetime Achievement Prize in 2010, an NSF CAREER Award in 2010, the UCI Chancellor's Award for Excellence in Undergraduate Research in 2011, and the Outstanding Young Researcher in Image and Vision Computing Award in 2012, and was selected as one of Popular Science's Brilliant 10 researchers in 2012. His work is supported by NSF, ONR, and DARPA, as well as by industrial collaborations with the Intel Science and Technology Center for Visual Computing, Google Research, and Microsoft Research. He has held visiting researcher positions at the Robotics Institute at CMU and the Visual Geometry Group at Oxford, and has been a consultant for Microsoft and Google. He is on the editorial board of the International Journal of Computer Vision (IJCV) and is an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). He regularly serves as a senior program committee member for the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Computer Vision (ICCV), and the European Conference on Computer Vision (ECCV). He also regularly serves on NSF panels for computer vision and machine learning.
Faculty Host: Martial Hebert
khibner [atsymbol] cs ~replace-with-a-dot~ cmu ~replace-with-a-dot~ edu