Vision and Autonomous Systems Seminar

  • Gates Hillman Centers
  • Traffic21 Classroom 6501
  • Ph.D. Candidate
  • Interactive Perception and Robot Learning Lab
  • Stanford University

Deep Learning for Understanding Dynamic Visual Data

Perceiving dynamic environments from visual inputs allows autonomous agents to understand and interact with the world, and it is a core topic in Artificial Intelligence. The success of deep learning motivates us to apply its techniques to the perception of dynamic visual data. However, it remains challenging to design and apply deep neural networks that effectively model the dynamic components of visual data and that use motion cues to enable or improve perception applications. In this talk, I will present my past research on deep-learning-based machine perception of dynamic visual data to provide some answers to these questions.

I will start by introducing FlowNet3D, a deep neural network that estimates scene flow between two point clouds at consecutive timestamps in an end-to-end fashion. I will then present CPNet and MeteorNet, two deep learning backbone architectures for learning representations of RGB videos and 3D point cloud sequences, respectively, and show their applications to action recognition, semantic segmentation, and scene flow estimation. Next, I will describe KeyPose, a deep learning architecture for estimating the 3D pose of an object from keypoint locations, as well as a new dataset for studying transparent objects. Finally, I will discuss other potential application domains and directions for future research.

Xingyu Liu received his Ph.D. in Electrical Engineering in Jan 2020 from Stanford University, where he studied in the Interactive Perception and Robot Learning Lab (IPRL) headed by Prof. Jeannette Bohg. He has also collaborated with Prof. Leonidas Guibas at Stanford. During his Ph.D., he spent time in research labs including Google Brain Robotics and Adobe Research. Prior to his Ph.D., he received an M.S. in Computer Science from Stanford University and a B.Eng. in Electronic Engineering from Tsinghua University in China. His research focuses on computer vision and robotics.

Sponsored in part by the Facebook Virtual Reality Lab Pittsburgh
