Advances in machine learning over the last decade have led to `digital intelligence', i.e. machine learning models that learn from vast amounts of labeled data to perform digital tasks such as speech recognition, face recognition, and machine translation. The goal of this thesis is to make progress toward `physical intelligence', i.e. designing algorithms for intelligent autonomous navigation agents that learn to perform complex navigation tasks in the physical world, tasks involving visual perception, natural language understanding, reasoning, planning, and sequential decision making. In the first part of the talk, I will discuss our past work on short-term navigation, tackling challenges such as obstacle avoidance, semantic perception, language grounding, and reasoning. In the second part, I will discuss our work on long-term navigation, tackling challenges such as localization, mapping, long-term planning, and exploration. Finally, I will discuss the goals of this thesis, which include building long-term navigation models capable of both spatial and semantic understanding, as well as using domain adaptation techniques to transfer these models from simulation to real robots.
Ruslan Salakhutdinov (Chair)
Jitendra Malik (University of California Berkeley)
Zoom Participation Enabled. See announcement for details.