Aayush Bansal

I received a PhD in Robotics from Carnegie Mellon University for my work on unsupervised learning of the 4D audio-visual world from sparse unconstrained real-world samples. During the course of my graduate studies, I received an Uber Presidential Fellowship for 2016-17 and was named a Presidential Fellow at CMU, a Qualcomm Fellowship for 2017-18, and a Snap Fellowship for 2019. I am also fortunate to have collaborated with many production houses in the last few years.

Email  /  CV  /  Google Scholar  /  GitHub  /  Thesis Talk


Task-Agnostic Exemplar Representations: I develop methods that learn a representation for a given signal in an unsupervised manner, without knowledge of the downstream task. Without any modification, the learned model drives a wide variety of downstream tasks. In my work, I have explored exemplar representations for various signals such as audio, video, and images.

My goal is to build artificial intelligence based on task-and-domain-agnostic exemplar representations learned in an unsupervised manner. Here are three ongoing projects:
1. Robots: We are developing a new generation of robots that are not restricted to a particular task and can operate in any environment.
2. Commonsense Reasoning: A significant benefit of exemplar and task-agnostic unsupervised learning is the ability to use streams of audio-visual data in a never-ending manner. We are building an ever-expanding memex that will enable commonsense reasoning in artificial intelligence and robots.
3. Ultimate Communication Device: The major challenge in communication is to build a signal that genuinely represents our experiences. We are building intelligent systems that enable us to reconstruct our experiences and share them.

Video-Specific Autoencoders for
Exploring, Editing and Transmitting Videos

Kevin Wang, Deva Ramanan, Aayush Bansal

project page / paper / summary / code

Unsupervised Audiovisual Synthesis via Exemplar Autoencoders
Kangle Deng, Aayush Bansal, Deva Ramanan
International Conference on Learning Representations (ICLR), 2021
project page / paper / open-review / code


Streaming Self-Training via
Domain-Agnostic Unlabeled Images

Zhiqiu Lin, Deva Ramanan, Aayush Bansal

project page / paper / arXiv


Stereo Radiance Fields:
Learning View Synthesis for Sparse Views of Novel Scenes

Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll
Computer Vision and Pattern Recognition (CVPR), 2021
project page / paper / code

4D Visualization of Dynamic Events from
Unconstrained Multi-View Videos

Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan
Computer Vision and Pattern Recognition (CVPR), 2020
project page / paper / arXiv / summary / multi-view sequences
CMU Press Coverage

Shapes and Context: In-the-wild Image Synthesis & Manipulation
Aayush Bansal, Yaser Sheikh, Deva Ramanan
Computer Vision and Pattern Recognition (CVPR), 2019
(Oral Presentation, Best Paper Award Finalist)
project page / paper / arXiv / five minutes / CVPR Talk
web-app (beta version) / code / demo video

Recycle-GAN: Unsupervised Video Retargeting
Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh
European Conference on Computer Vision (ECCV), 2018
project page / pre-print / arXiv / code / one minute
CMU Press Coverage


PixelNN: Example-based Image Synthesis
Aayush Bansal, Yaser Sheikh, Deva Ramanan
International Conference on Learning Representations (ICLR), 2018
project page / arXiv / paper / codes


PixelNet: Representation of the pixels, by the pixels, and for the pixels.
Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
project page / arXiv / codes


Marr Revisited: 2D-3D Alignment via Surface Normal Prediction
Aayush Bansal, Bryan Russell, Abhinav Gupta
Computer Vision and Pattern Recognition (CVPR), 2016
project page / arXiv preprint / codes


Patch Correspondences for Interpreting Pixel-level CNNs
Victor Fragoso, Chunhui Liu, Aayush Bansal, Deva Ramanan


Mid-level Elements for Object Detection
Aayush Bansal, Abhinav Shrivastava, Carl Doersch, Abhinav Gupta
arXiv / analysis


Towards Transparent Systems: Semantic Characterization of Failure Modes
Aayush Bansal, Ali Farhadi, Devi Parikh
European Conference on Computer Vision (ECCV), 2014
project page / supplementary material
VT NEWS / Virginia Center for Autonomous Systems


Understanding How Camera Configuration and Environmental Conditions Affect Appearance-based Localization
Aayush Bansal, Hernan Badino, Daniel Huber
IEEE Intelligent Vehicles (IV), 2014
project page


Which Edges Matter?
Aayush Bansal, Adarsh Kowdle, Devi Parikh, Andrew Gallagher, Larry Zitnick
Workshop on 3D Representation and Recognition (3dRR) at ICCV, 2013
project page / presentation


CANINE: A robotic mine dog
B. A. Stancil, J. Hyams, J. Shelly, K. Babu, H. Badino, A. Bansal, D. Huber, P. Batavia
IS&T Conference on Electronic Imaging (SPIE), 2013
project page / video / competition rules

I wrote the object detection module for this robot.

New Art Display


Geometry-based Methods in Computer Vision (16-822), CMU
Teaching Assistant (TA) with Prof Martial Hebert
Fall 2017

Computer Vision (16-720), CMU
Teaching Assistant (TA) with Prof Srinivasa Narasimhan
Spring 2015


CMU AI Seminar Series
Smith Hall Messiest Desk Award
Unsung heroes in Science: The ones I have read about so far.

Thanks to Jon Barron for the webpage design!