Lerrel Pinto

I am a second-year PhD student at The Robotics Institute at Carnegie Mellon University, advised by Abhinav Gupta. I recently concluded a wonderful internship at OpenAI with Wojciech Zaremba and Pieter Abbeel.

My research interests revolve around robotics, computer vision and big data. The focus is specifically on learning robotic tasks by leveraging large databases of robot information.


Recent Updates:

Press Coverage

Learning to Fly

Scaling up self supervised learning

Gizmodo Review, SBS Review

Self supervised Grasping

MIT Tech Review Futurism Tech Review Gizmodo Review direct industry Tech Review Tech Xplore Review Tech Xplore Review

Research and Selected Projects


Asymmetric Actor Critic for Image-Based Robot Learning
Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, Pieter Abbeel
To appear in RSS 2018
arXiv, Video, Blog

Robotics poses many challenges for RL. Most notably, training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. In this work, we exploit the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images). We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) gets rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real robot experiments for several tasks like picking, pushing, and moving a block. We achieve this simulation-to-real-world transfer without training on any real-world data.
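
The asymmetric-input idea can be illustrated with a toy sketch. This is not the paper's implementation: the one-step task, the reward, the noisy `render` function standing in for image rendering, and the linear actor and quadratic critic are all assumptions made for illustration. The only point it shows is the data flow, where the critic is fit on the full state while the actor only ever sees the observation.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(states):
    """Hypothetical stand-in for rendering: a noisy view of the true state."""
    return states + 0.05 * rng.standard_normal(states.shape)

def critic_features(states, actions):
    """Quadratic features of the FULL simulator state and the action."""
    return np.stack(
        [states**2, states * actions, actions**2, np.ones_like(states)], axis=1
    )

theta = 0.0  # actor: action = theta * observation (it never sees the state)

for _ in range(200):
    states = rng.uniform(-1.0, 1.0, size=256)
    obs = render(states)
    actions = theta * obs + 0.3 * rng.standard_normal(256)  # exploration noise
    rewards = -(states + actions) ** 2  # toy one-step reward (assumption)

    # Critic: least-squares fit of Q(state, action), using full states.
    w, *_ = np.linalg.lstsq(critic_features(states, actions), rewards, rcond=None)

    # Actor: ascend dQ/da at the actor's own action; the gradient flows
    # through the observation-based policy, but Q itself is state-based.
    policy_actions = theta * obs
    dq_da = w[1] * states + 2.0 * w[2] * policy_actions
    theta += 0.5 * np.mean(dq_da * obs)
```

In this toy problem the best action is the negative of the state, so `theta` should settle close to -1 even though the actor is trained only through observations.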


CASSL: Curriculum Accelerated Self-Supervised Learning
Adithyavairavan Murali, Lerrel Pinto, Dhiraj Gandhi, Abhinav Gupta
To appear in ICRA 2018
arXiv, Video

Scaling the self-supervised learning framework for high-dimensional control requires either scaling up the data collection effort or using a clever sampling strategy for training. We present a novel approach to train policies that map visual information to high-level, higher-dimensional action spaces and apply it to grasping using an adaptive underactuated gripper.



Learning to Fly by Crashing
Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta
IROS 2017
arXiv, Video

How do you learn to navigate an Unmanned Aerial Vehicle (UAV) and avoid obstacles? In this paper, we propose to bite the bullet and collect a dataset of crashes itself! We build a drone whose sole purpose is to crash into objects: it samples naive trajectories and crashes into random objects. We crash our drone 11,500 times to create one of the biggest UAV crash datasets. We show that simple self-supervised models learnt on this data are quite effective at navigating the UAV even in extremely cluttered environments with dynamic obstacles including humans.


Robust Adversarial Reinforcement Learning
Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta
ICML 2017
arXiv, Video

This paper proposes the idea of robust adversarial reinforcement learning (RARL), where we train an agent to operate in the presence of a destabilizing adversary that applies disturbance forces to the system. Extensive experiments in multiple environments (InvertedPendulum, HalfCheetah, Swimmer, Hopper and Walker2d) conclusively demonstrate that our method (a) improves training stability; (b) is robust to differences in training/test conditions; and (c) outperforms the baseline even in the absence of the adversary.



Supervision via Competition: Robot Adversaries for Learning Tasks
Lerrel Pinto, James Davidson and Abhinav Gupta
ICRA 2017
arXiv, Video, Grasp Detection Code

Due to the large number of experiences required for robot learning, recent approaches use a self-supervised paradigm: using sensors to measure success/failure. However, in most cases, these sensors provide weak supervision at best. In this work, we propose an adversarial learning framework that pits an adversary against the robot learning the task.

In an effort to defeat the adversary, the original robot learns to perform the task with more robustness, leading to overall improved performance. By grasping 82% of presented novel objects compared to 68% without an adversary, we demonstrate the utility of creating adversaries. We also demonstrate that having robots in an adversarial setting might be a better learning strategy than having multiple collaborative robots.


Learning to Push by Grasping: Using multiple tasks for effective learning
Lerrel Pinto and Abhinav Gupta
ICRA 2017
arXiv, Video

End-to-end learning approaches are often critiqued due to their huge data requirements for learning a task. But do end-to-end approaches need to learn a unique model for every task?

In this paper, we attempt to take the next step in data-driven end-to-end learning frameworks: move from the realm of task-specific models to joint learning of multiple robot tasks. In a surprising result, we show that models with multi-task learning tend to perform better than task-specific models trained with the same amount of data.


The Curious Robot: Learning Visual Representations via Physical Interactions
Lerrel Pinto, Dhiraj Gandhi, Yuanfeng Han, Yong-Lae Park and Abhinav Gupta
Spotlight Presentation at ECCV 2016

What is the right supervisory signal to train visual representations? In the case of biological agents, visual representation learning does not require semantic labels. In fact, we argue that biological agents use active exploration and physical interactions with the world to learn visual representations, unlike current vision systems which just use passive observations (images and videos downloaded from the web). Towards this goal, we build one of the first systems on a Baxter platform that pushes, pokes, grasps and actively observes objects in a tabletop environment.

We use these physical interactions to collect more than 130K datapoints, with each datapoint providing backprops to a shared ConvNet architecture, allowing us to learn visual representations. We show the quality of learned representations by observing neuron activations and performing nearest neighbor retrieval on this learned representation. Finally, we evaluate our learned ConvNet on different image classification tasks.


Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours
Lerrel Pinto and Abhinav Gupta
Best Student Paper Award at ICRA 2016
arXiv / raw data / patch dataset


Two Stage GP-UCB for Open Loop Grasping
Lerrel Pinto guided by Siddhartha Srinivasa
Course Project for Manipulation Algorithms (Fall 2015)
pdf / video

In this project I attempt to learn open loop grasping parameters by implementing a two stage Gaussian Process (GP) based Upper Confidence Bound (UCB) bandit solving algorithm. Apart from showing successful grasp policies learnt from trial and error interactions, I present a method of motor grasp verification using soft fingers via 'tandem grasp'. I also present a brief analysis of the proposed bandit algorithm.
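
As a rough illustration of the GP-UCB idea, here is a minimal single-stage sketch; the two-stage variant used in the project is not reproduced. The RBF kernel, its length scale, the `beta` exploration weight, and the 1-D `toy_grasp_score` reward are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def gp_ucb(reward_fn, candidates, n_rounds=15, beta=2.0):
    """Run GP-UCB over a fixed candidate set; return sampled points and rewards."""
    xs = [candidates[len(candidates) // 2]]  # arbitrary first pull
    ys = [reward_fn(xs[0])]
    for _ in range(n_rounds - 1):
        mean, std = gp_posterior(np.array(xs), np.array(ys), candidates)
        x_next = candidates[np.argmax(mean + beta * std)]  # optimism under uncertainty
        xs.append(x_next)
        ys.append(reward_fn(x_next))
    return np.array(xs), np.array(ys)

def toy_grasp_score(x):
    """Hypothetical smooth success score for a 1-D grasp parameter, peaking at 0.7."""
    return float(np.exp(-((x - 0.7) ** 2) / 0.02))

grid = np.linspace(0.0, 1.0, 101)
xs, ys = gp_ucb(toy_grasp_score, grid)
best_x = xs[np.argmax(ys)]
```

Each round picks the candidate with the highest upper confidence bound (posterior mean plus `beta` times posterior standard deviation), so early pulls explore uncertain regions and later pulls concentrate near the best observed parameter.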


Image2Vec: Learning word and image representations for reasoning
Gunnar A. Sigurdsson and Lerrel Pinto
Course Project for Learning Based Methods in Vision (Spring 2015)

In this project, we learn a joint word and image space embedding. This would enable us to both predict image descriptions given an image and predict an image given an image description. These models are solved using block coordinate descent, stochastic gradient descent and Orthogonal Matching Pursuit. We further show image summarization results using weak tags from Flickr.


Finding minimum energy trajectories of a two linked pendulum
Lerrel Pinto
Course Project for Math Fundamentals in Robotics (Fall 2014)
pdf / slides / video

Given an end effector start and goal position, I find a trajectory that requires zero energy/torque to execute. All the energy supplied is in the form of initial angular velocity. This is done via a greedy search through a database of zero energy trajectories generated by exploiting the chaotic dynamics of the manipulator.


Development of a Segway Robot for an Intelligent Transport System
Lerrel Pinto, Dong-Hyung Kim, Ji Yeong Lee, and Chang-Soo Han
2012 IEEE/SICE International Symposium on System Integration (SII)


Delay Handling for an Adaptive Control of a Remotely Operated Robotic Manipulator
Lerrel Pinto, Krishanu Roy, Bhaben Kalita and S.K. Dwivedy
Proceedings of the 1st International and 16th National Conference on Machines and Mechanisms (iNaCoMM2013)

We study time-delayed control of robotic arms and the effect of varying time delays on the effectiveness of such robots. We also designed and developed a system to control robots at IITG's Virtual Robotics Lab remotely via the internet.


Design and Development of a Walking Apparatus
Lerrel Pinto and Shreeyash Lalit
(Bachelor Thesis Project advised by S.K. Dwivedy)
Short Paper / Full Report / Video

Designed, fabricated and controlled an autonomous mid-sized all-terrain quadruped. The novelty lies in cost effectiveness (under $1000) and energy efficiency (20W). Further contributions cover the passive dynamics and the development of energy-efficient swing leg control for quadrupedal robots.
