Nick Rhinehart

Ph.D. Candidate, CMU






About Me

I'm a Ph.D. candidate at the Robotics Institute within the School of Computer Science at Carnegie Mellon University.

"How should we learn, interpret, quantify, and leverage models that reason about the future?"

Towards this question and others, I work on Reinforcement Learning and Imitation Learning methods at the interface of Computer Vision and Machine Learning. I'm specifically interested in building decision-theoretic models that leverage rich perception sources to drive activity forecasting, functional understanding, general prediction, and general control tasks. My research interests include forward and inverse reinforcement learning, imitation learning, activity analysis, generative modeling, egocentric vision, and recognition. I currently collaborate with Kris Kitani, Sergey Levine, Paul Vernaza, and Drew Bagnell.

In the past, I've worked with Paul Vernaza and Manmohan Chandraker at NEC Labs America, Yoichi Sato and Ryo Yonetani at The University of Tokyo, and Drew Bagnell at Uber ATG. I graduated from Swarthmore College with a degree in CS and a degree in Engineering. At Swarthmore I worked with Matt Zucker.


I am currently on the faculty job market.


News



Publications


Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
M. Sharma, A. Sharma, N. Rhinehart, K. M. Kitani

ICLR 2019


Learning a single policy from demonstration for a complex, hierarchical task can be challenging. We propose a new algorithm based on the generative adversarial imitation learning framework which automatically learns sub-task policies from unsegmented demonstrations. Our approach maximizes the directed information flow in the graphical model between sub-task latent variables and their generated trajectories. We also show how our approach connects with the existing Options framework, which is commonly used to learn hierarchical policies.
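As a rough illustration (not the authors' code; names and interfaces are hypothetical), the directed-information term is typically handled with a variational lower bound: a learned inference network scores how well a trajectory prefix reveals the latent sub-task code it was generated under, and that score is added to the imitation reward.

```python
import math

def latent_code_bonus(code_posterior, code):
    """Reward bonus from a variational lower bound on directed information:
    the log-probability the inference network assigns to the latent sub-task
    code the policy actually conditioned on while generating the trajectory.

    code_posterior: list of probabilities over latent codes, produced by a
    learned inference network (hypothetical interface).
    code: index of the conditioning code.
    """
    return math.log(code_posterior[code])

# A posterior confident in the conditioning code yields a larger bonus,
# which pushes the sub-policies toward distinguishable behaviors.
```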

arXiv • openreview



Deep Imitative Models for Flexible Inference, Planning, and Control
N. Rhinehart, R. McAllister, S. Levine

arXiv 2018


We aim to combine the benefits of imitation learning and model-based reinforcement learning (MBRL), and propose imitative models: probabilistic predictive models able to plan expert-like trajectories to achieve arbitrary goals. We find this method substantially outperforms both direct imitation and MBRL in a simulated autonomous driving task, and can be learned efficiently from a fixed set of expert demonstrations without additional online data collection. We also show our model can flexibly incorporate user-supplied costs at test time, can plan to sequences of goals, and can even perform well with imprecise goals, including goals on the wrong side of the road.
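A toy sketch of the test-time planning idea, under assumed interfaces: the real model is a deep autoregressive density over trajectories, while the `plan` function, the given `log_prior` values, and the squared-distance goal term below are all illustrative stand-ins.

```python
def plan(candidates, log_prior, goal, goal_weight=1.0):
    """Score each candidate trajectory by imitation prior plus a goal term
    and return the index of the best one.

    candidates: list of trajectories, each a list of (x, y) waypoints.
    log_prior: log-probability of each candidate under the learned
               imitative model (given here as plain numbers).
    goal: (x, y) target; the goal term is a hypothetical squared-distance
          penalty on the trajectory endpoint.
    """
    def score(i):
        ex, ey = candidates[i][-1]
        gx, gy = goal
        return log_prior[i] - goal_weight * ((ex - gx) ** 2 + (ey - gy) ** 2)

    return max(range(len(candidates)), key=score)
```

Because the imitation prior is part of the score, a trajectory that reaches the goal via implausible motion can lose to a slightly less direct but expert-like one.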

arXiv • project page • youtube



First-Person Activity Forecasting from Video with Online Inverse Reinforcement Learning
N. Rhinehart, K. Kitani

TPAMI 2018


We address the problem of incrementally modeling and forecasting long-term goals of a first-person camera wearer: what the user will do, where they will go, and what goal they seek. In contrast to prior work in trajectory forecasting, our algorithm, DARKO, goes further to reason about semantic states (will I pick up an object?), and future goal states that are far in terms of both space and time. DARKO learns and forecasts from first-person visual observations of the user's daily behaviors via an Online Inverse Reinforcement Learning (IRL) approach. Classical IRL discovers only the rewards in a batch setting, whereas DARKO discovers the transitions, rewards, and goals of a user from streaming data. Among other results, we show DARKO forecasts goals better than competing methods in both noisy and ideal settings, and our approach is theoretically and empirically no-regret.
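The online flavor of the reward update can be sketched generically; this is textbook MaxEnt-IRL-style gradient ascent on linear reward weights, not DARKO's exact update, and all names are illustrative.

```python
def online_irl_step(theta, demo_features, policy_features, lr=0.1):
    """One generic online IRL update: move linear reward weights toward the
    feature counts of observed demonstrations and away from those expected
    under the current policy. Each argument is an equal-length list of
    feature values; streaming data means this runs once per new observation
    batch rather than over a fixed dataset.
    """
    return [t + lr * (d - p)
            for t, d, p in zip(theta, demo_features, policy_features)]
```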

get TPAMI pdf • conference pdf • project page • IEEE



R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting
N. Rhinehart, K. M. Kitani, P. Vernaza

ECCV 2018


We propose a method to forecast a vehicle's ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both diverse (produces most paths likely under the data) and precise (mostly produces paths likely under the data). This balance is achieved through minimization of a symmetrized cross-entropy between the distribution and demonstration data. We propose concrete policy architectures for this model, discuss our evaluation metrics relative to previously-used metrics, and demonstrate the superiority of our method relative to state-of-the-art methods in both the KITTI dataset and a similar but novel and larger real-world dataset explicitly designed for the vehicle forecasting domain.
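A discrete toy version of the symmetrized cross-entropy objective may help fix the diversity/precision intuition. The paper's model is a continuous pushforward distribution; here `p_tilde` stands in for the cheap approximation of the data density, and the trajectory space is collapsed to a handful of bins.

```python
import math

def symmetrized_cross_entropy(p_data, q_model, p_tilde, beta=1.0):
    """H(p, q) grows when q misses data modes (penalizes low diversity);
    H(q, p_tilde) grows when q puts mass where the data has none
    (penalizes low precision). p_tilde approximates the data density,
    since the true density is unavailable. All arguments are discrete
    distributions over the same trajectory bins."""
    h_pq = -sum(p * math.log(q) for p, q in zip(p_data, q_model))
    h_qp = -sum(q * math.log(pt) for q, pt in zip(q_model, p_tilde))
    return h_pq + beta * h_qp
```

A model covering both data modes scores better than one that collapses onto a single mode, which is the behavior the objective is designed to enforce.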

pdf • supplement • project page • dataset (soon) • talk • short video • poster



Learning Neural Parsers with Deterministic Differentiable Imitation Learning
T. Shankar, N. Rhinehart, K. Muelling, K. M. Kitani

CORL 2018


We pose the segmentation problem as an imitation learning problem, using in place of an expert a segmentation oracle that has access to a small dataset with known foreground-background segmentations. We introduce a novel deterministic policy gradient update, DRAG, in the form of a deterministic actor-critic variant of AggreVaTeD, to train our neural-network-based object parser. We also show that our approach can be seen as extending DDPG to the imitation learning scenario. Training our neural parser to imitate the oracle via DRAG allows it to outperform several existing imitation learning approaches.

arXiv • code



Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning
X. Pan, E. Ohn-Bar, N. Rhinehart, Y. Xu, Y. Shen, K. M. Kitani

AAMAS 2018


We analyze the benefit of incorporating a notion of subgoals into Inverse Reinforcement Learning (IRL) with a Human-In-The-Loop (HITL) framework. The learning process is interactive, with a human expert first providing input in the form of full demonstrations along with some subgoal states. These subgoal states define a set of sub-tasks for the learning agent to complete in order to achieve the final goal. We demonstrate that subgoal-based interactive structuring of the learning task results in significantly more efficient learning, requiring only a fraction of the demonstration data needed for learning the underlying reward function with a baseline IRL model.

arXiv



N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning
A. Ashok, N. Rhinehart, F. Beainy, K. Kitani

ICLR 2018


Conventional model compression methods modify the architecture manually or using pre-defined heuristics. We introduce a principled method for learning reduced network architectures with reinforcement learning. Our experiments show that we can achieve compression rates of more than 10x for models such as ResNet-34 while maintaining similar performance to the input `teacher' network. We also present a valuable transfer learning result which shows that policies which are pre-trained on smaller `teacher' networks can be used to rapidly speed up training on larger `teacher' networks.
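One way to picture the reinforcement signal is a reward that trades parameter reduction against retained accuracy. The sketch below is hypothetical and differs in form from the paper's actual reward; it only illustrates the shaping idea.

```python
def compression_reward(student_acc, teacher_acc, student_params, teacher_params):
    """Hypothetical shaped reward for an architecture-compression policy:
    the relative parameter reduction, scaled by the fraction of the
    teacher's accuracy the compressed student network retains."""
    compression = 1.0 - student_params / teacher_params
    retention = student_acc / teacher_acc
    return compression * retention
```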

arXiv • openreview • code



Predictive-State Decoders: Encoding the Future Into Recurrent Neural Networks
N. Rhinehart*, A. Venkataraman*, W. Sun, L. Pinto, M. Hebert, B. Boots, K. Kitani, J. A. Bagnell

NIPS 2017


RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding their analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations. PSDs are simple to implement and easily incorporated into existing training pipelines via additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, Imitation Learning, and Reinforcement Learning. In each, our method improves statistical performance of state-of-the-art recurrent baselines and does so with fewer iterations and less data.
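The regularizer can be sketched in a few lines (illustrative names and shapes; in practice the decoder is trained jointly with the RNN rather than passed in fixed).

```python
def psd_regularized_loss(task_loss, hidden_states, future_obs, decode, lam=0.1):
    """Predictive-State Decoder sketch: a decoder maps each recurrent hidden
    state to upcoming observations, and its mean squared prediction error is
    added to the task loss as a regularizer.

    hidden_states: list of hidden-state vectors, one per timestep.
    future_obs: list of observation vectors the decoder should predict.
    decode: function mapping a hidden state to a predicted observation.
    """
    errors = [
        sum((p - o) ** 2 for p, o in zip(decode(h), obs))
        for h, obs in zip(hidden_states, future_obs)
    ]
    aux = sum(errors) / len(errors)
    return task_loss + lam * aux
```

When the internal state already predicts future observations, the auxiliary term vanishes and only the task loss remains, so the supervision is free for states that are already "predictive".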

arXiv



First-Person Activity Forecasting with Online Inverse Reinforcement Learning
N. Rhinehart, K. Kitani

ICCV 2017

Marr Prize (Best Paper) Honorable Mention Award.

We address the problem of incrementally modeling and forecasting long-term goals of a first-person camera wearer: what the user will do, where they will go, and what goal they seek. In contrast to prior work in trajectory forecasting, our algorithm, DARKO, goes further to reason about semantic states (will I pick up an object?), and future goal states that are far in terms of both space and time. DARKO learns and forecasts from first-person visual observations of the user's daily behaviors via an Online Inverse Reinforcement Learning (IRL) approach. Classical IRL discovers only the rewards in a batch setting, whereas DARKO discovers the transitions, rewards, and goals of a user from streaming data. Among other results, we show DARKO forecasts goals better than competing methods in both noisy and ideal settings, and our approach is theoretically and empirically no-regret.

project page • recorded talk • MaxEnt code • get TPAMI pdf • conference pdf


Learning Action Maps of Large Environments Via First-Person Vision
N. Rhinehart, K. Kitani

CVPR 2016


Our goal is to automate dense functional understanding of large spaces by leveraging sparse activity demonstrations recorded from an ego-centric viewpoint. We demonstrate that by capturing appearance-based attributes of the environment and associating these attributes with activity demonstrations, our proposed mathematical framework allows for the prediction of Action Maps in new environments. Additionally, we offer a preliminary glance of the applicability of Action Maps by demonstrating a proof-of-concept application in which they are used in concert with activity detections to perform localization.

pdf • slides • arXiv • IEEE



Visual Chunking: A List Prediction Framework for Region-Based Object Detection
N. Rhinehart, J. Zhou, M. Hebert, J. A. Bagnell

ICRA 2015


We designed an imitation learning approach for the task of sequential region-based object detection; many other approaches resort to ad hoc procedures (e.g., non-maximum suppression) to filter independent detections. We present an efficient algorithm with provable performance for building a high-quality list of detections from any candidate set of region-based proposals. We also develop a simple class-specific algorithm to generate a candidate region instance in near-linear time in the number of low-level superpixels that outperforms other region-generating methods. We demonstrate that our new approach outperforms sophisticated baselines on benchmark datasets.

pdf • poster (key) • poster (pdf) • youtube




Unrefereed Research


Flight Autonomy in Obstacle-Dense Environments
N. Rhinehart, D. Dey, J. A. Bagnell
Robotics Institute Summer Scholars Symposium, August 2011;
Sigma-Xi Research Symposium, October 2011
poster (pdf) • youtube


Other Unrefereed Projects


Fast SFM-Based Localization of Temporal Sequences and Ground-Plane Hypothesis Consensus
Project for 16-822 Geometry Based Methods in Computer Vision, May 2015
pdf • video (mp4)
Online Anomaly Detection in Video
Project for 16-831 Statistical Techniques in Robotics, December 2014
pdf
Autonomous Localization and Navigation of Humanoid Robot
Swarthmore College Senior Thesis Project, May 2012
pdf


Miscellaneous Projects

Miscellaneous old projects





© Nick Rhinehart