Abstract

Personal Navigation Devices are useful for obtaining driving directions to new destinations, but they are not very intelligent -- they observe thousands of miles of preferred driving routes but never learn from those observations when planning routes to new destinations. Motivated by this deficiency, we present a novel approach for recovering from demonstrated behavior the preference weights that drivers place on different types of roads and intersections. The approach resolves ambiguities in inverse reinforcement learning (Abbeel and Ng 2004) using the principle of maximum entropy (Jaynes 1957), resulting in a probabilistic model for sequential actions. Using the approach, we model the context-dependent driving preferences of 25 Yellow Cab Pittsburgh taxi drivers from over 100,000 miles of GPS trace data. Unlike previous approaches to this modeling problem, which directly model distributions over actions at each intersection, our approach learns the reasons that make certain routes preferable. Our reason-based model is much more generalizable to new destinations and new contextual situations, yielding significant performance improvements on a number of driving-related prediction tasks.
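For readers unfamiliar with the maximum-entropy formulation the abstract refers to, a standard sketch follows; the symbols (a weight vector θ over road and intersection features, and f_ζ for the feature counts accumulated along a path ζ) are illustrative notation, not taken from the announcement:

```latex
% Maximum-entropy distribution over paths: among all distributions whose
% expected feature counts match the demonstrated routes, the one with
% maximum entropy takes this exponential form.
P(\zeta \mid \theta) = \frac{1}{Z(\theta)} \exp\!\left(\theta^{\top} f_{\zeta}\right),
\qquad
Z(\theta) = \sum_{\zeta'} \exp\!\left(\theta^{\top} f_{\zeta'}\right)
```

Under this model, higher-reward (lower-cost) routes are exponentially more probable, and fitting θ by maximum likelihood on demonstrated trips matches the model's expected feature counts to the observed ones -- which is how preference weights on road and intersection types can be recovered from GPS traces.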
This is joint work with Andrew Maas, Drew Bagnell, and Anind Dey.
Venue, Date, and Time
Venue: NSH 1507
Date: Monday, April 14
Time: 12:00 noon