I am primarily interested in applying Imitation Learning to complex robotic tasks. My current research is looking at learning to follow natural language directions through unknown environments. I have also investigated learning utility functions in multi-robot task allocation domains, among several other past projects.
At a higher level, I am broadly interested in field robotics problems and applications, especially those that require coordination between multiple agents.
Natural Language Direction Following
People and robots are increasingly working together in shared spaces, with common tasks and goals. This increases the need for Human-Robot Interaction, so that lay users can effortlessly control complex robots. Unconstrained natural language holds the promise to enable users to command robots in an intuitive and flexible way, without requiring specialized interfaces or training.
One instance of this problem is direction following through unknown environments. For example, a person may want to command a rescue robot to, "Turn right and go through the double doors into the lounge." Such commands are challenging because they require understanding diverse landmarks (e.g., "double doors" or "the lounge") as well as actions and spatial relationships (e.g., "turn right," "go through," "into"). In addition, because the robot may only have access to partial information about the environment, the landmarks needed to execute the command may not yet have been observed when the robot starts moving (e.g., the double doors may not be immediately visible).
We address the problem of robots following natural language directions through complex unknown environments. This requires understanding the structure of language, mapping verbs and spatial relationships onto actions in the world, recognizing the diverse landmarks located in the environment, and reasoning about landmarks and parts of the environment that have not yet been detected.
By exploiting the structure of spatial language, we can frame direction following as a problem of sequential decision making under uncertainty. We learn a policy which predicts a sequence of actions that follow the directions using imitation learning and demonstrations of correct behavior. The policy learns to explore the environment (discovering landmarks), backtrack when necessary, and explicitly declare when it has reached its destination. By training explicitly in unknown environments, we can generalize to situations that have not been encountered previously.
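The training loop described above can be sketched as a DAgger-style imitation learner: the learner executes its current policy, and the expert labels the states the learner actually visits, so the policy is trained on the situations its own mistakes produce. Everything concrete below (a one-dimensional corridor environment, the action set, the table-lookup policy) is a toy illustration, not the actual system:

```python
ACTIONS = ["forward", "declare_goal"]

def expert_policy(state):
    # Toy expert for a 1-D corridor: move forward until at the goal, then declare.
    pos, goal = state
    return "declare_goal" if pos == goal else "forward"

def rollout(policy, goal=5, horizon=10):
    """Execute a policy in the toy corridor and return the states it visits."""
    pos, states = 0, []
    for _ in range(horizon):
        state = (pos, goal)
        states.append(state)
        action = policy(state)
        if action == "declare_goal":
            break
        if action == "forward":
            pos += 1
    return states

def dagger(n_iters=5):
    """DAgger: aggregate expert labels on the states the learner itself visits."""
    dataset = {}  # state -> expert action

    def learned(state):
        # Table-lookup "policy"; fall back to a default action for unseen states.
        return dataset.get(state, "forward")

    for _ in range(n_iters):
        for state in rollout(learned):
            dataset[state] = expert_policy(state)  # query the expert on visited states
    return learned
```

In a real direction-following system the policy would of course generalize over features of the partially observed map and the parsed command rather than memorizing states, but the interaction pattern (roll out, query the expert, aggregate, retrain) is the same.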
Imitation Learning for Multi-Robot Task Allocation
At the heart of any multi-robot task allocation mechanism is utility, a unifying concept that represents an estimate of system performance. Utility enables robots to compare different options and select the best, maximizing overall team performance. For example, a team of robots exploring an environment may wish to maximize the area observed while minimizing travel costs, or a team of searchers may want to maximize the likelihood of finding an evader while minimizing the time-to-capture. In domains such as these, the utility metric is simple to express and can easily be derived from the high-level goals.
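For the exploration example, such a hand-written utility might simply trade expected information gain against travel cost, with each robot greedily taking the best-scoring frontier. The Manhattan-distance cost, the weight, and the greedy assignment below are illustrative assumptions, not a specific published mechanism:

```python
def utility(robot_pos, frontier, info_gain, travel_weight=1.0):
    """Utility of a candidate frontier: expected new area minus travel cost."""
    # Manhattan distance as a stand-in for actual path cost.
    cost = abs(frontier[0] - robot_pos[0]) + abs(frontier[1] - robot_pos[1])
    return info_gain - travel_weight * cost

def allocate(robots, frontiers):
    """Greedily assign each robot the frontier with the highest utility.

    robots:    dict of robot name -> (x, y) position
    frontiers: dict of frontier (x, y) -> expected information gain
    """
    return {name: max(frontiers, key=lambda f: utility(pos, f, frontiers[f]))
            for name, pos in robots.items()}
```

With two robots at opposite ends of a corridor and two equally informative frontiers, each robot takes the nearer one, since the travel term dominates.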
In complex environments, however, estimating utility is not so straightforward. Accounting for complex unmodeled elements in the world (such as deep underlying dynamics or an adversary team) may be intractable or impossible. A human expert, on the other hand, may bring extensive domain knowledge and previous experience, or be able to quickly gain an intuitive understanding of the domain. Though this knowledge may be hard to articulate as an explicit algorithm, policy, or utility mapping, the expert will generally be able to recognize a good solution.
We have developed a technique to harness the expert's intuition by applying imitation learning to the multi-robot task allocation domain. Using a market-based method, we steer the allocation process by biasing prices in the market according to a policy which we learn using a set of demonstrated allocations (the expert's solutions to a number of domain instances). This method is simple, requires no tuning of complicated utility functions, and can improve the performance of multi-agent teams in complex domains.
For more information on this topic see the publications page.
Other Past Projects
- Optimal path planning for multi-robot coverage: We studied how to plan paths for covering an orange orchard with autonomous spray and inspection robots.
- WiFi localization using Gaussian Processes: Work performed at CSIRO in Brisbane, Australia.
- Cooperative Force Control of a Mobile Manipulator
- Robot Colony
- Mobile Robot Programming Lab