In complex dynamic multi-robot domains, a set of individual robots must coordinate through a team planner that inevitably makes assumptions, based on probabilities, about the state of the world and the actions of the individuals. Because the team planner's models of the states and actions are incomplete, the individuals eventually encounter failures. Further, the team planner may find that no plan has a reasonable probability of success and instead provide either no further course of action or a plan that is likely to fail. In this thesis, we address the problem of what an individual robot must do when it faces such failures and cannot execute the team plan generated by the team planner.

While previous work has explored either centralized or decentralized approaches for dynamic multi-robot problems, it lacks the combination of a centralized approach with the intelligently planning individuals often found in decentralized approaches. In centralized approaches, the focus has been on removing the need for replanning through conditional planning and policy generation, on hierarchical decomposition to simplify the multi-robot problem, or on predicting the informational needs of teammates. In decentralized approaches, the focus has been on improving auctioning algorithms, task decomposition, task assignment, and policy generation. Shared by both approaches, there has also been work on commitment to actions shared between multiple robots, i.e., joint actions. In my thesis, I contribute a novel intra-robot replanning algorithm that lets the individual robots within a team intelligently handle failures. I introduce team plan conditions in the team plan that capture the reasoning behind the assumptions made by the team planner. My intra-robot replanning algorithm has the choice of re-achieving the failed team plan conditions, invoking the team planner with more information, or changing the team plan, thereby changing the state of the world, to improve the probability of the team succeeding. Furthermore, my intra-robot replanning algorithm learns which choice the individual robot should take in a given state of the world.
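The three replanning choices described above can be pictured as a simple decision procedure. The sketch below is an illustrative stand-in, not the thesis's algorithm: the condition name, cost estimates, and success probabilities are all hypothetical.

```python
# Hedged sketch of an intra-robot replanning choice among three options:
# re-achieve a failed team plan condition, invoke the team planner, or
# change the team plan locally. All names and cost models are illustrative.

def choose_repair(failed_condition, estimate_cost, success_prob):
    """Pick the repair option with the best expected utility."""
    options = ["re-achieve", "invoke-team-planner", "change-team-plan"]
    # Expected utility: probability of team success minus (normalized) repair cost.
    utilities = {
        name: success_prob(failed_condition, name)
              - estimate_cost(failed_condition, name)
        for name in options
    }
    return max(utilities, key=utilities.get)

# Toy models: re-achieving is cheap but unlikely to help in this state,
# changing the plan locally is moderately costly but likely to succeed.
cost = {"re-achieve": 0.1, "invoke-team-planner": 0.5, "change-team-plan": 0.3}
prob = {"re-achieve": 0.2, "invoke-team-planner": 0.6, "change-team-plan": 0.8}
choice = choose_repair("ball-reachable",
                       lambda c, o: cost[o],
                       lambda c, o: prob[o])
print(choice)  # change-team-plan: utility 0.5 beats the other options
```

In the thesis, the interesting part is precisely that these utilities are not hand-coded: the individual robot learns which choice to take in a given state of the world.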

My thesis is motivated by my previous work with individual robot replanning and team planners, and by the inability of their individual robots to handle failures. NASA's Astrobee, a free-flying zero-g robot that will operate in the International Space Station, handles failures by stopping, messaging the ground station, and waiting for a new plan. Autonomous underwater vehicles (AUVs) handle failures by rising to the surface of the ocean, messaging the centralized controller, and waiting for a new plan. Small-Size League soccer robots handle failures by continuing to blindly execute a failing team plan, because taking some course of action, even one that is currently failing, is better than doing nothing in an adversarial domain. Of course, continuing to execute a failing plan is inadvisable, but the individual soccer robots lacked the individual intelligence needed to fix the problem. All these domains share the common thread of relying on the centralized planner to provide a new solution when failures occur, and, in doing so, each decreases its performance in completing its plans efficiently.

Our work so far has proposed a team plan representation that introduces team plan conditions. Team plan conditions specify the conditions required to accomplish the team's goal(s), which the team planner assumes remain valid, and provide the individual team members with the team planner's reasoning should they need to replan during execution. I demonstrated that proactively using my intra-robot replanning algorithm can improve the performance of the team, and that learning has the potential to further improve the decision making of the individual robot.

The remainder of my thesis work will focus on formalizing the team plan representation to include team plan conditions, on generalizing the intra-robot replanning algorithm to clarify the function of the individual and of the team planner, and on learning that informs the individual of the best choice when faced with a failure. In evaluating my thesis work, I will formalize multiple different domains (e.g., robot soccer, coordinated underwater vehicles, and capture the flag) to show the improvement of a team of informed learning individual robots compared to a team of uninformed individual robots.
Thesis Committee:
Manuela Veloso (Chair)
Maxim Likhachev
Stephen Smith
Daniel Borrajo Millán (Universidad Carlos III de Madrid)

As machine learning becomes more ubiquitous, clustering has evolved from primarily a data analysis tool into an integrated component of complex machine learning systems, including those involving dimensionality reduction, anomaly detection, network analysis, image segmentation, and classification of grouped data. With this integration into multi-stage systems comes a need to better understand the interactions between pipeline components. Changing the parameters of the clustering algorithm will impact downstream components, and, unfortunately, it is usually not possible to simply back-propagate through the entire system. Currently, as with many machine learning systems, the output of the clustering algorithm is taken as ground truth at the next pipeline step. Our empirical results show that this false assumption may have dramatic impacts, sometimes biasing results by upwards of 25%.

We address this gap by developing estimators and methods to both quantify and correct for clustering errors' impacts on downstream learners. Our work is agnostic to the downstream learners, and requires few assumptions on the clustering algorithm. Theoretical and empirical results demonstrate our methods and estimators are superior to the current naive approaches, which do not account for clustering errors.
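A toy simulation makes the underlying problem concrete. This example is ours, not the estimator from the work: when a fraction of points carry the wrong cluster label, a naive downstream step that trusts those labels produces systematically biased per-cluster estimates.

```python
# Toy illustration (not the proposed estimator): downstream per-cluster mean
# estimation is biased when noisy cluster assignments are taken as ground truth.
import random

random.seed(0)
TRUE_MEANS = {0: 0.0, 1: 10.0}
FLIP_RATE = 0.25  # fraction of points assigned to the wrong cluster

points, noisy_labels = [], []
for true_label, mean in TRUE_MEANS.items():
    for _ in range(10000):
        points.append(mean + random.gauss(0, 1))
        # Clustering error: with some probability the label is flipped.
        label = true_label if random.random() > FLIP_RATE else 1 - true_label
        noisy_labels.append(label)

# Naive downstream step: average points by their (noisy) cluster label.
naive_means = {}
for c in TRUE_MEANS:
    cluster_pts = [x for x, l in zip(points, noisy_labels) if l == c]
    naive_means[c] = sum(cluster_pts) / len(cluster_pts)
print(naive_means)  # each mean is pulled toward the other cluster's mean
```

With a 25% flip rate, each estimated mean lands roughly a quarter of the way toward the other cluster's mean (near 2.5 and 7.5 instead of 0 and 10), mirroring the scale of bias reported above.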

Along these lines, we also develop several new clustering algorithms and prove theoretical bounds for existing algorithms, to be used as inputs to our later error-correction methods. Not surprisingly, we find learning on clusters of data is both theoretically and empirically easier as the number of clustering errors decreases. Thus, our work is two-fold. We attempt to both provide the best clustering possible and learn on inevitably noisy clusters.

A major limiting factor in our error-correction methods is scalability. Currently, their computational complexity is O(n^3), where n is the size of the training dataset. This limits their applicability to very small machine learning problems. We propose addressing this scalability issue through approximation. It should be possible to reduce the computational complexity to O(p^3), where p is a small fixed constant, independent of n, corresponding to the number of parameters in the approximation.
Thesis Committee:
Artur Dubrawski (Chair)
Geoff Gordon
Kris Kitani
Beka Steorts (Duke University)


Machine learning models have led to remarkable progress in visual recognition. A key driving factor for this progress is the abundance of labeled data. Over the years, researchers have spent considerable effort curating visual data and carefully labeling it. However, moving forward, it seems impossible to annotate the vast amounts of visual data with everything we wish to learn from it. This reliance on exhaustive labeling is a key limitation to the rapid development and deployment of computer vision systems in the real world. Our current systems also scale poorly to a large number of concepts and are passively spoon-fed supervision and data.
In this thesis, we explore methods that enable visual learning without exhaustive supervision. Our core idea is to model the natural regularity and repetition of the visual world in our learning algorithms as their inductive bias. We observe recurring patterns in the visual world: a person always lifts their foot before taking a step, dogs are more similar to other furry creatures than to furniture, and so on. This natural regularity in visual data also imposes regularities on the semantic tasks and models that operate on it: a dog classifier must be more similar to classifiers of furry animals than to furniture classifiers. We exploit this abundant natural structure, or 'supervision', in the visual world in the form of self-supervision for our models, by modeling relationships between tasks and labels, and by exploiting similarities in the space of classifiers. We show the effectiveness of these methods on both static images and videos across varied tasks such as image classification, object detection, action recognition, and human pose estimation. However, all these methods are still passively fed supervision and thus lack agency: the ability to decide what information they need and how to get it. To this end, we propose interactive learners that ask for supervision when needed and can also decide which samples they want to learn from.
Thesis Committee:
Martial Hebert (Co-chair)
Abhinav Gupta (Co-chair)
Deva Ramanan
Alexei A. Efros (University of California, Berkeley)
Andrew Zisserman (University of Oxford)


As the use of robotic manipulation in manufacturing continues to increase, the robustness requirements for fastening operations such as screwdriving increase as well. To investigate the reliability of screwdriving and the diverse failure categories that can arise, we collected a dataset of screwdriving operations and manually classified them into stages and result categories. I will present the data collection process, analysis, and lessons learned, and I will discuss how to transfer this knowledge to collecting another manipulation dataset.

Research Qualifier Committee:
Matt Mason
Nancy Pollard
Artur Dubrawski
Stefanos Nikolaidis

The Robotics Institute celebrates National Robotics Week with an Open House, Lab Tours, Demos, Talks and more...

Watch for details!

With a prosthetic device, people with a lower limb amputation can remain physically active, but most do not achieve medically recommended physical activity standards and are therefore at greater risk of obesity and cardiovascular disease. Their reduced activity may be attributed to the 10–30% increase in the energetic cost of walking compared to able-bodied individuals. Several active ankle-foot systems have been developed to provide external power during the push-off phase of gait, potentially alleviating this high cost. This talk will focus on the biologically inspired design of these devices and on several of our recent and ongoing projects exploring whether and how people utilize external mechanical power to influence their metabolic effort, how this is influenced by the magnitude of power delivered, and how it depends on the individual's characteristics. I will then discuss our recent efforts to evaluate powered prosthetic technology in real-world environments.

Deanna Gates is an Assistant Professor in the Departments of Movement Science, Biomedical Engineering and Robotics at the University of Michigan. She earned her B.S. in Mechanical Engineering from the University of Virginia (2002), M.S. in Biomedical Engineering from Boston University (2004), and Ph.D. in Biomedical Engineering at the University of Texas at Austin (2009). Dr. Gates worked in engineering consulting and in civilian and military clinical gait laboratories, before arriving at the University of Michigan in 2012. Dr. Gates directs the Rehabilitation Biomechanics Laboratory focusing on the study of repetitive human movements such as walking and reaching. Throughout these studies, we try to determine which aspects of movement a person actively controls and how this function can most effectively be modeled. We can then use these models, and governing control strategies, to design both passive and active devices which can mimic biological function and restore or improve function in individuals with disability. Another focus of our research is determining appropriate outcomes to measure performance with new prosthetic and orthotic technology.

Faculty Host: Katharina Muelling

Deformable objects such as cables and clothes are ubiquitous in factories, hospitals, and homes. While a great deal of work has investigated the manipulation of rigid objects in these settings, manipulation of deformable objects remains under-explored. The problem is indeed challenging, as these objects are not straightforward to model and have infinite-dimensional configuration spaces, making it difficult to apply established approaches for motion planning and control. One of the key challenges in manipulating deformable objects is selecting a model which is efficient to use in a control loop, especially when an accurate model is not available. Our approach to control uses a set of simple models of the object, determining which model to use at the current time step via a novel Multi-Armed Bandit algorithm that reasons over estimates of model utility.
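The model-selection step can be pictured as a multi-armed bandit in which each arm is one candidate model of the object. The sketch below uses plain UCB1 over hypothetical model-utility observations; it is a stand-in for, not the novel bandit algorithm from, the talk.

```python
# Sketch of bandit-based model selection (standard UCB1, not the talk's novel
# algorithm): each "arm" is one simple model of the deformable object, and
# pulling an arm observes a noisy estimate of that model's utility.
import math
import random

random.seed(1)

def ucb1(true_utils, steps=2000):
    n_arms = len(true_utils)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, steps + 1):
        if t <= n_arms:
            arm = t - 1  # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = true_utils[arm] + random.gauss(0, 0.1)  # noisy utility
        counts[arm] += 1
        sums[arm] += reward
    return counts

# Three hypothetical object models with different (unknown) average utility.
pulls = ucb1([0.3, 0.5, 0.7])
print(pulls)  # the best model (index 2) is chosen most often
```

The exploration bonus shrinks for frequently used models, so the controller keeps occasionally re-testing alternatives while mostly trusting the model that has proven most useful so far.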

I will also present our work on interleaving planning and control for deformable object manipulation in cluttered environments, again without an accurate model of the object. Our method predicts when a controller will be trapped (e.g., by obstacles) and invokes a planner to bring the object near its goal. The key to making the planning tractable is to avoid simulating the motion of the object, instead only forward-propagating the constraint on overstretching. This approach takes advantage of the object’s compliance, which allows it to conform to the environment as long as stretching constraints are satisfied. Our method is able to quickly plan paths in environments with complex obstacle arrangements and then switch to the controller to achieve a desired object configuration.
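The stretch-constraint check that replaces full object simulation can be illustrated very simply. The geometry and threshold below are hypothetical; the idea is only that a candidate path is rejected when the grippers would move far enough apart to overstretch the object.

```python
# Illustrative sketch (hypothetical geometry): instead of simulating the
# deformable object, only check that the grippers never move far enough
# apart to overstretch it along a candidate path.
import math

def path_is_valid(gripper_a_path, gripper_b_path, rest_length, max_stretch=1.1):
    """Reject a path if any configuration overstretches the object."""
    for a, b in zip(gripper_a_path, gripper_b_path):
        if math.dist(a, b) > rest_length * max_stretch:
            return False
    return True

# Two gripper paths holding the ends of a 1.0 m cable; the final step
# separates the grippers to 1.2 m, beyond the 10% stretch allowance.
a_path = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.0)]
b_path = [(0.9, 0.0), (1.1, 0.0), (1.6, 0.0)]
print(path_is_valid(a_path, b_path, rest_length=1.0))  # False
```

Because this check is a few distance computations per configuration rather than a deformable-body simulation, it is cheap enough to run inside a planner exploring many candidate paths.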

Dmitry Berenson received a BS in Electrical Engineering from Cornell University in 2005 and received his Ph.D. degree from the Robotics Institute at Carnegie Mellon University in 2011, where he was supported by an Intel PhD Fellowship. He completed a post-doc at UC Berkeley in 2012 and was an Assistant Professor at WPI 2012-2016. He started as an Assistant Professor in the EECS Department and Robotics Institute at the University of Michigan in 2016. He received the IEEE RAS Early Career award in 2016.

Faculty Host: David Held

Creating realistic virtual humans has traditionally been considered a research problem in Computer Animation primarily for entertainment applications. With the recent breakthrough in collaborative robots and deep reinforcement learning, accurately modeling human movements and behaviors has become a common challenge faced by researchers in robotics, artificial intelligence, as well as Computer Animation. In this talk, I will focus on two different yet highly relevant problems: how to teach robots to move like humans and how to teach robots to interact with humans.

While Computer Animation research has shown that it is possible to teach a virtual human to mimic human athletes’ movements, transferring such complex controllers to robot hardware in the real world is perhaps even more challenging than learning the controllers themselves. In this talk, I will focus on two strategies to transfer highly dynamic skills from character animation to robots: teaching robots basic self-preservation motor skills and developing data-driven algorithms on transfer learning between simulation and the real world.

The second part of the talk will focus on robotic assistance with dressing, one of the activities of daily living (ADLs) most commonly requested by older adults. To safely train a robot to physically interact with humans, one can design a generative model of human motion based on prior knowledge or recorded motion data. Although this approach has been successful in Computer Animation, such as for generating locomotion, a procedure designed for a loosely defined task such as "being dressed" is likely to be biased toward the specific data or assumptions. I will describe a new approach to modeling human motion without being biased toward specific situations presented in the dataset.

C. Karen Liu is an associate professor in School of Interactive Computing at Georgia Tech. She received her Ph.D. degree in Computer Science from the University of Washington. Liu’s research interests are in computer graphics and robotics, including physics-based animation, character animation, optimal control, reinforcement learning, and computational biomechanics. She developed computational approaches to modeling realistic and natural human movements, learning complex control policies for humanoids and assistive robots, and advancing fundamental numerical simulation and optimal control algorithms. The algorithms and software developed in her lab have fostered interdisciplinary collaboration with researchers in robotics, computer graphics, mechanical engineering, biomechanics, neuroscience, and biology. Liu received a National Science Foundation CAREER Award, an Alfred P. Sloan Fellowship, and was named Young Innovators Under 35 by Technology Review. In 2012, Liu received the ACM SIGGRAPH Significant New Researcher Award for her contribution in the field of computer graphics.


Faculty Host: David Held

Data-driven approaches to modeling time series are important in a variety of applications, from market prediction in economics to the simulation of robotic systems. However, traditional supervised machine learning techniques designed for i.i.d. data often perform poorly on these sequential problems. This thesis proposes that time series and sequential prediction, whether for forecasting, filtering, or reinforcement learning, can be effectively achieved by directly training recurrent prediction procedures rather than building generative probabilistic models.

To this end, we introduce a new training algorithm for learned time-series models, Data as Demonstrator (DaD), that theoretically and empirically improves multi-step prediction performance on model classes such as recurrent neural networks, kernel regressors, and random forests. Additionally, experimental results indicate that DaD can accelerate model-based reinforcement learning. We next show that latent-state time-series models, where a sufficient state parametrization may be unknown, can be learned effectively in a supervised way using predictive representations derived from observations alone. Our approach, Predictive State Inference Machines (PSIMs), directly optimizes – through a DaD-style training procedure – the inference performance without local optima by identifying the recurrent hidden state as a predictive belief over statistics of future observations. Finally, we experimentally demonstrate that augmenting recurrent neural network architectures with Predictive-State Decoders (PSDs), derived using the same objective optimized by PSIMs, improves both the performance and convergence for recurrent networks on probabilistic filtering, imitation learning, and reinforcement learning tasks. Fundamental to our learning framework is that the prediction of observable quantities is a lingua franca for building AI systems.
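The core DaD idea, pairing the model's own rolled-out predictions with the true next observations and retraining on the aggregated pairs, can be sketched on a scalar autoregressive model. The dynamics, model class, and closed-form fit below are simplified illustrations, not the paper's experimental setup.

```python
# Simplified DaD-style training sketch: a scalar linear model x_{t+1} = theta*x_t
# is refit on pairs of (model's own rollout state, true next observation),
# targeting multi-step rather than only one-step prediction error.

def fit_theta(pairs):
    """Closed-form least squares for x_next = theta * x."""
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, y in pairs)
    return num / den

def dad_train(trajectory, iterations=5):
    # Start from ordinary one-step supervised pairs.
    pairs = list(zip(trajectory[:-1], trajectory[1:]))
    theta = fit_theta(pairs)
    for _ in range(iterations):
        # Roll the model forward from the true start; at each step, pair the
        # *predicted* state with the *true* next observation (the DaD step),
        # and aggregate these pairs into the training set.
        x = trajectory[0]
        for t in range(len(trajectory) - 1):
            pairs.append((x, trajectory[t + 1]))
            x = theta * x
        theta = fit_theta(pairs)
    return theta

# A decaying trajectory generated by theta = 0.9.
traj = [1.0]
for _ in range(20):
    traj.append(0.9 * traj[-1])
theta = dad_train(traj)
print(round(theta, 2))  # recovers 0.9
```

On this noiseless toy problem the one-step fit is already exact; the point of the sketch is the data-aggregation loop, which in realistic noisy settings steers the model toward states its own rollouts actually visit.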

Thesis Committee:
J. Andrew Bagnell (Co-chair)
Martial Hebert (Co-chair)
Jeff Schneider
Byron Boots (Georgia Institute of Technology)


Robot controllers, including locomotion controllers, often consist of expert-designed heuristics. These heuristics can be hard to tune, particularly in higher dimensions. It is typical to tune or learn these parameters in simulation and test on hardware. However, controllers learned in simulation often do not transfer to hardware due to model mismatch. This necessitates controller optimization directly on hardware. Experiments on walking robots can be expensive, due to the time involved and the risk of damage to the robot. This has led to recent interest in adapting data-efficient learning techniques to robotics. One popular method is Bayesian Optimization, a sample-efficient black-box optimization scheme, but its performance typically degrades in problems of higher dimensionality, including the dimensionality seen in bipedal locomotion. We aim to overcome this problem by incorporating prior knowledge to reduce the number of dimensions in a meaningful way, with a focus on bipedal locomotion. We propose two ways of doing this: hand-designed features based on knowledge of human walking, and neural networks that extract this information automatically. Our hand-designed features project the initial controller space to a 1-dimensional space and show promise in simulation and on hardware. The automatically learned features can be of varying dimensions, also improve on traditional Bayesian Optimization methods, and perform competitively with our hand-designed features in simulation. Our hardware experiments are conducted on the ATRIAS robot, while simulation experiments use two robots: ATRIAS and a 7-link biped model. Our results show that these feature transforms capture important aspects of walking and accelerate learning on hardware and in perturbed simulation, compared to traditional Bayesian Optimization and other optimization methods.
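The dimensionality-reduction idea can be pictured with a toy sketch: a hand-designed feature maps a high-dimensional controller parameter vector to one dimension, and a sample-efficient search runs in that 1-D space instead. Everything here is an illustrative stand-in: the feature, the cost surface, and the plain random search used in place of the thesis's Bayesian Optimization pipeline.

```python
# Toy sketch of optimizing in a hand-designed 1-D feature space instead of the
# full controller parameter space. The feature, cost model, and random search
# below are illustrative stand-ins for the Bayesian Optimization pipeline.
import random

random.seed(2)

def feature_to_params(f, dim=8):
    """Lift a 1-D feature value back to a full controller parameter vector."""
    return [f] * dim  # hypothetical: all gains tied to one feature

def walking_cost(params):
    """Hypothetical cost surface: walking is best when the feature is 0.6."""
    f = sum(params) / len(params)
    return (f - 0.6) ** 2

def search_1d(evaluate, budget=50):
    """Search over the 1-D feature rather than all 8 controller dimensions."""
    best_f, best_cost = None, float("inf")
    for _ in range(budget):
        f = random.uniform(0.0, 1.0)
        c = evaluate(feature_to_params(f))
        if c < best_cost:
            best_f, best_cost = f, c
    return best_f

best = search_1d(walking_cost)
print(round(best, 2))  # near 0.6, where the hypothetical cost is lowest
```

A fixed evaluation budget covers a 1-D interval densely but an 8-D box only sparsely, which is the intuition for why projecting to a low-dimensional, walking-relevant feature space makes hardware experiments affordable.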

Thesis Committee:
Christopher G. Atkeson (Chair)
Hartmut Geyer
Oliver Kroemer
Stefan Schaal (MPI Tübingen and University of Southern California)

