Rapid autonomous exploration of challenging, GPS-denied environments, such as underground mines, provides essential information to search and rescue and defense operations. We pursue a distributed perception strategy that develops a consistent global map across a team of robots in environments that exhibit repetitive structure. Such structure leads to ambiguities in observation correspondences, which complicates both the estimation of relative transforms between robots and the generation of a consistent global map.

Real-world communication constraints limit a robot from sharing large numbers of observations at high fidelity. Naively simplifying the information leads to loss of unique features and an increase in perceptual aliasing. To share the most relevant subset of information, we develop a scan utility function based on information-theoretic measures for scan information and feature-based place recognition approaches to assess loop closure potential. Using the utility function to rank scans, we formulate an offer-response-request framework, Communication Constrained Information Routing (CCIR), that ensures operation under stringent bandwidth restrictions.
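The idea of ranking scans by utility and sharing only what the link can carry can be illustrated with a minimal sketch. It is not the CCIR implementation: the `ScanOffer` fields, the single scalar `utility`, and the greedy selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScanOffer:
    """Hypothetical offer message: a scan summary, not the scan itself."""
    scan_id: int
    utility: float    # assumed combined score (scan information + loop-closure potential)
    size_bytes: int   # cost of transmitting the full scan


def select_scans(offers, budget_bytes):
    """Greedily pick the highest-utility scans that still fit in the byte budget."""
    chosen = []
    remaining = budget_bytes
    for offer in sorted(offers, key=lambda o: o.utility, reverse=True):
        if offer.size_bytes <= remaining:
            chosen.append(offer.scan_id)
            remaining -= offer.size_bytes
    return chosen


offers = [
    ScanOffer(scan_id=0, utility=0.9, size_bytes=600),
    ScanOffer(scan_id=1, utility=0.5, size_bytes=500),
    ScanOffer(scan_id=2, utility=0.7, size_bytes=300),
]
requested = select_scans(offers, budget_bytes=1000)
print(requested)
```

In this sketch the receiving robot would request only the returned scan IDs, so full scans cross the network only when they fit the budget.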

Given the ability to share rich 3D information over constrained networks, we pursue full 3D mapping via extensions and robustification techniques. The robust measures we introduce allow operation in the mine given substantial perceptual aliasing.

To enable operation in environments that exhibit aliasing that exceeds the performance characteristics of the developed framework, we detail first results for an approach that moves away from feature-based techniques and introduces a methodology utilizing Hierarchical Gaussian Mixture Models. Through regeneration of the point cloud from the HGMM model and Generalized Iterative Closest Point algorithms, we are able to detect loop closures accurately with an outlier rate significantly lower than feature-based methods.

Vibhav Ganesh is a second-year master's student in the Robotics Institute at Carnegie Mellon University, advised by Prof. Nathan Michael. Prior to his master's program, he received his B.S. in Computer Science at Carnegie Mellon University. He is interested in distributed perception algorithms with real-world constraints.

Committee Members
Nathan Michael (Chair)
Artur Dubrawski
Ben Eckart

Algorithms for human motion understanding have a wide variety of applications, including health monitoring, performance assessment, and user interfaces.  However, differences between individual styles make it difficult to achieve robust performance, particularly for individuals who were not in the training population.  We believe that adapting algorithms to individual behaviors is essential for effective human motion understanding.  This thesis therefore explores algorithms for personalizing a general classifier to particular test subjects given their unlabeled data or small quantities of labeled data.

Many applications, such as action or gesture recognition, contain multiple classes.  For example, the REALDISP activity recognition dataset contains 33 different actions, such as walking, running, jogging, and cycling.  In this thesis, we present a multi-class formulation of the Selective Transfer Machine (STM), which combines Kernel Mean Matching with a Support Vector Machine (SVM) to personalize the classifier given a test subject's unlabeled data.  We apply this algorithm to two real and four synthetic datasets, and propose several potential improvements.

In some applications, labeling events accurately in training data is difficult or impossible.  Algorithms for these applications should only require weakly-labeled training data.  In this thesis, we evaluate five standard, weakly-supervised algorithms on Parkinson's Disease (PD) tremor detection.  We also describe a modification that allows algorithms to take advantage of knowing the approximate amount of tremor within each segment.  We find that these modified algorithms show little decrease in performance as the length of the training time segments increases to ten minutes.  We propose to develop a personalized, weakly-supervised algorithm and apply it to PD tremor detection in wrist-worn accelerometer data collected in patients' homes.

In other applications, such as when measuring disease severity or surgeon expertise, labels come from a continuous spectrum. In these cases, a classification algorithm, which assumes discrete classes, may not be the best approach. Algorithms that attempt to fit a function to the data are more appropriate.  We propose to apply personalization to modeling surgeon learning curves.  We also plan to use personalized regression to predict surgeon expertise on data collected from the da Vinci surgical robot.

In summary, this thesis will explore the application of personalization to human activity and surgical gesture recognition, PD tremor detection, surgeon learning curve modeling, and surgical expertise prediction. In doing so, we will develop personalized algorithms in the context of multi-class classification, weakly-supervised classification, and function modeling.

Thesis Committee:
Fernando De la Torre (Co-chair)
Jessica Hodgins (Co-chair)
Artur Dubrawski
Anthony Jarc (Intuitive Surgical, Inc.)

Pose estimation is central to several robotics applications such as registration, manipulation, SLAM, etc. In this thesis, we develop probabilistic approaches for fast and accurate pose estimation. A fundamental contribution of this thesis is formulating pose estimation in a parameter space in which the problem is truly linear and thus globally optimal solutions can be guaranteed. It should be stressed that the approaches developed in this thesis are indeed inherently linear, as opposed to linearization or other approximations commonly made by existing techniques, which are known to be computationally expensive and highly sensitive to initial estimation error.

This thesis will demonstrate that the choice of probability distribution significantly impacts performance of the estimator. The distribution must respect the underlying structure of the parameter space to ensure any optimization, based on such a distribution, produces a globally optimal estimate, despite the inherent nonconvexity of the parameter space.

Furthermore, in applications such as registration and three-dimensional reconstruction, the correspondence between the measurements and the geometric model is typically unknown. In this thesis we develop probabilistic methods to deal with cases of unknown correspondence.

We plan to extend our approaches to applications requiring dynamic pose estimation. We also propose to incorporate probabilistic means for finding the data association, inspired by recent work of Billings et al. Finally, we will develop a filtering approach using a Gilitschenski distribution that considers the constraints of both rotation and translation parameters without decoupling them.

Thesis Committee:
Howie Choset (Chair)
Michael Kaess
Simon Lucey
Russell H. Taylor (Johns Hopkins University)
Nabil Simaan (Vanderbilt University)

Mechatronic Design (16-778/18-578/24-778) teams will briefly describe and show videos of their ShipBot and Window Washer machines (2:30-3:30 p.m.), followed by an at-table exhibition (3:30-4:00 p.m.) and competition (4:00-5:30 p.m.).

Nine RI MRSD program student teams will use posters, videos, and hardware to show their project work on robots for autonomous driving perception, UAV multimodal mapping, litter pick-up, autonomous car drifting, the Amazon Picking Challenge, UAV search and rescue, multi-camera dome calibration, high-mobility virtual reality, and soybean inspection.

Three teams of undergraduate Robotics double-major students will do the same for their projects:  a hockey-goalie training robot, a system of swimming-pool drink-delivery robots, and a system for autonomous road-marking or line-drawing. 

When the poster session ends at 5:00 p.m., any interested persons may follow the undergraduate groups to the NSH highbay for a brief live demo (5:00-6:00 p.m.).


We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving real-time performance, irrespective of the number of people in the image. The architecture is designed to jointly learn part locations and their association via two branches of the same sequential prediction process. Our method placed first in the inaugural COCO 2016 keypoints challenge, and significantly exceeds the previous state-of-the-art result on the MPII Multi-Person benchmark, both in performance and efficiency.

Committee:
Yaser Sheikh (Advisor)
Deva Ramanan
Aayush Bansal
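The part-association idea in the abstract above can be illustrated with a small sketch. It assumes a PAF is a 2D grid of unit vectors pointing along a limb, and that two candidate parts are scored by how well the field aligns with the segment joining them. The function name, the grid layout, and the sampling scheme are illustrative assumptions, not the authors' implementation.

```python
import math


def association_score(field, start, end, num_samples=10):
    """Average alignment between the field and the start->end direction.

    field[y][x] is an assumed (vx, vy) vector at pixel (x, y);
    start and end are (x, y) candidate part locations; num_samples >= 2.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length
    total = 0.0
    for k in range(num_samples):
        t = k / (num_samples - 1)
        # Sample the nearest pixel along the candidate limb.
        x = round(start[0] + t * dx)
        y = round(start[1] + t * dy)
        vx, vy = field[y][x]
        total += vx * ux + vy * uy
    return total / num_samples


# Toy 5x5 field: a horizontal "limb" along row 2, zero elsewhere.
field = [[(0.0, 0.0)] * 5 for _ in range(5)]
field[2] = [(1.0, 0.0)] * 5

along_limb = association_score(field, (0, 2), (4, 2))
across_limb = association_score(field, (0, 0), (0, 4))
print(along_limb, across_limb)
```

A greedy parser in this spirit would keep the candidate pairs with the highest scores and discard pairs whose connecting segment crosses no supporting field.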

Humans use subtle and elaborate body signals to convey their thoughts, emotions, and intentions. "Kinesics" is a term that refers to the study of such body movements used in social communication, including facial expressions and hand gestures. Understanding kinesic signals is fundamental to understanding human communication; it is among the key technical barriers to making machines that can genuinely communicate with humans. Yet, the encoding of conveyed information by body movement is still poorly understood.

This thesis proposal is focused on two major challenges in building a computational understanding of kinesic communication: (1) measuring full body motion as a continuous high bandwidth kinesic signal; and (2) modeling kinesic communication as information flow between coupled agents that continuously predict each other's response signals.  To measure kinesic signals between multiple interacting people, we first develop the Panoptic Studio, a massively multiview system composed of more than five hundred camera sensors. The large number of views allows us to develop a method to robustly measure subtle 3D motions of bodies, hands, and faces of all individuals in a large group of people. To this end, a dataset containing 3D kinesic signals of more than two hundred sequences from hundreds of participants is collected and shared publicly.

The Panoptic Studio allows us to measure kinesic signals of a large group of interacting people for the first time. We propose to model these signals as information flow in a communication system. The core thesis of our approach is that a meaningful encoding of body movement will emerge from representations that are optimized for efficient prediction of these kinesic communication signals. We hope to see this approach inspire continuous and quantitative models in the future study of social behavior.

Thesis Committee:
Yaser Sheikh (Chair)
Takeo Kanade
Louis-Philippe Morency
Mina Cikara (Harvard University)
David Forsyth (University of Illinois at Urbana-Champaign)

Robots manipulate with super-human speed and dexterity on factory floors. Yet they fail even under moderate amounts of clutter or uncertainty. However, human teleoperators perform remarkable acts of manipulation with the same hardware. My research goal is to bridge the gap between what robotic manipulators can do now and what they are capable of doing.

What human operators intuitively possess that robots lack are models of interaction between the manipulator and the world that go beyond pick-and-place. I will describe our work on nonprehensile physics-based manipulation that has produced simple but effective models, integrated with proprioception and perception, that have enabled robots to fearlessly push, pull, and slide objects, and reconfigure clutter that comes in the way of their primary task.

But human environments are also filled with humans. Collaborative manipulation is a dance, demanding the sharing of intentions, inferences, and forces between the robot and the human. I will also describe our work on the mathematics of human-robot interaction that has produced a framework for collaboration using Bayesian inference to model the human collaborator, and trajectory optimization to generate fluent collaborative plans.

Finally, I will talk about our new initiative on assistive care that focuses on marrying physics, human-robot collaboration, control theory, and rehabilitation engineering to build and deploy caregiving systems.

Siddhartha Srinivasa is the Finmeccanica Associate Professor at The Robotics Institute at Carnegie Mellon University. He works on robotic manipulation, with the goal of enabling robots to perform complex manipulation tasks under uncertainty and clutter, with and around people. To this end, he founded and directs the Personal Robotics Lab, and co-directs the Manipulation Lab. He has been a PI on the Quality of Life Technologies NSF ERC, DARPA ARM-S and the CMU CHIMP team on the DARPA DRC. Sidd is also passionate about building end-to-end systems (HERB, ADA, HRP3, CHIMP, Andy, among others) that integrate perception, planning, and control in the real world. Understanding the interplay between system components has helped produce state of the art algorithms for object recognition and pose estimation (MOPED), and dense 3D modeling (CHISEL, now used by Google Project Tango).

Sidd received a B.Tech in Mechanical Engineering from the Indian Institute of Technology Madras in 1999, an MS in 2001 and a PhD in 2005 from the Robotics Institute at Carnegie Mellon University. He played badminton and tennis for IIT Madras, captained the CMU squash team, and lately runs competitively.

Faculty Host: Martial Hebert

Robotic swarms are multi-robot systems whose global behavior emerges from local interactions between individual robots and spatially proximal neighboring robots. Each robot can be programmed with several local control laws that can be activated depending on an operator's choice of global swarm behavior (e.g. flocking, aggregation, formation control, area coverage). In contrast to other multi-robot systems, robotic swarms are inherently scalable since they are robust to addition and removal of members with minimal system reconfiguration. This makes them ideal for applications such as search and rescue, environmental exploration and surveillance.

For practical missions, which may require a combination of swarm behaviors and have dynamically changing mission goals, human interaction with the robotic swarm is necessary. However, human-swarm interaction is complicated by the fact that a robotic swarm is a complex distributed dynamical system, so its state evolution depends on the sequence as well as timing of the supervisory inputs. Thus, it is difficult to predict the effects of an input on the state evolution of the swarm. More specifically, after becoming aware of a change in mission goals, it is unclear at what point the operator must convey this information to the swarm or which combination of behaviors to use to accomplish the new goals.

The main challenges we seek to address in this thesis are characterizing the effects of input timing on swarm performance and using this theory to inform automated composition of swarm behaviors to accomplish updated mission goals. 

We begin by formalizing the notion of Neglect Benevolence --- the idea that delaying the application of an input can sometimes be beneficial to overall swarm performance --- and using the developed theory to demonstrate experimentally that humans can learn to approximate optimal input timing. By restricting our behavior library to consensus-based swarm behaviors, we then apply results from control theory to present an algorithm for automated scheduling of swarm behaviors to time-optimally accomplish multiple unordered goals. We also present an algorithm that solves the swarm behavior composition problem when our library contains general swarm behaviors, but the switch times are known.
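For readers unfamiliar with consensus-based behaviors, the sketch below shows a textbook averaging-consensus update in which each robot moves its scalar state toward the states of its neighbors. It is only an illustration of the behavior class mentioned above, not the scheduling algorithm from the thesis. The graph, step size, and function names are assumptions.

```python
def consensus_step(states, neighbors, step_size):
    """One synchronous update: each robot moves toward its neighbors' states."""
    return [
        x + step_size * sum(states[j] - x for j in neighbors[i])
        for i, x in enumerate(states)
    ]


def run_consensus(states, neighbors, step_size=0.25, tolerance=1e-6, max_steps=10_000):
    """Iterate until all states agree to within `tolerance` (or max_steps is hit)."""
    for step in range(max_steps):
        if max(states) - min(states) < tolerance:
            return states, step
        states = consensus_step(states, neighbors, step_size)
    return states, max_steps


# Assumed communication graph: four robots in a line, 0-1-2-3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
final_states, steps = run_consensus([0.0, 1.0, 2.0, 5.0], neighbors)
print(final_states, steps)
```

With a symmetric neighbor graph and a step size below the reciprocal of the largest node degree, the states converge to the average of the initial values (2.0 here). The number of steps needed depends on the graph and the step size.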

In our completed work, we have made significant progress towards the swarm behavior composition problem from the perspective of scheduling. In our proposed future work, we plan to (1) extend our work on behavior scheduling by simultaneously relaxing assumptions on switch times and the types of behaviors in the library and (2) study behavior composition from the perspective of synthesis. In this context, synthesis describes the act of appropriately instantiating, from a set of swarm meta-behaviors, the concrete swarm behaviors necessary to complete a desired task.

Thesis Committee:
Katia Sycara (Chair)
Howie Choset
Maxim Likhachev
Nilanjan Chakraborty (Stony Brook University)

To make intelligent decisions, robots often use models of the stochastic effects of their actions on the world. Unfortunately, in complex environments, it is often infeasible to create models that are accurate in every plausible situation, which can lead to suboptimal performance. This thesis enables robots to reason about model inaccuracies to improve their performance. The thesis focuses on model inaccuracies that are subtle (i.e., they cannot be detected from a single observation) and context-dependent (i.e., they affect particular regions of the robot's state-action space). Furthermore, this work enables robots to react to model inaccuracies from sparse execution data.

Our approach consists of enabling robots to explicitly reason about parametric Regions of Inaccurate Modeling (RIMs) in their state-action space. We enable robots to detect these RIMs from sparse execution data, to correct their models given these detections, and to plan accounting for uncertainty with respect to these RIMs. To detect and correct RIMs, we first develop optimization-based algorithms that work effectively online in low-dimensional domains. To extend this approach to high-dimensional domains, we develop a search-based Feature Selection algorithm, which relies on the assumption that RIMs are intrinsically low-dimensional but embedded in a high-dimensional space. Finally, we enable robots to make plans that account for their uncertainty about the accuracy of their models.

We evaluate our approach on various complex robot domains. Our approach enables the CoBot mobile service robots to autonomously detect inaccuracies in their motion models, despite their high-dimensional state-action space: the CoBots detect that they are not moving correctly in particular areas of the building, and that their wheels are starting to fail when making turns. Our approach enables the CMDragons soccer robots to improve their passing and shooting models online in the presence of opponents with unknown weaknesses and strengths. Finally, our approach enables a NASA spacecraft landing simulator to detect subtle anomalies, unknown to us beforehand, in its streams of high-dimensional sensor-output and actuator-input data.

Thesis Committee:
Reid Simmons (Co-chair)
Manuela Veloso (Co-chair)
Jeff Schneider
Brian Williams (Massachusetts Institute of Technology)


