Operating mobile autonomous systems in real-world environments, and having them accomplish meaningful tasks, requires a high-fidelity perceptual representation that enables efficient inference. Reasoning directly in the space of sensor observations is challenging, primarily because the nature of the measurements depends on the sensor type and, in some cases, because of the prohibitive size of the sensor data. A perceptual representation that abstracts away sensor nuances is thus required to enable effective and efficient reasoning in previously unknown environments.
This talk presents a probabilistic environment representation based on Hierarchical Gaussian Mixture Models that enables efficient, high-fidelity modeling and inference toward informed planning (active perception) on a computationally constrained mobile autonomous system. An information-theoretic methodology for estimating the required model complexity will be discussed, along with a GPU-based implementation that demonstrates real-time viability on a computationally constrained SoC. An extension of the model that yields a probabilistic representation of occupancy, with an associated measure of uncertainty, via a regression framework will also be presented.
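To make the flavor of such a representation concrete, the sketch below fits a small Gaussian mixture to a toy 2-D point cloud and queries it as a density. This is a hedged illustration using scikit-learn's `GaussianMixture`, not the presented hierarchical model or its GPU implementation; the scene geometry is invented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy "point cloud": two planar surfaces (a wall at y = 2 and a floor at
# y = 0) observed with a little sensor noise.
wall = np.c_[rng.uniform(0, 4, 500), 2.0 + rng.normal(0, 0.02, 500)]
floor = np.c_[rng.uniform(0, 4, 500), rng.normal(0, 0.02, 500)]
points = np.vstack([wall, floor])

# Fit a small mixture; a hierarchical variant would refine components
# recursively where more fidelity is needed.
gmm = GaussianMixture(n_components=4, random_state=0).fit(points)

# The fitted model is a compact, queryable density over space: a point on
# the wall should score higher than a point in free space between surfaces.
log_density = gmm.score_samples(np.array([[2.0, 2.0], [2.0, 1.0]]))
print(log_density)
```

The few mixture parameters stand in for a thousand raw points, which is the sense in which such models support inference on constrained hardware.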
Shobhit Srivastava is an M.S. student in the Robotics Institute at Carnegie Mellon University, advised by Prof. Nathan Michael. His research focuses on high-fidelity, efficient multimodal environment modeling that enables mobile autonomous systems to reason efficiently about their surroundings. He previously received his Bachelor's degree in Computer Science and Engineering from the National Institute of Technology, Allahabad, India, and worked at Qualcomm Research for a couple of years before coming to CMU for his Master's.
Nathan Michael (Chair)
Artur W. Dubrawski
As robots become more reliable and user interfaces (UIs) become more powerful, human-robot teams are being applied to more real-world problems. Human-robot teams offer redundancy and heterogeneous capabilities desirable in scientific investigation, surveillance, disaster response, and search and rescue operations. Large teams are overwhelming for a human operator, so systems employ high-level team plans to describe the operator’s supervisory roles and the team’s tasks and goals. In addition, UIs apply situational awareness (SA) techniques and mixed-initiative (MI) invocation of services to manage the operator’s workload. However, current systems use static SA and MI settings, which cannot capture changes in the plan’s context or the overall system configuration. The configuration for one domain, device, environment, or section of a plan may not be appropriate for others, limiting performance.
This thesis addresses these issues by developing a team plan language for human-robot teams and augmenting it with a situational awareness and mixed initiative (SAMI) markup language. SAMI markup captures SA techniques for UI components, MI settings for decision making, and constraints for algorithm selection at specific points in a team plan. In addition, we identify properties of the team plan language and use them to develop semantic and syntactic software agents that aid plan development.
To test the ability of the team plan language and markup to capture complex behavior and context-specific needs, we design several experiments in simulation and deploy a large team of autonomous watercraft. Run-time statistics and the team’s ability to adapt to challenges “in the wild” are used to evaluate the effectiveness of the marked-up plan language.
To assess the learnability of the language by non-experts, we design a user study evaluating a series of self-guided lessons. Users with exposure to computer science concepts complete the training material, during which task performance and interviews are used to assess the effectiveness and scalability of the material.
These contributions demonstrate an approach to improve the accessibility of human-robot teams and their performance in complex environments.
Paul Scerri (Chair)
Julie Adams (Oregon State University)
This talk will describe two projects in the area of bioinspired soft robotics. The first project’s goal is to construct a mathematical framework for real-time control of a caterpillar-inspired robot system through dynamical modeling and sensor feedback. The dynamics and control framework will be validated on an experimental testbed that includes soft, limbed robots powered by dielectric-elastomer actuators and shape-memory alloy. The long-term goal of the second project is to introduce a functionally hierarchical architecture and distributed control scheme for dexterous, underwater soft robot appendages with a high force-to-compliance ratio. The hierarchical design is inspired by the complex organization of endoskeletal elements, water vascular system, and tube-feet arrays in radially symmetrical echinoderms (such as sea stars, brittle stars, and basket stars).
Derek A. Paley is the Willis H. Young Jr. Professor of Aerospace Engineering Education in the Department of Aerospace Engineering and the Institute for Systems Research at the University of Maryland. He is the founding director of the Collective Dynamics and Control Laboratory and a member of the Maryland Robotics Center. Paley received the B.S. degree in Applied Physics from Yale University in 1997 and the Ph.D. degree in Mechanical and Aerospace Engineering from Princeton University in 2007. He is the recipient of the National Science Foundation CAREER award in 2010, the Presidential Early Career Award for Scientists and Engineers in 2012, and the AIAA National Capital Section Engineer of the Year in 2015. Paley’s research interests are in the area of dynamics and control, including cooperative control of autonomous vehicles, adaptive sampling with mobile networks, and spatial modeling of biological groups.
Active illumination systems use a controllable light source and a light sensor to measure properties of a scene. For such a system to work reliably across a wide range of environments, it must be able to handle the effects of global light transport, bright ambient light, interference from other active illumination devices, defocus, and scene motion.
The goal of this thesis is to develop computational techniques and hardware arrangements to make active illumination devices based on commodity-grade components that work under real world conditions. We aim to combine the robustness of a scanning laser rangefinder with the speed, measurement density, compactness, and economy of a consumer depth camera.
Towards this end, we have made four contributions. The first is a computational technique for compensating for the effects of motion while separating the direct and global components of illumination. The second is a method that combines triangulation and depth from illumination defocus cues to increase the working range of a projector-camera system. The third is a new active illumination device that can efficiently image the epipolar component of light transport between a source and sensor. The device can measure depth using active stereo or structured light and is robust to many global light transport effects. Most importantly, it works outdoors in bright sunlight despite using a low power source. Finally, we extend the proposed epipolar-only imaging technique to time-of-flight sensing and build a low-power sensor that is robust to sunlight, global illumination, multi-device interference, and camera shake.
We believe that the algorithms and sensors proposed and developed in this thesis could find applications in a diverse set of fields including mobile robotics, medical imaging, gesture recognition, and agriculture.
Srinivasa G. Narasimhan (Chair)
William L. “Red” Whittaker
Wolfgang Heidrich (KAUST and University of British Columbia)
Kiriakos N. Kutulakos (University of Toronto)
Acting under uncertainty is a fundamental challenge for any decision maker in the real world. As uncertainty is often the culprit of failure, many prior works attempt to reduce the problem to one with a known state. However, this fails to account for a key property of acting under uncertainty: we can often gain utility while uncertain. This thesis presents methods that utilize this property in two domains: active information gathering and shared autonomy.
For active information gathering, we present a general framework for reducing uncertainty just enough to make a decision. To do so, we formulate the Decision Region Determination (DRD) problem, modeling how uncertainty impedes decision making. We present two methods for solving this problem, which differ in their computational efficiency and performance bounds. We show that both satisfy adaptive submodularity, a natural diminishing-returns property that imbues efficient greedy policies with near-optimality guarantees. Empirically, we show that our methods outperform those that reduce uncertainty without considering how it affects decision making.
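The "just enough to decide" idea can be illustrated with a toy sketch (this is the general flavor of decision-region-driven information gathering, not the thesis's DRD algorithms): a greedy policy runs tests only until every surviving hypothesis leads to the same decision, ignoring residual uncertainty that is irrelevant to the decision.

```python
import itertools

# Hypotheses are 3-bit vectors; each test reveals one bit; the optimal
# decision here depends only on bit 0, so bits 1 and 2 are decision-
# irrelevant. All of this structure is invented for illustration.
hypotheses = list(itertools.product([0, 1], repeat=3))
regions = {h: h[0] for h in hypotheses}  # hypothesis -> decision region
truth = (0, 1, 1)                        # the (unknown) true hypothesis

candidates = set(hypotheses)
tests_run = []

def worst_case(t):
    # Greedy criterion: a test's worst-case surviving-hypothesis count.
    return max(sum(1 for h in candidates if h[t] == b) for b in (0, 1))

# Stop as soon as all surviving hypotheses share a single decision region.
while len({regions[h] for h in candidates}) > 1:
    test = min(set(range(3)) - set(tests_run), key=worst_case)
    tests_run.append(test)
    outcome = truth[test]
    candidates = {h for h in candidates if h[test] == outcome}

print(tests_run)  # with this region structure, a single test suffices
```

Note the policy terminates while four hypotheses remain: the uncertainty left over does not affect which decision is best, which is exactly the savings a decision-driven objective buys over indiscriminate uncertainty reduction.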
For shared autonomy, we first show how the general problem of assisting with an unknown user goal can be modeled as one of acting under uncertainty. We then present our framework, based on Hindsight Optimization or QMDP, which enables us to assist for a distribution of user goals by minimizing the expected cost. We evaluate our framework on real users, demonstrating that our method achieves goals faster, requires less user input, decreases user idling time, and results in fewer user-robot collisions than methods that rely on predicting a single user goal. Finally, we extend our framework to learn how user behavior changes with assistance, and incorporate this model into cost minimization.
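The expected-cost idea behind QMDP-style assistance can be sketched as follows; the goals, belief, candidate actions, and distance-based cost below are invented stand-ins, not the thesis's formulation.

```python
import numpy as np

# The robot keeps a belief over candidate user goals and picks the motion
# that minimizes cost-to-go in expectation under that belief.
goals = np.array([[3.0, 0.0], [0.0, 3.0]])   # two candidate goal positions
belief = np.array([0.8, 0.2])                # P(goal | observed user inputs)
robot = np.array([0.0, 0.0])
actions = [np.array(a, dtype=float) for a in ([1, 0], [0, 1], [0.7, 0.7])]

def cost_to_go(pos, goal):
    # Stand-in Q-value: remaining distance to the goal after the action.
    return np.linalg.norm(goal - pos)

# Hindsight-optimization-style choice: argmin over expected cost-to-go.
expected = [sum(b * cost_to_go(robot + a, g) for b, g in zip(belief, goals))
            for a in actions]
best = actions[int(np.argmin(expected))]
print(best)  # leans toward the more probable goal
```

Because the expectation is taken over the whole goal distribution, the chosen motion hedges toward likely goals rather than committing to a single predicted one.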
J. Andrew (Drew) Bagnell (Co-Chair)
Siddhartha S. Srinivasa (Co-Chair)
Wolfram Burgard (University of Freiburg)
The goal of this dissertation is to develop computational models for robots to detect and sustain the spatial patterns of behavior that naturally emerge during free-standing group conversations with people. These capabilities have often been overlooked by the Human-Robot Interaction (HRI) community, but they are essential for robots to appropriately interact with and around people in many human environments.
In this work, we first develop a robotic platform for studying human-robot interactions and contribute new experimental protocols to investigate group conversations with robots. The studies that we conducted with these protocols examine various aspects of these interactions and experimentally validate the idea that people tend to establish with robots the spatial formations typical of human conversations. These formations emerge as the members of the interaction cooperate to sustain a single focus of attention, maximizing their opportunities to monitor one another's mutual perceptions during conversations.
Second, we introduce a general framework to track the lower-body orientation of free-standing people in human environments and to detect their conversational groups based on their spatial behavior. This framework takes advantage of the mutual dependency between the two problems. Lower-body orientation is a key descriptor of spatial behavior and, thus, can help detect group conversations. Meanwhile, knowing the location of group conversations can help estimate people's lower-body orientation, because these interactions often bias human spatial behavior. We evaluate this framework in a public computer vision benchmark for group detection, and show how it can be used to estimate the members of a robot's group conversation in real-time.
Third, we study how robots should orient with respect to a group conversation to cooperate to sustain the spatial arrangements typical of these interactions. To this end, we conduct an experiment to study the effects of varied orientation and gaze behaviors for robots during social conversations. Our results reinforce the importance of communicative motion behavior for robots, and suggest that their body and gaze behaviors should be designed and controlled jointly, rather than independently of each other. We then show in simulation that it is possible to use reinforcement learning techniques to generate socially appropriate orientation behavior for robots during group conversations. These techniques reduce the amount of engineering required to enable robots to sustain spatial formations typical of conversations while communicating attentiveness to the focus of attention of the interaction.
Overall, our efforts show that reasoning about spatial patterns of behavior is useful for robots. This reasoning can help with perception tasks as well as generating appropriate robot behavior during social group conversations.
Aaron Steinfeld (Co-chair)
Scott E. Hudson (Co-chair)
Brian Scassellati (Yale University)
Rapid autonomous exploration of challenging, GPS-denied environments, such as underground mines, provides essential information to search and rescue and defense operations. We pursue a distributed perception strategy that builds a consistent global map across a team of robots. Environments that exhibit repetitive structure lead to ambiguities in observation correspondences, complicating both the estimation of relative transforms between robots and the generation of a consistent global map.
Real-world communication constraints limit a robot from sharing large numbers of observations at high fidelity. Naively simplifying the information leads to loss of unique features and an increase in perceptual aliasing. Towards sharing the most relevant subset of information, we develop a scan utility function based on information theoretic measures for scan information and feature-based place recognition approaches to assess loop closure potential. Using the utility function to rank scans, we formulate an offer-response-request framework, Communication Constrained Information Routing (CCIR), that ensures operation under stringent bandwidth restrictions.
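The ranking-and-sharing idea can be sketched as below; the utility weighting, scan scores, and bandwidth budget are invented for illustration, and CCIR's actual utility and offer-response-request protocol are richer than this greedy fill.

```python
# Score each scan with an information measure plus a loop-closure term,
# then greedily share the best scans that fit a bandwidth budget.
scans = {
    "scan_a": {"entropy": 4.2, "loop_closure_score": 0.9, "kb": 120},
    "scan_b": {"entropy": 3.1, "loop_closure_score": 0.2, "kb": 80},
    "scan_c": {"entropy": 4.0, "loop_closure_score": 0.7, "kb": 150},
}

def utility(s):
    # Assumed combination: favor informative scans likely to close loops.
    return s["entropy"] + 2.0 * s["loop_closure_score"]

budget_kb = 220
ranked = sorted(scans, key=lambda name: utility(scans[name]), reverse=True)

sent, used = [], 0
for name in ranked:  # fill the channel with the highest-utility scans
    if used + scans[name]["kb"] <= budget_kb:
        sent.append(name)
        used += scans[name]["kb"]
print(sent, used)
```

The point of the ranking is that, under a hard budget, the robots exchange the scans most likely to produce inter-robot loop closures rather than a naively downsampled stream.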
Given the ability to share rich 3D information over constrained networks, we pursue full 3D mapping via extensions and robustification techniques. The robust measures we introduce allow operation in the mine despite substantial perceptual aliasing.
To enable operation in environments that exhibit aliasing exceeding the performance characteristics of the developed framework, we detail first results for an approach that moves away from feature-based techniques and introduces a methodology utilizing Hierarchical Gaussian Mixture Models (HGMMs). By regenerating the point cloud from the HGMM and applying Generalized Iterative Closest Point, we are able to detect loop closures accurately with an outlier rate significantly lower than that of feature-based methods.
Vibhav Ganesh is a second-year master's student in the Robotics Institute at Carnegie Mellon University, advised by Prof. Nathan Michael. Prior to his master's program, he received his B.S. in Computer Science at Carnegie Mellon University. He is interested in distributed perception algorithms under real-world constraints.
Nathan Michael (Chair)
Algorithms for human motion understanding have a wide variety of applications, including health monitoring, performance assessment, and user interfaces. However, differences between individual styles make it difficult to achieve robust performance, particularly for individuals who were not in the training population. We believe that adapting algorithms to individual behaviors is essential for effective human motion understanding. This thesis therefore explores algorithms for personalizing a general classifier to particular test subjects given their unlabeled data or small quantities of labeled data.
Many applications, such as action or gesture recognition, contain multiple classes. For example, the REALDISP activity recognition dataset contains 33 different actions, such as walking, running, jogging, and cycling. In this thesis, we present a multi-class formulation of the Selective Transfer Machine (STM), which combines Kernel Mean Matching with a Support Vector Machine (SVM) to personalize the classifier given a test subject's unlabeled data. We apply this algorithm to two real and four synthetic datasets, and propose several potential improvements.
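The Kernel Mean Matching step can be illustrated in isolation. The real KMM solves a constrained quadratic program; the sketch below substitutes a ridge-regularized linear solve as a deliberate simplification, and the toy one-dimensional "subjects" are invented. The weights it produces could then personalize a classifier (e.g., via an SVM's per-sample weights).

```python
import numpy as np

rng = np.random.default_rng(1)
# Training subjects centered at 0; the test subject's unlabeled data is
# shifted, a toy stand-in for individual style differences.
X_train = rng.normal(0.0, 1.0, (200, 1))
X_test = rng.normal(1.0, 1.0, (100, 1))

def gaussian_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

# KMM chooses weights w so the weighted training mean embedding matches the
# test mean embedding: roughly, minimize ||K w / n - kappa||.
K = gaussian_kernel(X_train, X_train)
kappa = gaussian_kernel(X_train, X_test).mean(axis=1)
lam = 0.5  # ridge term replacing the QP's constraints (simplification)
w = np.linalg.solve(K / len(X_train) + lam * np.eye(len(X_train)), kappa)
w = np.clip(w, 0.0, None)

# Training points that resemble the test subject receive larger weight.
near = w[X_train[:, 0] > 0.5].mean()
far = w[X_train[:, 0] < -0.5].mean()
print(near, far)
```

Upweighting the training samples that look like the test subject is what lets a single general classifier behave like a personalized one without any test labels.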
In some applications, labeling events accurately in training data is difficult or impossible. Algorithms for these applications should only require weakly-labeled training data. In this thesis, we evaluate five standard, weakly-supervised algorithms on Parkinson's Disease (PD) tremor detection. We also describe a modification that allows algorithms to take advantage of knowing the approximate amount of tremor within each segment. We find that these modified algorithms show little decrease in performance as the length of the training time segments increases to ten minutes. We propose to develop a personalized, weakly-supervised algorithm and apply it to PD tremor detection in wrist-worn accelerometer data collected in patients' homes.
In other applications, such as when measuring disease severity or surgeon expertise, labels come from a continuous spectrum. In these cases, a classification algorithm, which assumes discrete classes, may not be the best approach. Algorithms that attempt to fit a function to the data are more appropriate. We propose to apply personalization to modeling surgeon learning curves. We also plan to use personalized regression to predict surgeon expertise on data collected from the da Vinci surgical robot.
In summary, this thesis will explore the application of personalization to human activity and surgical gesture recognition, PD tremor detection, surgeon learning curve modeling, and surgical expertise prediction. In doing so, we will develop personalized algorithms in the context of multi-class classification, weakly-supervised classification, and function modeling.
Fernando De la Torre (Co-chair)
Jessica Hodgins (Co-Chair)
Anthony Jarc (Intuitive Surgical, Inc.)
Pose estimation is central to several robotics applications, such as registration, manipulation, and SLAM. In this thesis, we develop probabilistic approaches for fast and accurate pose estimation. A fundamental contribution of this thesis is formulating pose estimation in a parameter space in which the problem is truly linear and in which globally optimal solutions can thus be guaranteed. It should be stressed that the approaches developed in this thesis are inherently linear, as opposed to relying on linearization or other approximations commonly made by existing techniques, which are computationally expensive and highly sensitive to initial estimation error.
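For context, a classical example of a pose estimate that is globally optimal without iteration or linearization is the SVD-based alignment of corresponding point sets (the Kabsch/Umeyama solution). This is standard background with known correspondences, not the parameterization or the probabilistic machinery developed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
# Ground-truth rigid transform to recover.
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])

P = rng.normal(size=(50, 3))        # model points
Q = P @ R_true.T + t_true           # observed points (known correspondence)

# Closed-form alignment: center both clouds, SVD the cross-covariance.
Pc, Qc = P - P.mean(0), Q - Q.mean(0)
U, _, Vt = np.linalg.svd(Pc.T @ Qc)
D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflections
R_est = Vt.T @ D @ U.T
t_est = Q.mean(0) - R_est @ P.mean(0)

print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

The solution is exact in one shot, which is the kind of global-optimality guarantee that linearization-based iterative methods lack; the thesis pursues analogous guarantees in a probabilistic setting.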
This thesis will demonstrate that the choice of probability distribution significantly impacts performance of the estimator. The distribution must respect the underlying structure of the parameter space to ensure any optimization, based on such a distribution, produces a globally optimal estimate, despite the inherent nonconvexity of the parameter space.
Furthermore, in applications such as registration and three-dimensional reconstruction, the correspondence between the measurements and the geometric model is typically unknown. In this thesis we develop probabilistic methods to deal with cases of unknown correspondence.
We plan to extend our approaches to applications requiring dynamic pose estimation. We also propose to incorporate probabilistic means for finding the data association, inspired by the recent work of Billings et al. Finally, we will develop a filtering approach based on the Gilitschenski distribution that accounts for the constraints on both rotation and translation parameters without decoupling them.
Howie Choset (Chair)
Russell H. Taylor (Johns Hopkins University)
Nabil Simaan (Vanderbilt University)
Mechatronic Design (16-778/18-578/24-778) teams will briefly describe and show videos of their ShipBot and Window Washer machines (2:30-3:30 p.m.), followed by an at-table exhibition (3:30-4:00 p.m.) and competition (4:00-5:30 p.m.).