Automatic analysis of facial actions (AFA) can reveal a person's emotion, intention, and physical state, and make possible a wide range of applications.  To enable reliable, valid, and efficient AFA, this thesis investigates both supervised and unsupervised learning.

Supervised learning for AFA is challenging, in part, because of individual differences among persons in face shape and appearance and variation in video acquisition and context.  To improve generalizability across persons, we propose a transductive framework, Selective Transfer Machine (STM), which personalizes generic classifiers through joint sample reweighting and classifier learning.  By personalizing classifiers, STM offers improved generalization to unknown persons.  As an extension, we develop a variant of STM for use when partially labeled data are available.

Additional challenges for supervised learning include learning an optimal representation for classification, variation in base rates of action units (AUs), correlation between AUs, and temporal consistency.  While these challenges could be partly accommodated with an SVM or STM, a more powerful alternative is afforded by an end-to-end supervised framework (i.e., deep learning).  We propose a convolutional network with long short-term memory (LSTM) and multi-label sampling strategies.  We compared SVM, STM, and deep learning approaches with respect to AU occurrence and intensity within and between the BP4D+ and GFT databases (approximately 0.6 million annotated frames).

Annotated video is not always available or desirable. We introduce an unsupervised Branch-and-Bound framework to discover correlated facial actions in un-annotated video. We term this approach Common Event Discovery (CED).  We evaluate CED in video and motion capture data.  CED achieved moderate convergence with supervised approaches and enabled discovery of novel patterns hidden from supervised approaches.

Thesis Committee:
Fernando De la Torre (Co-chair)
Jeffrey F. Cohn (Co-chair)
Simon Lucey
Deva Ramanan
Vladimir Pavlovic (Rutgers University)

Copy of Thesis Document

Bipedal animals exhibit a diverse range of gaits and gait transitions, which allow them to travel robustly over terrains of varying grade, roughness, and compliance. Bipedal robots should be capable of the same. Despite these clear goals, state-of-the-art humanoid robots have not yet demonstrated locomotion behaviors that are as robust or varied as those of humans and animals. Current model-based controllers for bipedal locomotion target individual gaits rather than realizing walking and running behaviors within a single framework. Recently, researchers have proposed using the spring mass model (SMM) as a compliant locomotion paradigm to create a unified controller for multiple gaits. Initial studies have revealed that policies exist for the SMM that exhibit diverse behaviors including walking, running, and transitions between them. However, many of these control laws are designed empirically and do not necessarily maximize robustness. Furthermore, the vast majority of these controllers have not yet been demonstrated on physical hardware, so their utility for real-world machines remains unclear.

This thesis will investigate gait transition policies for the SMM that maximize an objective measure of robustness. We hypothesize that these control policies exist within the SMM framework and can be numerically calculated with guaranteed optimality and convergence. Specifically, we aim to investigate the following two claims. 1) All proposed SMM gait transition policies can be computed using reinforcement learning techniques with linear function approximators. 2) This method can generate new policies which maximize the basin of attraction between walking and running states.  Initial results show that these reinforcement learning methods can indeed learn existing SMM policies previously found through Poincaré analysis. If these algorithms are successful in finding globally optimal policies, they may lead to bipedal locomotion controllers with both diverse behaviors and largely improved robustness.

We will experimentally evaluate the utility of these control policies for human-scale bipedal robots. This thesis will extend our analysis of SMM policies on the ATRIAS robot platform to include multiple gaits and gait transitions. Our initial hardware implementation of SMM running has revealed two technical challenges we will address. 1) Modeling errors in both the simplified model and the higher-order robot model degrade performance relative to simulation. We will investigate reducing this degradation with online methods of parameter estimation and learning. 2) Our experiments have only evaluated planar running and must be extended to include 3D locomotion. If these two challenges are overcome, we will have experimentally evaluated SMM running, walking, and the transitions between them on a physical bipedal robot.

Thesis Committee:
Hartmut Geyer (Chair)
Chris Atkeson
Stelian Coros
Jan Peters (TU Darmstadt)

Copy of Proposal Document

We research autonomous mobile robots with a seamless integration of perception, cognition, and action. In this talk, I will first introduce our CoBot service robots and their novel localization and symbiotic autonomy, which enable them to move consistently in our buildings, now for more than 1,000 km. I will then introduce multiple human-robot interaction contributions, and detail the use and planning for language-based complex commands, and robot learning from instruction and correction. I will conclude with how the robots generate explanations in reply to language-based requests about their autonomous experience. The work reported is joint with my students and collaborators in the CORAL research group.

Manuela M. Veloso is the Herbert A. Simon University Professor in the School of Computer Science at Carnegie Mellon University. She is the Head of the Machine Learning Department, and she has a joint appointment in the Computer Science Department and courtesy appointments in the Robotics Institute and Electrical and Computer Engineering Department. She researches in Artificial Intelligence and Robotics. She founded and directs the CORAL research laboratory, for the study of autonomous agents that Collaborate, Observe, Reason, Act, and Learn, www.cs.cmu.edu/~coral. Professor Veloso is an ACM Fellow, IEEE Fellow, AAAS Fellow, AAAI Fellow, Einstein Chair Professor, the co-founder and past President of RoboCup, and past President of AAAI. Professor Veloso and her students research with a variety of autonomous robots, including mobile service robots and soccer robots.

Faculty Host: Martial Hebert

Progress in ophthalmology over the past decade has moved preclinical data to clinical proof-of-concept studies, bringing innovative therapeutic strategies to the market. Diseases such as retinitis pigmentosa (RP) and age-related macular degeneration (AMD) destroy photoreceptors but leave intact and functional a significant number of inner retinal cells. Retinal prostheses have demonstrated the ability to reactivate the remaining retinal circuits at the level of bipolar or ganglion cells after photoreceptor loss. Recent clinical trials have demonstrated partial restoration of vision in blind people by epiretinal (Second Sight Medical Products, Pixium Vision) and subretinal (Retina Implant AG) implants, which are now in clinical practice as well. Despite a limited number of electrodes, some patients were even able to read words and recognize high-contrast objects. Currently, researchers at Stanford University and Pixium Vision, in collaboration with Institut de la Vision, are developing a wirelessly powered photovoltaic prosthesis in which each pixel of the subretinal array directly converts patterned pulsed near-infrared light projected from video goggles into local electric current to stimulate the nearby retinal neurons. A new asynchronous dynamic visual sensor whose function mimics photoreceptor and retinal cell responses is also under development. Optogenetics (currently under preclinical evaluation in primates) and cell therapy (ongoing first safety and tolerability clinical trials with hESC- and iPSC-derived RPE) provide alternative approaches for vision restoration in patients with advanced stages of retinal degeneration. A combination of different therapeutic strategies may offer enhanced therapeutic effectiveness and more efficient ways to save vision. These new therapeutic tools call for identification of appropriate patient selection criteria and methods to evaluate treatments’ efficiency and assess the real benefit experienced by the patients.

José-Alain Sahel studied medicine at the Medical School of Paris University and ophthalmology at the University of Strasbourg and at Harvard University (Boston-Cambridge, USA). He was appointed Professor of Ophthalmology at the University Louis Pasteur, Strasbourg. Currently, José-Alain Sahel is Professor of Ophthalmology at Pierre and Marie Curie University Medical School, Paris, France and Cumberlege Professor of Biomedical Sciences at the Institute of Ophthalmology-University College London, UK. He chairs the Departments of Ophthalmology at the Quinze-Vingts National Eye Hospital and at the Rothschild Ophthalmology Foundation. He was recently appointed Professor and Chairman of the Department of Ophthalmology at the University of Pittsburgh Medical School and The Eye and Ear Foundation Endowed Chair. The primary focus of Sahel’s fundamental and clinical research is the understanding of the mechanisms associated with retinal degeneration, together with the conception, development and evaluation of innovative treatments for retinal diseases, with a special focus on genetic rod-cone dystrophies (e.g. neuroprotection, stem cells, gene therapy, pharmacology, and artificial retina).

Faculty Host: Martial Hebert

The development of fast randomized algorithms for geometric path planning – computing collision-free paths for high dimensional systems – was a major achievement in the field of motion planning in the 2000s. Since then, advances in affordable robot sensors, actuators, and systems have changed the robotics playing field, making many of the assumptions of geometric path planning obsolete. This talk will present new mathematical paradigms and algorithms that are beginning to address some of the issues faced in robotics today and into the future. Specifically, this talk will address the issues of optimality, plan interpretability, planning with contact, and integrating planning with perception and learning.

Kris Hauser is an Associate Professor at the Pratt School of Engineering at Duke University with a joint appointment in the Electrical and Computer Engineering Department and the Mechanical Engineering and Materials Science Department. He received his PhD in Computer Science from Stanford University in 2008, bachelor's degrees in Computer Science and Mathematics from UC Berkeley in 2003, and worked as a postdoctoral fellow at UC Berkeley. He then joined the faculty at Indiana University from 2009-2014, where he started the Intelligent Motion Lab. He is a recipient of a Stanford Graduate Fellowship, Siebel Scholar Fellowship, and the NSF CAREER award.

Faculty Host: Sidd Srinivasa



Unmanned Aerial Systems (UASs) are increasingly being used for everything from crop surveying to pipeline monitoring. They are significantly cheaper than the traditional manned airplane or helicopter approaches to obtaining aerial imagery and sensor data. The next generation of UASs, however, will do more than simply observe. In this talk, I will discuss recent advances we have made in the Nimbus Lab in developing the first UAS that can ignite prescribed fires. Prescribed fire is a critical tool used to improve habitats, combat invasive species, and reduce fuels to prevent wildfires. In the United States alone, federal and state governments use prescribed burns on over 3 million acres each year, with private landowners prescribing even more. Yet this activity can be extremely dangerous, especially when performing interior ignitions in difficult terrain.

In this talk, I will discuss the history of this project and the challenges associated with flying near and igniting fires. In addition, I will detail the mechanical and software design challenges we have had to overcome in this project. I will also present the results of the first two prescribed burns that were successfully ignited by a UAS. Finally, I will discuss automated software analysis techniques we are developing to detect and correct system errors to reduce risk and increase safety when using UASs to ignite prescribed burns.

Dr. Carrick Detweiler is an Associate Professor in the Computer Science and Engineering department at the University of Nebraska-Lincoln. He co-directs and co-founded the Nebraska Intelligent MoBile Unmanned Systems (NIMBUS) Lab at UNL. His research focuses on improving the robustness and safety of aerial robots and sensor systems operating in the wild. Carrick obtained his B.A. in 2004 from Middlebury College and his Ph.D. in 2010 from MIT CSAIL. He is a Faculty Fellow at the Robert B. Daugherty Water for Food Institute at UNL and recently received the 2016 College of Engineering Edgerton Innovation Award. He is currently leading NSF and USDA projects focused on developing the systems and software to enable interactions of UAVs with water, fire, and crops. In addition to research activities, Carrick actively promotes the use of robotics in the arts through workshops and collaborations with the international dance companies Pilobolus and STREB.

Faculty Host: Stephen Nuske

Mobile robots are increasingly being deployed in the real world in response to a heightened demand for applications such as transportation, delivery, and inspection. The motion planning systems for these robots are expected to have consistent performance across the wide range of scenarios that they encounter. While state-of-the-art planners can be adapted to solve these real-time kinodynamic planning problems, their performance varies vastly across diverse scenarios. This thesis proposes that the motion planner for a mobile robot must adapt its search strategy to the distribution over planning problems that the robot encounters.

We address three principal challenges of this problem. Firstly, we show that even when the planning problem distribution is fixed, designing a non-adaptive planner can be challenging due to the unpredictability of its performance. We discuss how to alleviate this issue by leveraging a diverse ensemble of planners. Secondly, when the distribution is varying, we require a meta-planner that can use context to automatically select an ensemble from a library of black-box planners. We show both theoretically and empirically that greedily training a list of predictors to focus on failure cases leads to an effective meta-planner. Finally, in the interest of computational efficiency, we want a white-box planner that adapts its search strategy during a planning cycle. We show how such a strategy can be trained efficiently in a data-driven imitation learning framework.
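The greedy, failure-focused selection described above can be illustrated with a deliberately simplified sketch. It is not the thesis method: it ignores context features entirely and simply picks, at each step, the planner that solves the most training problems still unsolved by the planners already chosen. The planner names, the problem ids, and the function `greedy_ensemble` are hypothetical.

```python
def greedy_ensemble(solves, k):
    """Greedily choose up to k planners.

    solves maps a (hypothetical) planner name to the set of training
    problem ids that planner solved. Each step adds the planner that
    covers the most problems every earlier choice failed on.
    """
    chosen, covered = [], set()
    for _ in range(k):
        best = max(
            (name for name in solves if name not in chosen),
            key=lambda name: len(solves[name] - covered),
            default=None,
        )
        # Stop when no planner is left or none adds new coverage.
        if best is None or not (solves[best] - covered):
            break
        chosen.append(best)
        covered |= solves[best]
    return chosen, covered


# Toy data (assumed, not from the thesis).
solves = {
    "planner_a": {1, 2, 3, 4},
    "planner_b": {3, 4, 5},
    "planner_c": {5, 6},
    "planner_d": {1, 2},
}
chosen, covered = greedy_ensemble(solves, k=2)
print(chosen, sorted(covered))
```

In this toy case the second slot goes to the planner that covers the first planner's failures rather than to the second-best planner overall, which is the intuition behind focusing each new predictor on failure cases.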

Based on our preliminary investigations, we propose to examine three sub-problems that will lead to an effective adaptive motion planning framework. The first is learning heuristic and collision checking policies that optimize search effort by adapting to the distribution of obstacles in the environment. The second is to train context-efficient meta-planners that use planner performance as additional feedback. The third is to automatically deal with failure cases that occur during online execution.

We evaluate the efficacy of our framework on a spectrum of motion planning problems with a primary focus on an autonomous full-scale helicopter. We expect that our framework will enable mobile robots to navigate seamlessly across different missions without the need for human intervention.

Thesis Committee:
Sebastian Scherer (Chair)
Siddhartha Srinivasa
Martial Hebert
Ashish Kapoor (Microsoft Research)

Copy of Proposal Document

Mine rescue robots rely heavily on visual cameras and LIDAR, which succeed in clear conditions. However, the worst mine disasters are caused by roof falls, explosions, and fires, which generate thick dust and smoke that obscure traditional sensing and thwart robot perception.

This talk presents the failures of traditional sensing techniques in smoky conditions. Prior classical work investigated sonar, radar, and LIDAR. Sonar is unaffected by smoke but is only useful for safeguarding and hyperlocal navigation. Radar is coarse and lacks the resolution required for robot navigation. LIDAR and cameras are the sensors of choice, and prior perception methods include LIDAR-camera fusion for mapping, SLAM for exploratory modeling, and loop closure for navigating mine corridor networks. However, cameras and LIDAR—and hence their associated navigation methodologies—fail in heavy smoke.

This presentation introduces the merits of thermal and Episcan3D sensing in these environments. The Episcan3D is a new class of sensor that improves viewing and range sensing through light smoke. Rather than broad illumination, which creates whiteout in a smoky scene, the Episcan3D illuminates with only a single ray at a time, which reduces scatter. Yet, even the Episcan3D is obscured in heavy smoke. Thermal imaging is not obscured by smoke but succeeds best outdoors where large thermal gradients exist. Underground mines are isolated from large fluctuations in temperature, so thermal features are often too indistinct and sparse for traditional SLAM. This research specialized direct SLAM methods for operation on thermal imagery and evaluated their suitability for robot navigation in underground mines.

During this research, a multi-modal dataset was collected for future work toward robotic underground mine rescue. A rover carrying visual, thermal, inertial, and LIDAR sensors was deployed and driven through a smoke-filled mine. Applications of this dataset for future research include thermal SLAM, subterranean navigation, multi-modal mapping, sensor fusion, and victim identification.

Joe Bartels is a Ph.D. student in the Robotics Institute. His current research interests include sensing and perception for visually degraded environments. Prior to attending Carnegie Mellon University, Joe completed a B.S. and M.S. in Mechanical Engineering at the University of Nebraska-Lincoln.

Presented as part of the Robotics Institute Speaking Qualifier.

An iterative model-based method is proposed for improving linguistic structure, segmentation, and prosodic annotations that correspond to the delivery of each utterance as regularized across the data. For each iteration, the training utterances are resynthesized according to the existing symbolic annotation. Values of various features and subgraph structures are "twiddled": each is perturbed based on the features and constraints of the model. Twiddled utterances are evaluated using an objective function appropriate to the type of perturbation and compared with the unmodified, resynthesized utterance. The instance with the least error is assigned as the current annotation, and the entire process is repeated. At each iteration, the model is re-estimated, and the distributions and annotations regularize across the corpus. As a result, the annotations have more accurate and effective distributions, which leads to improved control and expressiveness given the features of the model.
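The perturb, evaluate, and keep-the-best loop can be sketched as follows. This is only an illustration under strong assumptions: annotations are reduced to a dictionary of numeric feature values, the objective is a toy squared error against a made-up target, and the per-iteration model re-estimation described above is omitted. The names `twiddle`, `error_fn`, `pitch`, and `duration` are hypothetical.

```python
def twiddle(annotation, error_fn, step=1.0, iterations=20):
    """Iteratively perturb one feature at a time and keep the lowest-error variant."""
    current = dict(annotation)
    for _ in range(iterations):
        # Baseline: the unmodified annotation and its error.
        best, best_err = current, error_fn(current)
        for key in current:
            for delta in (-step, step):
                candidate = dict(current)
                candidate[key] += delta
                err = error_fn(candidate)
                if err < best_err:
                    best, best_err = candidate, err
        if best is current:
            break  # no perturbation beat the unmodified annotation
        current = best
    return current


# Toy objective (assumed): squared distance to a made-up target annotation.
target = {"pitch": 3.0, "duration": -2.0}


def squared_error(annotation):
    return sum((annotation[k] - target[k]) ** 2 for k in target)


result = twiddle({"pitch": 0.0, "duration": 0.0}, squared_error)
print(result)
```

Each pass changes at most one feature, so the loop behaves like a simple coordinate search; a real system would substitute an objective computed on the resynthesized utterance.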

Thesis Committee:
Alan W. Black (Chair)
Jack Mostow
Alex Rudnicky
Julia Hirschberg (Columbia University)

Copy of Draft Thesis Document

Air quality has long been a major health concern for citizens around the world, and increased levels of exposure to fine particulate matter (PM2.5) have been definitively linked to serious health effects such as cardiovascular disease, respiratory illness, and increased mortality. PM2.5 is one of six attainment criteria pollutants used by the EPA, and is similarly regulated by many other governments worldwide. Unfortunately, the high cost and complexity of most current PM2.5 monitors result in a lack of detailed spatial and temporal resolution, which means that concerned individuals have little insight into their personal exposure levels. This is especially true regarding hyper-local variations and short-term pollution events associated with industrial activity, heavy fossil fuel use, or indoor activity such as cooking.

Advances in sensor miniaturization, decreased fabrication costs, and rapidly expanding data connectivity have encouraged the development of small, inexpensive devices capable of estimating PM2.5 concentrations. This new class of sensors opens up new possibilities for personal exposure monitoring. It also creates new challenges related to calibrating and characterizing inexpensively manufactured sensors to provide the level of precision and accuracy needed to yield actionable information without significantly increasing device cost.

This thesis addresses the following two primary questions:

  1. Can an inexpensive air quality monitor based on mass-manufactured dust sensors be calibrated efficiently in order to achieve inter-device agreement in addition to agreement with professional and federally-endorsed particle monitors?
  2. Can an inexpensive air quality monitor increase the confidence and capacity of individuals to understand and control their indoor air quality?

In the following thesis, we describe the development of the Speck fine particulate monitor. The Speck processes data from a low-cost dust sensor using a Kalman filter with a piecewise sensing model. We have optimized the parameters for the algorithm through short-term co-location tests with professional HHPC-6 particle counters, and verified typical correlations between the Speck and HHPC-6 units of r² > 0.90. To account for variations in sensitivity, we have developed a calibration procedure whereby fine particles are aerosolized within an open room or closed calibration chamber. This allows us to produce Specks for commercial distribution as well as the experiments presented herein.
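As a rough illustration of how a Kalman filter can be combined with a piecewise sensing model, the sketch below maps raw sensor counts to a concentration with a two-segment linear function and then smooths the result with a scalar random-walk Kalman filter. It is not the Speck algorithm: the breakpoint, slopes, noise variances, sample data, and function names are all assumptions chosen for the example, and the actual filter may incorporate the sensing model differently.

```python
def piecewise_sensing(raw):
    """Hypothetical two-segment map from raw counts to a concentration estimate."""
    if raw < 100.0:
        return 0.05 * raw                   # low range: shallow slope
    return 5.0 + 0.12 * (raw - 100.0)       # high range: steeper, continuous at raw = 100


def kalman_1d(measurements, process_var=1.0, meas_var=25.0, x0=0.0, p0=100.0):
    """Scalar Kalman filter with a random-walk state model (assumed parameters)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: state unchanged, uncertainty grows
        k = p / (p + meas_var)     # Kalman gain
        x = x + k * (z - x)        # update the estimate toward the measurement
        p = (1.0 - k) * p          # update the estimate variance
        estimates.append(x)
    return estimates


# Made-up raw counts, including one short spike.
raw_counts = [80.0, 85.0, 90.0, 300.0, 88.0, 84.0]
readings = [piecewise_sensing(r) for r in raw_counts]
smoothed = kalman_1d(readings)
print([round(v, 2) for v in smoothed])
```

Raising `meas_var` relative to `process_var` makes the filter trust each reading less, so short spikes are damped more strongly at the cost of slower response to real changes.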

Drawing from previous pilot studies, we have distributed low-cost monitors through local library systems and community groups. Pre-deployment and post-deployment surveys characterize user perception of personal exposure and the effect of a low-cost fine particulate monitor on empowerment.

Thesis Committee:
Illah Nourbakhsh (Chair)
Aaron Steinfeld
Albert Presto
James Longhurst (University of the West of England, Bristol)

Copy of Draft Thesis Document

