Loose, granular terrain can cause rovers to slip and sink, inhibiting mobility and sometimes even permanently entrapping a vehicle. Traversability of granular terrain is difficult to foresee using traditional, non-contact sensing methods, such as cameras and LIDAR. This inability to detect loose terrain hazards has caused significant delays for rovers on both the Moon and Mars and, most notably, contributed to Spirit's permanent entrapment in soft sand on Mars. These delays are caused both by slipping in unidentified loose sand and by wasting time analyzing or completely circumventing benign sand. Reliable prediction of terrain traversability would greatly improve both the safety and the operational speed of planetary rover operations. This thesis leverages thermal inertia measurements and physics-based terramechanics models to develop algorithms for slip prediction in planetary granular terrain.

The ability of a rover to traverse granular terrain is a complex function of the geometry of the terrain, the rover's configuration, and the physical properties of the granular material, such as density and particle geometry. Vision-based traversability prediction methods are inherently limited. Subsurface characteristics are not exclusively correlated with visual appearance of the surface layer. Vision does not provide enough information to fully understand all the physical properties that influence mobility. The inherent difficulty of estimating traversability is compounded by the conservative nature of planetary rover operations. Mission operators actively avoid potentially hazardous regions, which makes strictly data-driven regression approaches difficult due to limited data.

Pre-proposal research has shown that thermal inertia is correlated with traversability and improves estimates of it. This has been demonstrated both in terrestrial experiments and by using data from the Curiosity rover. Unlike visual appearance, the thermal properties of a material are influenced not only by the surface of the terrain but also by the physical properties of the underlying material. This thesis develops techniques for predicting the traversability of terrain by leveraging thermal inertia measurements to provide a greater understanding of material properties both at and below the surface.
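For readers unfamiliar with the quantity, thermal inertia is conventionally defined as the square root of the product of thermal conductivity, bulk density, and specific heat capacity. The sketch below only illustrates that definition. The function name and the material values are illustrative assumptions, not measurements or code from the thesis.

```python
import math

def thermal_inertia(conductivity, density, specific_heat):
    """Thermal inertia sqrt(k * rho * c), in J m^-2 K^-1 s^-1/2.

    conductivity  -- thermal conductivity k, W m^-1 K^-1
    density       -- bulk density rho, kg m^-3
    specific_heat -- specific heat capacity c, J kg^-1 K^-1
    """
    return math.sqrt(conductivity * density * specific_heat)

# Placeholder values chosen only to show the trend (assumptions, not thesis data):
# loose, poorly conducting sand versus a denser, better-conducting material.
loose_sand = thermal_inertia(0.01, 1300.0, 800.0)
compact_soil = thermal_inertia(0.1, 1800.0, 800.0)
print(loose_sand < compact_soil)
```

Under these assumed values the loose material has the lower thermal inertia, which is the kind of contrast a thermal measurement could expose even when two surfaces look alike.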

The proposed research will develop computationally efficient traversability prediction technologies. Thermal inertia and geometric features, such as angle of repose, will be used to estimate granular terrain properties. Then surface geometry and soil parameters will be used as inputs to a learning-based slip prediction algorithm. The algorithm will be trained on both in-situ and synthetic data to reduce overfitting and increase prediction accuracy. Synthetic data will be generated using state-of-the-art terramechanics simulators that produce accurate slip estimates given known terrain properties but are too computationally inefficient to be used for tactical rover planning. Evaluation will occur on data from the Mars rovers. Results will be compared to vision-only methods in order to understand in what situations the addition of thermal inertia can improve traversability prediction.

Thesis Committee:
William "Red" Whittaker (Chair)
David Wettergreen
Steven Nuske
Issa Nesnas (Jet Propulsion Laboratory)

The last decade has seen remarkable advances in 3D perception for robotics. Advances in range sensing and SLAM now allow robots to easily acquire detailed 3D maps of their environment in real-time.

However, adaptive robot behavior requires an understanding of the environment that goes beyond pure geometry. A step above purely geometric maps are so-called semantic maps, which incorporate task-oriented semantic labels in addition to 3D geometry. In other words, a map of *what* is *where*. This is a straightforward representation that allows robots to use semantic labels for navigation and exploration planning.

In this proposal we develop learning-based approaches for semantic mapping with image and range sensors. We make three main contributions.

In our first contribution, which is completed work, we developed VoxNet, a system for accurate and efficient semantic classification of 3D point cloud data. The key novelty in this system is the integration of volumetric occupancy maps with 3D Convolutional Neural Networks (CNNs). The system showed state-of-the-art performance in 3D object recognition and helicopter landing zone detection.
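To make the idea of a volumetric occupancy map concrete, the sketch below converts a small point cloud into a dense binary voxel grid, the kind of fixed-size volume a 3D CNN could take as input. It is a minimal illustration under stated assumptions: the function name, the 0.1 m voxel size, the grid size, and the binary occupancy model are all hypothetical choices, not the VoxNet implementation.

```python
import math

def occupancy_grid(points, grid_size=32, voxel=0.1):
    """Return a dense binary grid[x][y][z] marking voxels that contain a point.

    The grid origin is placed at the minimum corner of the point cloud;
    points falling outside the grid are ignored.
    """
    origin = [min(p[axis] for p in points) for axis in range(3)]
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    for p in points:
        i, j, k = (math.floor((p[axis] - origin[axis]) / voxel) for axis in range(3))
        if all(0 <= v < grid_size for v in (i, j, k)):
            grid[i][j][k] = 1
    return grid

# Three points; the first two fall in the same voxel.
cloud = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.35, 0.15, 0.25)]
grid = occupancy_grid(cloud, grid_size=4)
occupied = sum(v for plane in grid for row in plane for v in row)
print(occupied)
```

A dense grid trades memory for a regular layout, which is what lets convolution kernels slide over the volume in all three spatial dimensions.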

In our second contribution, motivated by the complementary information in image and point cloud data, we propose a CNN architecture fusing both modalities. The architecture consists of two interconnected streams: a volumetric CNN stream for the point cloud data, and a more traditional 2D CNN stream for the image data. We will evaluate this architecture for the tasks of terrain classification and obstacle detection in an autonomous All Terrain Vehicle (ATV).

In the final contribution, we propose a semantic mapping system for intelligent information gathering on Micro Aerial Vehicles (MAVs). In pursuit of a lightweight solution, we forgo active range sensing and use monocular imagery as our main data source. This leads to various challenges, as we now must infer *where* as well as *what*. We outline our plan to solve these challenges using monocular cues, inertial sensing, and other information available to the vehicle.

Thesis Committee:
Sebastian Scherer (Chair)
Martial Hebert
Abhinav Gupta
Raquel Urtasun (University of Toronto)

The students from the Double-Major in Robotics will demonstrate the robots they have built as part of the Capstone Course (16-474). Teams of students followed a Systems Engineering process to develop functional robots that meet specific performance requirements.

This year's projects include:

  • An autonomous airline trolley beverage cart
  • A mobile robot for autonomous detection and elimination of weeds
  • An autonomous robotic cart that follows a user while carrying the user's tools and other loads

Please stop by to see the demonstrations and talk to the teams!

Faculty Host: Dimi Apostolopoulos

Nine RI MRSD program student teams will use posters, videos, and hardware to show their project work on robots for package delivery, river taxi service, wellhead servicing, the Amazon Picking Challenge, undersea docking, 3D printing with COTS part inclusion, swarm-based facial recognition, autonomous parking, and autonomous landing on a moving ship deck.

Faculty Host: John Dolan

Please visit the posters and learn about the exciting projects students are working on as part of the graduate course on Computer Vision.

Faculty Host: Deva Ramanan

Regularities with varying form and scale pervade our natural and man-made world. From insects to mammals, the ability to sense regular patterns has a neurobiological basis and has been observed in many levels of intelligence and behavior. From Felix Klein’s Erlangen program and D’Arcy Thompson’s Growth-and-Form to the Gestalt principles of perception, much of our understanding of the world is based on the perception and recognition of repeated patterns, generalized by the mathematical concept of symmetry and symmetry groups. Given the ubiquity of symmetry in both the physical and the digital worlds, a computational model for symmetry-based regularity perception is especially pertinent to computer vision, computer graphics, and machine intelligence in general, where an intelligent being (e.g. a robot) seeks to perceive, reason about, and interact with the chaotic world in the most effective and efficient manner. Surprisingly, we have limited knowledge of how humans perceive regular patterns, and little progress has been made in computational models for noisy, albeit near-regular, patterns in real data. In this talk, I present parallels as well as differences between machine perception and human perception of visual regularity. I shall report our recent results on understanding human perception of wallpaper patterns using neuroimaging (EEG, fMRI), and our successful attempt at building a symmetry-based Turing test to tell humans and robots apart.

Yanxi Liu received her B.S. degree in physics/electrical engineering (Beijing, China), her Ph.D. degree in computer science for group theory applications in robotics (University of Massachusetts, Amherst, US), and her postdoctoral training in the robotics lab of LIFIA/IMAG (Grenoble, France). Before joining the Robotics Institute of Carnegie Mellon University in 1996, she spent one year at DIMACS (NSF center for Discrete Mathematics and Theoretical Computer Science) under an NSF research-education fellowship award.

Dr. Liu worked for ten years at CMU as a research faculty member and served in the CMU faculty senate for multiple years. From June 2013 to August 2014, Dr. Liu visited Microsoft Silicon Valley, Google Mountain View and Stanford University, resulting in one pending and one granted patent. Currently, Dr. Liu is a full professor with the School of Electrical Engineering and Computer Science at Penn State University, where she co-directs the lab for perception, action and cognition (LPAC) and the Human Motion Lab for Taiji Research. Dr. Liu's research interests span a wide range of applications including computer vision, computer graphics, robotics, human perception and computer aided diagnosis in medicine, with one central theme: computational regularity.

Currently, Dr. Liu serves as an associate editor for IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and the journal Computer Vision and Image Understanding (CVIU). She will be the program co-chair for the 2017 CVF/IEEE Computer Vision and Pattern Recognition (CVPR) Conference.

Faculty Host: Simon Lucey

Robotics and automation technologies hold immense promise in transforming people’s lives across various communities around the globe. However, there exists a huge disconnect between what is possible from an engineering and scientific viewpoint and what the expectations of the general public are. The problem lies in the fact that we have not seen many practical solutions that can be deployed in a truly useful and effective fashion to make a difference in people’s quality of life.

In this talk, I will describe my current work focusing on the applied use of robotics and automation technologies for the benefit of under-served and under-developed communities by working closely with them to sustain developed solutions. This is made possible by bringing together researchers, practitioners from industry, academia, local governments, and various entities such as the IEEE Robotics and Automation Society’s Special Interest Group on Humanitarian Technology (RAS-SIGHT), NGOs, and NPOs across the globe.

I will discuss a demining challenge that I have co-organized for the last two years with the intent of producing an open-source solution for detecting and classifying unexploded ordnance buried in minefields. I will also outline my recent efforts in the technology and public policy domains with emphasis on socio-economic, cultural, privacy, and security issues in developing and developed economies.

Raj Madhavan is the Founder & CEO of Humanitarian Robotics Technologies, LLC, Maryland, U.S.A. and a Distinguished Visiting Professor of Robotics with AMMACHI Labs at Amrita University, Kerala, India. He has held appointments with the Oak Ridge National Laboratory (March 2001-January 2010) as an R&D staff member based at the National Institute of Standards and Technology (March 2002-June 2013), and with the University of Maryland, College Park (February 2010-December 2015) as an assistant and associate research scientist and as a member of the Maryland Robotics Center. He received a Ph.D. in Field Robotics from the University of Sydney and an ME (Research) in Systems Engineering from the Australian National University.

Over the last 20 years, he has contributed to topics in field robotics, and systems and control theory. His current research interests lie in humanitarian robotics and automation – the application and tailoring of existing and emerging robotics and automation technologies for the benefit of humanity in a variety of domains, including unmanned (aerial, ground) vehicles in disaster scenarios. He has authored over 185 papers in archival journals, conferences, and magazines including three books and four journal special issues.

Dr. Madhavan is particularly interested in the development of technologies and systems that are cost effective, reliable, efficient and geared towards improving the quality of life of people in under-served and underdeveloped communities around the globe. Within the IEEE Robotics and Automation Society, he served as the Founding Chair of the Technical Committee on Performance Evaluation and Benchmarking of Robotics and Automation Systems, TC-PEBRAS (2009-2011), Founding Chair of the Humanitarian Robotics and Automation Technology Challenge, HRATC (2014, 2015), Vice President of the Industrial Activities Board (2012-2015), and Chair of the Standing Committee for Standards Activities (2010-2015), and since 2012 he has been the Founding Chair of the Special Interest Group on Humanitarian Technology (RAS-SIGHT). He is the 2016 recipient of the IEEE RAS Distinguished Service Award for his “distinguished service and contributions to RAS industrial and humanitarian activities”.

Faculty Host: M. Bernardine Dias

This research addresses the modeling of substantially 3D planetary terrain features, such as skylights, canyons, craters, rocks, and mesas, by a surface robot. The sun lights planetary features with transient, directional illumination. For concave features like skylight pits, craters, and canyons, this can lead to dark shadows. For all terrain features, the ability to detect interest points and to match them between images is complicated by changing illumination, so seeing planetary features in the best light requires a coordinated dance with the sun as it arcs across the sky.

The research develops a process for planned-view-trajectory model building that converts a coarse model of a terrain feature and knowledge about illumination change and mission parameters into a view trajectory planning problem, plans and executes a view trajectory, and builds a detailed model from the captured images. An understanding of how view and illumination angles affect model quality is reached through controlled-lighting laboratory experiments. The planning of view trajectories (what to image, from where, and at what time) is cast as an OPTWIND, a new vehicle routing problem formulated in the thesis work. Each part of the planned-view-trajectory model building process is examined in detail, with existing tools identified and tested to solve parts of the problem where appropriate, and new solutions implemented otherwise. The research also demonstrates planned-view-trajectory model building.

Contributions of the research include development, implementation, and demonstration of planned-view-trajectory model building, experimental determination of factors affecting model quality and formulation of view trajectory planning as a new vehicle routing problem. Datasets for planetary analog terrain and for imaging under directional lighting were also collected, totaling over 25,000 images.

Thesis Committee:
William L. "Red" Whittaker (Chair)
David Wettergreen
Michael Kaess
Jeremy Frank (NASA Ames)

Facial actions speak louder than words. Facial actions can reveal a person's emotion, intention, and physical state, and they make possible a range of applications that include market research, human-robot interaction, drowsiness detection, and clinical and developmental psychology research. In this proposal, we investigate both supervised and unsupervised approaches to facial action discovery.

Supervised approaches seek to train and validate classifiers for facial action detection. This task is challenging, in part, for two major reasons. First, classifiers must generalize to previously unknown subjects that may differ markedly in behavior, facial morphology, and the recording environment. To address this problem, we propose Selective Transfer Machine (STM), a transductive learning method that personalizes generic classifiers for facial expression analysis. By personalizing the classifier, STM is able to generalize better than state-of-the-art approaches to unseen subjects. In addition, the STM framework can incorporate partly labeled data from a test subject.

Second, supervised learning typically uses hand-crafted, a priori features, such as Gabor, HOG and SIFT, together with independent approaches to classifier training (e.g., SVM). Recent research suggests that an alternative approach to feature selection integrated with an alternative learning paradigm (Deep Learning) may provide superior performance and greatly reduce or eliminate the problem of domain transfer. With more than 0.5 million annotated frames now available in our GFT and BP4D+ datasets, this is an opportune time for this exploration.

This thesis will test the hypothesis that Deep Learning enables greater accuracy relative to baseline SVMs and domain transfer approaches, normalizing by the number of independent parameters.

A major limitation of supervised approaches, including Deep Learning, is the collection of annotations, which can be time-consuming, error-prone, and limited to phenomena that observers have already described in other contexts. We explore, for the first time, the use of unsupervised approaches for facial action discovery. In particular, we introduce the Common Event Discovery (CED) problem, which, in an unsupervised manner, discovers correlated facial actions from a set of videos. An exhaustive search for such facial actions has quartic complexity in the length of the videos and is thus impractical. This thesis proposes an efficient branch-and-bound (B&B) method that guarantees a globally optimal solution. We will evaluate CED in three human interaction tasks: video-recorded three-person social interactions and parent-infant interaction, and motion-captured body movement. We hypothesize that CED will show moderate convergence with supervised approaches and will identify novel patterns in intra- and interpersonal actions that are hidden from supervised approaches.
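The quartic cost of exhaustive search can be seen by counting candidates. The sketch below assumes, for illustration only, two videos of equal length n in which every pair of contiguous segments (one from each video) is a candidate match. The function name is hypothetical, and the proposal's own problem setup may differ in detail.

```python
def count_segment_pairs(n):
    """Number of (segment, segment) pairs an exhaustive search would score.

    A length-n video has n * (n + 1) / 2 contiguous segments, so pairing
    segments across two such videos grows on the order of n**4.
    """
    segments = n * (n + 1) // 2
    return segments * segments

for n in (10, 100, 1000):
    print(n, count_segment_pairs(n))
```

Each tenfold increase in video length multiplies the number of candidates by roughly ten thousand, which is why a bounding strategy that prunes most pairs without scoring them matters in practice.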

Thesis Committee:
Fernando De la Torre (Co-chair)
Jeffrey F. Cohn (Co-chair)
Simon Lucey
Deva Ramanan
Vladimir Pavlovic (Rutgers University)
