
Mobile robots are increasingly being deployed in the real world in response to a heightened demand for applications such as transportation, delivery, and inspection. The motion planning systems for these robots are expected to perform consistently across the wide range of scenarios that they encounter. While state-of-the-art planners can be adapted to solve these real-time kinodynamic planning problems, their performance varies vastly across diverse scenarios. This thesis proposes that the motion planner for a mobile robot must adapt its search strategy to the distribution over planning problems that the robot encounters.

We address three principal challenges of this problem. Firstly, we show that even when the planning problem distribution is fixed, designing a non-adaptive planner can be challenging due to the unpredictability of its performance. We discuss how to alleviate this issue by leveraging a diverse ensemble of planners. Secondly, when the distribution is varying, we require a meta-planner that can use context to automatically select an ensemble from a library of black-box planners. We show both theoretically and empirically that greedily training a list of predictors to focus on failure cases leads to an effective meta-planner. Finally, in the interest of computational efficiency, we want a white-box planner that adapts its search strategy during a planning cycle. We show how such a strategy can be trained efficiently in a data-driven imitation learning framework.
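As a rough illustration of the greedy list-training idea above, the Python sketch below builds an ordered list of predictors in which each new predictor is fit only on the planning problems that the list so far still fails on. The helpers train_predictor and solves are hypothetical stand-ins for fitting a context-to-planner predictor and checking success within a time budget; they are not interfaces from the thesis.

    def train_greedy_list(problems, train_predictor, solves, list_size=3):
        """Greedily build an ordered list of predictors that covers failure cases."""
        predictor_list = []
        remaining = list(problems)  # problems the current list still fails on
        for _ in range(list_size):
            if not remaining:
                break
            predictor = train_predictor(remaining)   # focus training on current failures
            predictor_list.append(predictor)
            # Keep only the problems that the list, tried in order, still fails to solve.
            remaining = [p for p in remaining
                         if not any(solves(q, p) for q in predictor_list)]
        return predictor_list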

Based on our preliminary investigations, we propose to examine three sub-problems that will lead to an effective adaptive motion planning framework. The first is to learn heuristic and collision-checking policies that optimize search effort by adapting to the distribution of obstacles in the environment. The second is to train context-efficient meta-planners that use planner performance as additional feedback. The third is to automatically handle failure cases that occur during online execution.

We evaluate the efficacy of our framework on a spectrum of motion planning problems with a primary focus on an autonomous full-scale helicopter. We expect that our framework will enable mobile robots to navigate seamlessly across different missions without the need for human intervention.

Thesis Committee:
Sebastian Scherer (Chair)
Siddhartha Srinivasa
Martial Hebert
Ashish Kapoor (Microsoft Research)

Copy of Proposal Document

Mine rescue robots rely heavily on visual cameras and LIDAR, which succeed in clear conditions. However, the worst mine disasters are caused by roof falls, explosions, and fires, which generate thick dust and smoke that obscure traditional sensing and thwart robot perception.

This talk presents the failures of traditional sensing techniques in smoky conditions. Prior classical work investigated sonar, radar, and LIDAR. Sonar is unaffected by smoke but is only useful for safeguarding and hyperlocal navigation. Radar is coarse and lacks the resolution required for robot navigation. LIDAR and cameras are the sensors of choice, and prior perception methods include LIDAR-camera fusion for mapping, SLAM for exploratory modeling, and loop closure for navigating mine corridor networks. However, cameras and LIDAR—and hence their associated navigation methodologies—fail in heavy smoke.

This presentation introduces the merits of thermal and Episcan3D sensing in these environments. The Episcan3D is a new class of sensor that improves viewing and range sensing through light smoke. Rather than broad illumination, which creates whiteout in a smoky scene, the Episcan3D illuminates with only a single ray at a time, which reduces scatter. Yet, even the Episcan3D is obscured in heavy smoke. Thermal imaging is not obscured by smoke but succeeds best outdoors where large thermal gradients exist. Underground mines are isolated from large fluctuations in temperature, so thermal features are often too indistinct and sparse for traditional SLAM. This research specialized direct SLAM methods for operation on thermal imagery and evaluated their suitability for robot navigation in underground mines.

During this research, a multi-modal dataset was collected for future work toward robotic underground mine rescue. A rover carrying visual, thermal, inertial, and LIDAR sensors was deployed and driven through a smoke-filled mine. Applications of this dataset for future research include thermal SLAM, subterranean navigation, multi-modal mapping, sensor fusion, and victim identification.

Joe Bartels is a Ph.D. student in the Robotics Institute. His current research interests include sensing and perception for visually degraded environments. Prior to attending Carnegie Mellon University, Joe completed a B.S. and M.S. in Mechanical Engineering at the University of Nebraska-Lincoln.

Presented as part of the Robotics Institute Speaking Qualifier.

An iterative, model-based method is proposed for improving linguistic structure, segmentation, and prosodic annotations that correspond to the delivery of each utterance, as regularized across the data. For each iteration, the training utterances are resynthesized according to the existing symbolic annotation. Values of various features and subgraph structures are "twiddled": each is perturbed based on the features and constraints of the model. Twiddled utterances are evaluated using an objective function appropriate to the type of perturbation and compared with the unmodified, resynthesized utterance. The instance with the least error is assigned as the current annotation, and the entire process is repeated. At each iteration, the model is re-estimated, and the distributions and annotations regularize across the corpus. As a result, the annotations have more accurate and effective distributions, which leads to improved control and expressiveness given the features of the model.
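A condensed Python sketch of this loop follows, assuming hypothetical utterance objects with an annotation attribute, a model with a reestimate method, and helpers for resynthesis, perturbation ("twiddling"), and the perturbation-specific objective; the actual system's interfaces may differ.

    def refine_annotations(utterances, model, resynthesize, twiddle, objective, n_iters=10):
        """Iteratively perturb annotations, keep the least-error variants, re-estimate the model."""
        for _ in range(n_iters):
            for utt in utterances:
                baseline = resynthesize(utt.annotation, model)
                best, best_err = utt.annotation, objective(baseline, utt)
                # Perturb feature values and subgraph structures permitted by the model.
                for candidate in twiddle(utt.annotation, model):
                    err = objective(resynthesize(candidate, model), utt)
                    if err < best_err:
                        best, best_err = candidate, err
                utt.annotation = best                 # assign the least-error instance
            model = model.reestimate(utterances)      # distributions regularize across the corpus
        return model, utterances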

Thesis Committee:
Alan W. Black (Chair)
Jack Mostow
Alex Rudnicky
Julia Hirschberg (Columbia University)

Copy of Draft Thesis Document

Air quality has long been a major health concern for citizens around the world, and increased exposure to fine particulate matter (PM2.5) has been definitively linked to serious health effects such as cardiovascular disease, respiratory illness, and increased mortality. PM2.5 is one of the six criteria pollutants the EPA uses to assess attainment, and it is similarly regulated by many other governments worldwide. Unfortunately, the high cost and complexity of most current PM2.5 monitors result in a lack of detailed spatial and temporal resolution, which means that concerned individuals have little insight into their personal exposure levels. This is especially true for hyper-local variations and short-term pollution events associated with industrial activity, heavy fossil fuel use, or indoor activities such as cooking.

Advances in sensor miniaturization, decreased fabrication costs, and rapidly expanding data connectivity have encouraged the development of small, inexpensive devices capable of estimating PM2.5 concentrations. This new class of sensors opens up new possibilities for personal exposure monitoring. It also creates new challenges related to calibrating and characterizing inexpensively manufactured sensors to provide the level of precision and accuracy needed to yield actionable information without significantly increasing device cost.

This thesis addresses the following two primary questions:

  1. Can an inexpensive air quality monitor based on mass-manufactured dust sensors be calibrated efficiently in order to achieve inter-device agreement as well as agreement with professional and federally endorsed particle monitors?
  2. Can an inexpensive air quality monitor increase the confidence and capacity of individuals to understand and control their indoor air quality?

In the following thesis, we describe the development of the Speck fine particulate monitor. The Speck processes data from a low-cost dust sensor using a Kalman filter with a piecewise sensing model. We have optimized the algorithm's parameters through short-term co-location tests with professional HHPC-6 particle counters, and verified typical correlations between the Speck and HHPC-6 units of r² > 0.90. To account for variations in sensitivity, we have developed a calibration procedure whereby fine particles are aerosolized within an open room or a closed calibration chamber. This allows us to produce Specks for commercial distribution as well as for the experiments presented herein.
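As a purely illustrative sketch of the filtering step, and not the Speck's actual calibration, the snippet below runs a one-dimensional Kalman filter whose observation model is piecewise linear in the raw dust-sensor output; the breakpoint, gains, and noise values are made-up placeholders.

    def piecewise_obs(raw):
        """Hypothetical piecewise-linear sensing model: gain and offset depend on the raw reading."""
        return (0.8, 0.0) if raw < 500 else (1.2, -200.0)

    def kalman_update(x, P, raw, q=1.0, r=25.0):
        # Predict: particle concentration modeled as a slowly drifting constant (random walk).
        x_pred, P_pred = x, P + q
        # Update with the observation model z = gain * x + offset + noise.
        gain, offset = piecewise_obs(raw)
        innovation = raw - (gain * x_pred + offset)
        S = gain * P_pred * gain + r          # innovation variance
        K = P_pred * gain / S                 # Kalman gain
        return x_pred + K * innovation, (1.0 - K * gain) * P_pred

    x, P = 0.0, 1e3                           # diffuse initial estimate
    for raw in [120.0, 130.0, 640.0, 655.0]:  # example raw sensor readings
        x, P = kalman_update(x, P, raw)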

Drawing from previous pilot studies, we have distributed low-cost monitors through local library systems and community groups. Pre-deployment and post-deployment surveys characterize user perception of personal exposure and the effect of a low-cost fine particulate monitor on empowerment.

Thesis Committee:
Illah Nourbakhsh (Chair)
Aaron Steinfeld
Albert Presto
James Longhurst (University of the West of England, Bristol)

Copy of Draft Thesis Document

Visual Recognition has seen tremendous advances in the last decade. This progress is primarily due to learning algorithms trained with two key ingredients: large amounts of data and extensive supervision. While acquiring visual data is cheap, getting it labeled is far more expensive. So how do we enable learning algorithms to harness the sea of visual data available freely, without worrying about costly supervision?

Interestingly, our visual world is extraordinarily varied and complex, but despite its richness, the space of visual data may not be that astronomically large. We live in a well-structured, predictable world, where cars almost always drive on roads, the sky is always above the ground, and so on; these regularities can provide the missing ingredients required to scale up our visual learning algorithms. This thesis aims to develop algorithms that: 1) discover this implicit and explicit structure in visual data, and 2) leverage the regularities to provide necessary constraints that facilitate large-scale visual learning. In particular, we propose a two-pronged strategy to enable large-scale recognition.

In Part I, we present algorithms for training better and more reliable supervised recognition models that exploit structure in various flavors of labeled data and target tasks. In Part II, we leverage these visual models and large amounts of unlabeled data to discover constraints, and use these constraints in a semi-supervised learning framework to improve visual recognition.

Thesis Committee:
Abhinav Gupta (Chair)
Alexei A. Efros (University of California, Berkeley)
Martial Hebert
Deva Ramanan
Jitendra Malik (University of California, Berkeley)

Copy of Proposal Document

Real-world sensor data in robotics and related domains is sequential, high-dimensional, noisy, and collected in a raw and unstructured form. In order to interpret, track, predict, or plan with such data, we often assume that it is generated by some underlying dynamical system model. Although we can sometimes use extensive domain knowledge to write down a dynamical system, specifying a model by hand can be a time-consuming process. This motivates an alternative approach: learning the model directly from sensor data. This is a formidable problem. Revealing the dynamical system that governs a complex time series is often not just difficult, but provably intractable. Popular maximum likelihood strategies for learning dynamical system models are slow in practice and often get stuck at poor local optima, problems that greatly limit the utility of these techniques when learning from real-world data. Although these drawbacks were long thought to be unavoidable, recent work has shown that progress can be made by shifting the focus of learning to realistic instances that rule out the intractable cases.

In this talk, I will present an overview of my work on modeling a range of robotics problems as dynamical systems. I will then focus on several related computational approaches for learning dynamical system models directly from high-dimensional sensor data. The key insight is that low-order moments of observed data often possess structure that can be revealed by powerful spectral decomposition methods, and, from this structure, model parameters can be recovered. Based on this insight, we design highly effective algorithms for learning parametric models like Kalman Filters and Hidden Markov Models, as well as nonparametric models via reproducing kernels, and new models based on predictive state representations. Unlike maximum likelihood-based approaches, these new learning algorithms are statistically consistent, computationally efficient, and easy to implement using established linear-algebra techniques. The result is a set of tools for learning dynamical system models with state-of-the-art performance on video, robotics, and biological modeling problems.
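As a simplified sketch of the spectral idea, the Python snippet below forms a low-order moment (the covariance between consecutive observations), takes its SVD, and projects the data onto the leading singular subspace as latent-state estimates. This is only a subspace-identification-flavored toy under those assumptions, not the specific algorithms described in the talk.

    import numpy as np

    def spectral_state_estimates(Y, k):
        """Y: (T, d) observation matrix; k: latent state dimension."""
        past, future = Y[:-1], Y[1:]
        # Empirical second moment between future and past observations.
        Sigma = (future - future.mean(0)).T @ (past - past.mean(0)) / (len(Y) - 1)
        U, s, Vt = np.linalg.svd(Sigma)
        U_k = U[:, :k]                 # leading singular directions
        return Y @ U_k                 # project observations onto the recovered subspace

    states = spectral_state_estimates(np.random.randn(200, 5), k=2)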

Byron Boots is an Assistant Professor in the School of Interactive Computing and the College of Computing at Georgia Tech. He directs the Georgia Tech Robot Learning Lab, which is affiliated with the Center for Machine Learning, the Institute for Data Engineering and Science, and the Institute for Robotics and Intelligent Machines. His research focuses on developing theory and systems that integrate perception, learning, and decision making. His work on learning models of dynamical systems received the Best Paper award at the International Conference on Machine Learning (ICML) in 2010. His research is supported by an NSF CISE Research Initiation Initiative (CRII) award, an NSF National Robotics Initiative (NRI) award, and BMW Manufacturing. Prior to joining Georgia Tech, Dr. Boots was a postdoctoral researcher working with Dieter Fox in the Robotics and State Estimation Lab at the University of Washington. He received his Ph.D. in Machine Learning from Carnegie Mellon University, advised by Geoff Gordon.

Faculty Host: Sidd Srinivasa

Citizen science forges collaborative partnerships between experts and citizens, and over the past decade it has become an increasingly common form of public participation in scientific research. While public participation has been applied to science education, researchers have recently noted that this strategy can also contribute to participatory democracy, which empowers citizens to advocate for their local problems. Such a strategy helps citizens form a community, increase environmental monitoring, gather scientific evidence, and tell convincing stories. Researchers believe that this community-based citizen science strategy can contribute to the wellbeing of communities by giving them the power to influence the general public and decision makers.

Community-based citizen science requires collecting, curating, visualizing, analyzing, and interpreting multiple types of data over a large spacetime scale. This depends heavily on community engagement (i.e., the involvement of citizens in local neighborhoods). Such large-scale tasks require innovative computational tools that provide technological affordances to communities. However, existing tools often focus on only one type of data, and thus researchers need to develop new tools from scratch. Moreover, there is a lack of design patterns for researchers to reference when developing such tools. Furthermore, existing tools are typically treated as products rather than ongoing infrastructures that sustain community engagement. This research studies the methodology of developing computational tools that use visualization and crowdsourcing to support the entire community engagement life cycle, from initiation and maintenance to tracking and evaluation.

This research will make methodological and empirical contributions to community-based citizen science and human-computer interaction. Methodological contributions include detailed case studies, with applied methodologies, of information technology systems deployed in real-world contexts. Empirical contributions include generalizable insights for developing visualization and crowdsourcing techniques that integrate multiple types of scientific data.

In this proposal, I first define community-based citizen science and explain the corresponding design challenges. Then, I review existing computational tools related to this research. Next, I present two completed works: Time Engine and AirWatch Pittsburgh. Time Engine is a web-based timelapse editor for creating guided video tours and interactive slideshows from large-scale imagery datasets. AirWatch Pittsburgh is an air quality monitoring system that integrates heterogeneous data and computer vision to support the formation of scientific knowledge. In addition, I propose two works: Environmental Health Engine and Smell Pittsburgh. Environmental Health Engine is a visualization and exploratory analysis platform for creating narratives from environmental sensing and health data. Smell Pittsburgh is a mobile crowdsourcing application for reporting and visualizing pollution odors. I also propose conducting case studies to derive a typology of how such tools support community engagement. Finally, I propose organizing insights from all four works and the case studies into design patterns, which can serve as rubrics for future researchers.

Thesis Committee:
Illah Nourbakhsh (Chair)
Aaron Steinfeld
Jeffrey Bigham
Eric Paulos (University of California, Berkeley)

Copy of Proposal Document

Mobile robots equipped with various sensors can gather data at unprecedented spatial and temporal scales and resolution. Over the years, our group has developed autonomous surface, ground and aerial vehicles for finding and localizing radio tagged animals. Recently, we've also developed robotic systems for yield estimation and farm monitoring. I will give an overview of these projects as well as some of the fundamental algorithmic problems we studied along the way.

Volkan Isler is an Associate Professor in the Computer Science Department at the University of Minnesota. He was a 2009-2012 resident fellow at the Institute on the Environment and a 2010-2012 McKnight Land-Grant Professor. Previously, he was an Assistant Professor at Rensselaer Polytechnic Institute and a post-doctoral researcher at CITRIS at UC Berkeley. He obtained his MSE (2000) and PhD (2004) degrees in Computer and Information Science from the University of Pennsylvania. While at Penn, he was a member of the GRASP Lab and the Theory Group. He obtained his BS degree (1999) in Computer Engineering from Bogazici University, Istanbul, Turkey. In 2008, he received the National Science Foundation's CAREER award. From 2009 to 2015, he chaired the IEEE Robotics and Automation Society's Technical Committee on Networked Robots. He also served as an Associate Editor for IEEE Transactions on Robotics and IEEE Transactions on Automation Science and Engineering. His research interests are primarily in robotics, computer vision, sensor networks, and geometric algorithms, and their applications in agriculture and environmental monitoring.

Faculty Host: Matt Mason

The rapid progress of AI in the last few years is largely the result of advances in deep learning and neural nets, combined with the availability of large datasets and fast GPUs. We now have systems that can recognize images with an accuracy that rivals that of humans. This will lead to revolutions in several domains such as autonomous transportation and medical image understanding. But all of these systems currently use supervised learning, in which the machine is trained with inputs labeled by humans. The challenge of the next several years is to let machines learn from raw, unlabeled data, such as video or text. This is known as unsupervised learning. AI systems today do not possess "common sense", which humans and animals acquire by observing the world, acting in it, and understanding its physical constraints. Some of us see unsupervised learning as the key towards machines with common sense. Approaches to unsupervised learning will be reviewed. This presentation assumes some familiarity with the basic concepts of deep learning.

Yann LeCun is Director of AI Research at Facebook, and Silver Professor of Data Science, Computer Science, Neural Science, and Electrical Engineering at New York University, affiliated with the NYU Center for Data Science, the Courant Institute of Mathematical Sciences, the Center for Neural Science, and the Electrical and Computer Engineering Department. He received the Electrical Engineer Diploma from Ecole Superieure d'Ingenieurs en Electrotechnique et Electronique (ESIEE), Paris, in 1983, and a PhD in Computer Science from Universite Pierre et Marie Curie (Paris) in 1987. After a postdoc at the University of Toronto, he joined AT&T Bell Laboratories in Holmdel, NJ, in 1988. He became head of the Image Processing Research Department at AT&T Labs-Research in 1996, and joined NYU as a professor in 2003, after a brief period as a Fellow of the NEC Research Institute in Princeton.

From 2012 to 2014 he directed NYU's initiative in data science and became the founding director of the NYU Center for Data Science. He was named Director of AI Research at Facebook in late 2013 and retains a part-time position on the NYU faculty. His current interests include AI, machine learning, computer perception, mobile robotics, and computational neuroscience. He has published over 180 technical papers and book chapters on these topics as well as on neural networks, handwriting recognition, image processing and compression, and on dedicated circuits and architectures for computer perception. The character recognition technology he developed at Bell Labs is used by several banks around the world to read checks, and was reading between 10 and 20% of all the checks in the US in the early 2000s. His image compression technology, called DjVu, is used by hundreds of web sites and publishers and millions of users to access scanned documents on the Web. Since the late 1980s he has been working on deep learning methods, particularly the convolutional network model, which is the basis of many products and services deployed by companies such as Facebook, Google, Microsoft, Baidu, IBM, NEC, AT&T and others for image and video understanding, document recognition, human-computer interaction, and speech recognition.

LeCun has been on the editorial board of IJCV, IEEE PAMI, and IEEE Trans. Neural Networks, was program chair of CVPR'06, and was chair of ICLR 2013 and 2014. He is on the science advisory board of the Institute for Pure and Applied Mathematics, and has advised many large and small companies about machine learning technology, including several startups he co-founded. He is the lead faculty at NYU for the Moore-Sloan Data Science Environment, a $36M initiative in collaboration with UC Berkeley and the University of Washington to develop data-driven methods in the sciences. He is the recipient of the 2014 IEEE Neural Network Pioneer Award.

Faculty Host: Martial Hebert

The capacity of aerial robots to autonomously explore, inspect, and map their environment is key to many applications. This talk will overview and discuss a set of new sampling-based strategies, most of them open sourced and experimentally verified, that break new ground on how a robot can efficiently inspect a structure for which a prior model exists, how it can explore unknown environments, and how it can actively combine the planning and perception loops to achieve autonomous exploration while maintaining 3D mapping fidelity. In particular, we will detail recent developments in the field of active perception and belief-space planning for autonomous exploration. Finally, an overview of further research activities on aerial robotics, including solar-powered unmanned aerial vehicles and aerial manipulators, will be provided.

Kostas Alexis obtained his Ph.D. in the field of aerial robotics control and collaboration from the University of Patras, Greece, in 2011. His Ph.D. research was supported by the Greek national-European Commission Excellence scholarship. After successfully defending his Ph.D. thesis, he was awarded a Swiss Government fellowship and moved to Switzerland and ETH Zurich. From 2011 to June 2015 he held the position of senior researcher at the Autonomous Systems Lab, ETH Zurich, leading the lab's efforts in the fields of control and path planning for advanced navigational and operational autonomy. His research interests lie in the fields of control, navigation, optimization, and path planning, focusing on aerial robotic systems with multiple and hybrid configurations.

He is the author or co-author of more than 50 scientific publications and has received several best paper awards and distinctions, including the IET Control Theory & Applications Premium Award 2014. Furthermore, together with his collaborators, he has achieved world records in the field of solar-powered flight endurance. Kostas Alexis has participated in and organized several large-scale, multi-million-dollar research projects with broad international involvement and collaboration. In July 2015, Kostas moved to the University of Nevada, Reno, with the goal of dedicating his efforts towards establishing true autonomy for aerial and other kinds of robots.

Faculty Host: Sebastian Scherer

