Rich scene understanding from 3-D point clouds is a challenging task that requires contextual reasoning, which is typically computationally expensive. The task is further complicated when we expect the scene analysis algorithm to also efficiently handle data that is continuously streamed from a sensor on a mobile robot. Hence, we are typically forced to make a choice between 1) using a precise representation of the scene at the cost of speed, or 2) making fast, though inaccurate, approximations at the cost of increased misclassifications. In this work, we demonstrate that we can achieve the best of both worlds by using an efficient and simple representation of the scene in conjunction with recent developments in structured prediction in order to obtain both efficient and state-of-the-art classifications. Furthermore, this efficient scene representation naturally handles streaming data and provides a 300% to 500% speedup over more precise representations.



Results from the Miksik et al. ICRA'13 paper

Per-frame classifications: [CamVid 05VD] [CamVid 16E5] [NYUScenes] [MPI-VehicleScenes]
Temporally smoothed classifications: [CamVid 05VD] [CamVid 16E5] [NYUScenes] [MPI-VehicleScenes]
File formats: The names and colors of each class index: [CamVid] [NYUScenes] [MPI-VehicleScenes]



Efficient 3-D Scene Analysis from Streaming Data
H. Hu, D. Munoz, J. A. Bagnell, M. Hebert
ICRA 2013
[pdf] [supplementary video] [project page] [bibtex]

Efficient Temporal Consistency for Streaming Video Scene Analysis
O. Miksik, D. Munoz, J. A. Bagnell, M. Hebert
ICRA 2013
[pdf] [supplementary video] [project page] [bibtex]

Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields
D. Munoz, N. Vandapel, M. Hebert
ICRA 2009
[pdf] [project page] [bibtex]