Open World Vision

June 20th, held in conjunction with CVPR 2021

The recorded video is available on YouTube. Thank you all!

https://youtu.be/TknWpbXEKeA

The EvalAI challenge interface has been reopened for interested researchers to test their algorithms.

Overview

Computer vision algorithms are often developed inside a closed-world paradigm, for example recognizing objects from a fixed set of categories. However, our visual world is naturally open, containing situations that are dynamic, vast, and unpredictable. Algorithms developed on closed-world datasets are often brittle once exposed to the realistic complexity of the open world, where they are unable to efficiently adapt and robustly generalize. We invite researchers in perception and learning to the Workshop on Open World Vision where we will investigate, along several directions, both the challenges and opportunities for open world vision systems.

Topics

This workshop at CVPR 2021 aims to bring together vision researchers and practitioners from both academia and industry who are interested in addressing open-world vision problems. Topics include, but are not limited to, learning and deploying vision algorithms in the open world, security and safety of embodied vision, and realistic setups and datasets.

Real open world data: long-tail distribution, streaming data, data bias, anomalous inputs, multi-modality, etc.
Learning/problems: X-shot learning, Y-supervised learning, lifelong/continual learning, meta-learning, domain adaptation/generalization, open-set recognition, etc.
Social Impact: safety, fairness, real-world applications, inter-disciplinary research, etc.
Misc: datasets, benchmarks, uncertainty, interpretability, robustness, generalization, etc.

Examples

Let's consider the following motivational examples.

Open-world data follows a long-tail distribution. When should we treat tail classes and dominant classes equally, and when differently? In autonomous driving, we can simply group "dogs" and "raccoons" into an "animal" super-class, because the animal type does not necessarily change the motion plan for obstacle avoidance (e.g., "stop" or "slow-down"). In contrast, an elderly-assistive robot may have to tell apart rarely-seen kitchenware (e.g., water cups and coffee mugs).
The open world contains unknown examples. Because of the nature of long-tail distributions, one typically constructs a closed-world training set by first defining an ontology (e.g., the vocabulary of semantic labels) w.r.t. dominant/common classes, and then ignoring the uncommon/rare "tail" classes, many of which may not even be seen during data collection. Examples from these tail classes are unknown, open-set data to a model trained on that closed-world dataset. For instance, an autonomous vehicle may not recognize a rarely-seen stroller, and a recommendation system may be confronted with adversarially generated fake imagery. How should a model trained on a closed-world dataset respond to open-set unknown examples?
The open world requires our limited ontology of labels to evolve. We take the view that ontologies (e.g., the object vocabulary for driving safety or the vocabulary of clothing styles) evolve over time. For example, after deploying an autonomous driving system, we would like to differentiate "police-officers" from "pedestrians" (both of which are "persons"), and define a new class "vulnerable dynamic objects" to include "strollers" and "wheelchairs". In a product-oriented vision system, we would like to keep expanding the fashion vocabulary as clothing and styles change over time. Given this, how should we train our model with continually-updated ontologies? Should we practice continual learning and lifelong learning to address this open-world scenario?
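
To make the second example above more concrete, here is a minimal sketch of one simple baseline for handling unknown examples: reject a prediction as "unknown" when the classifier's highest softmax probability falls below a threshold. This is only an illustration and not a method proposed by the workshop; the class names in KNOWN_CLASSES, the function predict_open_set, and the threshold value 0.7 are all assumptions made for the example.

```python
import math

# Hypothetical closed-world vocabulary; the names are illustrative only.
KNOWN_CLASSES = ["car", "pedestrian", "animal"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.7):
    """Return a known class label, or "unknown" when confidence is low.

    `threshold` is an assumed value; in practice it would be tuned on
    held-out data that contains both known and unknown examples.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown"
    return KNOWN_CLASSES[best]


if __name__ == "__main__":
    # A confident prediction keeps its closed-world label.
    print(predict_open_set([5.0, 0.0, 0.0]))  # car
    # A flat distribution (e.g., for a stroller never seen in training)
    # is rejected instead of being forced into a known class.
    print(predict_open_set([1.0, 1.0, 1.0]))  # unknown
```

A fixed threshold on softmax confidence is easy to add to an existing classifier, but it is known to be overconfident on some out-of-distribution inputs, which is part of why the question posed above remains open.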