Open World Vision

June 20th, held in conjunction with CVPR 2021


Computer vision algorithms are often developed inside a closed-world paradigm, for example recognizing objects from a fixed set of categories. However, our visual world is naturally open, containing situations that are dynamic, vast, and unpredictable. Algorithms developed on closed-world datasets are often brittle once exposed to the realistic complexity of the open world, where they are unable to efficiently adapt and robustly generalize. We invite researchers in perception and learning to the Workshop on Open World Vision where we will investigate, along several directions, both the challenges and opportunities for open world vision systems.


This workshop at CVPR 2021 aims to bring together vision researchers and practitioners from both academia and industry who are interested in addressing open-world vision problems. Domains of interest include, but are not limited to, learning and deploying vision algorithms in the open world, security and safety of embodied vision, and realistic setups and datasets.

  • Real open-world data: long-tailed distributions, streaming data, data bias, anomalous inputs, multi-modality, etc.
  • Learning/problems: X-shot learning (e.g., zero-shot, few-shot), Y-supervised learning (e.g., self-, semi-, weakly-supervised), lifelong/continual learning, meta-learning, domain adaptation/generalization, open-set recognition, etc.
  • Social impact: safety, fairness, real-world applications, interdisciplinary research, etc.
  • Misc: datasets, benchmarks, uncertainty, interpretability, robustness, generalization, etc.


Consider the following motivating examples.

  • Open-world data follows a long-tailed distribution. When should we treat rare tail classes and dominant classes equally, and when differently? In autonomous driving, we may simply group "dogs" and "raccoons" into an "animal" super-class because animal types do not necessarily change motion plans for obstacle avoidance (e.g., "stop" or "slow down"). In contrast, an elderly-assistive robot may have to recognize rarely-seen kitchenware (e.g., water cups and coffee mugs).
  • The open world contains unknown examples. Because of the nature of long-tailed distributions, one typically constructs a closed-world training set by first defining an ontology (e.g., the vocabulary of semantic labels) with respect to dominant/common classes, and then ignoring the uncommon/rare "tail" classes, many of which may not even be seen during data collection. Examples from these tail classes are unknown open-set data to a model trained on that closed-world dataset. An autonomous vehicle may not recognize a rarely-seen stroller, and a recommendation system may be confronted with adversarially generated fake imagery. How should a model trained on a closed-world dataset respond to unknown open-set examples?
  • The open world requires our limited ontology of labels to evolve. We take the view that ontologies (e.g., the object vocabulary for driving safety or the vocabulary of clothing styles) evolve over time. For example, after deploying an autonomous driving system, we would like to differentiate "police-officers" from "pedestrians" (both of which are "persons"), and define a new class "vulnerable dynamic objects" to include "strollers" and "wheelchairs". In a product-oriented vision system, we would like to keep expanding the fashion vocabulary as clothing and styles change over time. Given these, how should we train our model with continually-updated ontologies? Should we practice continual learning and lifelong learning to address this open-world scenario?
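
To make the ontology questions above concrete, here is a minimal sketch, assuming a hypothetical fine-to-coarse label mapping. The class names are taken from the examples; the mapping table and the helper names `coarsen` and `evolve` are illustrative only and are not part of any workshop or challenge code.

```python
# Hypothetical fine-to-coarse ontology for a driving scenario (illustrative only).
DRIVING_ONTOLOGY = {
    "dog": "animal",
    "raccoon": "animal",
    "police-officer": "person",
    "pedestrian": "person",
}


def coarsen(labels, ontology, unknown="unknown"):
    """Map fine-grained labels to super-classes.

    Labels that the ontology does not cover are mapped to `unknown`,
    mimicking tail classes that were ignored when the ontology was defined.
    """
    return [ontology.get(label, unknown) for label in labels]


def evolve(ontology, updates):
    """Return a new ontology with added or re-assigned classes.

    The original ontology is left untouched, so old and new label
    vocabularies can be compared side by side.
    """
    new_ontology = dict(ontology)
    new_ontology.update(updates)
    return new_ontology


# After deployment: split "police-officer" out of "person" and add a new
# super-class for strollers and wheelchairs.
updated = evolve(DRIVING_ONTOLOGY, {
    "police-officer": "police-officer",
    "stroller": "vulnerable dynamic object",
    "wheelchair": "vulnerable dynamic object",
})

print(coarsen(["dog", "raccoon", "stroller"], DRIVING_ONTOLOGY))
print(coarsen(["stroller", "police-officer"], updated))
```

The sketch only relabels data; how a model should be retrained or adapted once the ontology changes is exactly the open question the examples raise.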


Thomas G. Dietterich
Oregon State University
Ali Farhadi
University of Washington
Kate Saenko
Boston University
Stella Yu
UC Berkeley


Please contact Shu Kong with any questions: shuk [at] andrew [dot] cmu [dot] edu

Carl Vondrick
Columbia University

Important Dates and Details

  • Sign up to receive updates using this form.
  • Challenge train-val sets release: March 1, 2021.
  • Challenge test set release: March 29, 2021.
  • Challenge submission deadline: April 16, 2021 at 11:59pm PST.
  • Workshop date: June 20, 2021

Oral presentations will be selected from among challenge participants, e.g., winners and those with innovative ideas.


We provide a teaser challenge and will announce challenge results at this workshop.

  • Open-set image classification requires a model to distinguish novel, anomalous, and semantically unknown (i.e., open-set) test-time examples from examples of the known classes.
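
As a rough illustration of the task, the sketch below rejects a test example as unknown when the classifier's maximum softmax probability falls below a threshold. This is one common baseline, shown under stated assumptions: the function names, the `-1` code for "unknown", and the default threshold of 0.5 are hypothetical choices, not the challenge's official method or evaluation protocol.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.5):
    """Return (label, confidence) for one example.

    The label is the index of the most probable known class, or -1 when the
    maximum softmax probability is below `threshold`, i.e. the example is
    treated as an open-set unknown.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        return -1, confidence
    return best, confidence


# A confidently classified example versus an ambiguous one.
print(predict_open_set([5.0, 0.0, 0.0]))
print(predict_open_set([0.1, 0.0, 0.05]))
```

In practice the threshold would be tuned on held-out data, and stronger open-set scores than raw softmax confidence may be preferable.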

Results will be submitted and evaluated through EvalAI.

Program Schedule

| Time (Pacific Time, UTC-7) | Session | Speaker | Title |
|---|---|---|---|
| 09:30 - 09:45 | Opening remarks | Shu Kong, CMU | Open World Vision |
| 09:45 - 10:15 | Invited talk #1 | Terry Boult, UCCS | TBA |
| 10:15 - 10:45 | Invited talk #2 | Thomas G. Dietterich, OSU | A Representation Analysis of Image Anomaly Detection |
| 10:45 - 11:15 | Invited talk #3 | Alexei (Alyosha) Efros, UCB | TBA |
| 11:15 - 11:45 | Invited talk #4 | Ali Farhadi, UW | TBA |
| 11:45 - 12:15 | Lunch break | | |
| 12:15 - 12:45 | Invited talk #5 | Derek Hoiem, UIUC | TBA |
| 12:45 - 13:15 | Invited talk #6 | Kate Saenko, BU | TBA |
| 13:15 - 13:45 | Invited talk #7 | Rahul Sukthankar, Google | TBA |
| 13:45 - 14:15 | Live panel discussion | XXX, XXX, XXX, XXX, XXX; moderated by organizers | |
| 14:15 - 14:45 | Afternoon break | | |
| 14:45 - 15:15 | Invited talk #8 | Yu-Xiong Wang, UIUC | TBA |
| 15:15 - 15:45 | Invited talk #9 | Stella Yu, UCB | TBA |
| 15:45 - 16:15 | Invited talk #10 | Challenge Participants, TBA | TBA |
| 16:15 - 16:45 | Invited talk #11 | Challenge Participants, TBA | TBA |
| 16:45 - 17:00 | Closing remarks | | |