Open World Vision

June 20th, held in conjunction with CVPR 2021

The recorded video is available on YouTube! Thank you all!

The EvalAI challenge interface has been reopened for interested researchers to test their algorithms.


Computer vision algorithms are often developed inside a closed-world paradigm, for example recognizing objects from a fixed set of categories. However, our visual world is naturally open, containing situations that are dynamic, vast, and unpredictable. Algorithms developed on closed-world datasets are often brittle once exposed to the realistic complexity of the open world, where they are unable to efficiently adapt and robustly generalize. We invite researchers in perception and learning to the Workshop on Open World Vision where we will investigate, along several directions, both the challenges and opportunities for open world vision systems.


This workshop at CVPR 2021 aims to bring together vision researchers and practitioners from both academia and industry who are interested in addressing open-world vision problems. Topics include, but are not limited to, learning and deploying vision algorithms in the open world, security and safety of embodied vision, and realistic setups and datasets.

  • Real open world data: long-tail distribution, streaming data, data bias, anomalous inputs, multi-modality, etc.
  • Learning/problems: X-shot learning, Y-supervised learning, lifelong/continual learning, meta-learning, domain adaptation/generalization, open-set recognition, etc.
  • Social Impact: safety, fairness, real-world applications, inter-disciplinary research, etc.
  • Misc: datasets, benchmark, uncertainty, interpretability, robustness, generalization, etc.


Let's consider the following motivational examples.

  • Open-world data follows a long-tail distribution. When should we treat rare tail classes and dominant classes equally, and when differently? In autonomous driving, we simply group "dogs" and "raccoons" into an "animal" super-class because animal types do not necessarily change motion plans for obstacle avoidance (e.g., "stop" or "slow-down"). In contrast, an elderly-assistive robot may have to recognize rarely-seen kitchenware (e.g., water cups and coffee mugs).
  • The open world contains unknown examples. Because of the nature of long-tail distributions, one typically constructs a closed-world training set by first defining an ontology (e.g., the vocabulary of semantic labels) with respect to dominant/common classes, and then ignoring the "tail" uncommon/rare classes, many of which may not even be seen during data collection. Examples from these tail classes are unknown open-set data to a model trained on that closed-world dataset. An autonomous vehicle may not recognize a rarely-seen stroller, and a recommendation system may be confronted with fake imagery that is adversarially generated. How should a model trained on a closed-world dataset respond to open-set unknown examples?
  • The open world requires our limited ontology of labels to evolve. We take the view that ontologies (e.g., an object vocabulary for driving safety or a vocabulary of clothing styles) evolve over time. For example, after deploying an autonomous driving system, we would like to differentiate "police-officers" and "pedestrians" (both of which are "persons"), and define a new class "vulnerable dynamic objects" to include "strollers" and "wheelchairs". In a product-oriented vision system, we would like to keep expanding the fashion vocabulary as clothing and styles change over time. Given these, how should we train our model with continually-updated ontologies? Should we practice continual learning and lifelong learning to address this open-world scenario?


Terrance Boult
University of Colorado Colorado Springs
Thomas G. Dietterich
Oregon State University

Alexei (Alyosha) Efros
University of California, Berkeley
Ali Farhadi
University of Washington

Derek Hoiem
University of Illinois at Urbana-Champaign
Shu Kong
Carnegie Mellon University

Kate Saenko
Boston University and MIT-IBM Watson AI Lab
Abhinav Shrivastava
University of Maryland

Yu-Xiong Wang
University of Illinois at Urbana-Champaign

Stella Yu
University of California, Berkeley


Please contact Shu Kong with any questions: shuk [at] andrew [dot] cmu [dot] edu

Shu Kong
Carnegie Mellon University

Deva Ramanan
Carnegie Mellon University

Terrance Boult
University of Colorado Colorado Springs
Andrew Owens
University of Michigan

Carl Vondrick
Columbia University

Yu-Xiong Wang
University of Illinois at Urbana-Champaign
Abhinav Shrivastava
University of Maryland

Important Dates and Details

  • Sign up to receive updates using this form.
  • Challenge train-val set (43GB) release date: March 25, 2021.
    • Please refer to the user guide for a detailed description of the data.
    • To get the data, please accept the Terms of Access by filling in this Google form.
  • Challenge test set (3.8GB) and EvalAI release date: May 17, 2021. (Sorry for the delay!)
  • Challenge submission deadline: May 31, 2021 at 11:59pm PST.
  • Workshop date: June 20, 2021
  • YouTube livestream url:

Oral presentations will be selected from challenge participants, e.g., winners and those with innovative ideas.


We provide a teaser challenge and will announce the challenge results at this workshop. Please refer to the user guide for more details.

  • Open-set image classification requires a model to distinguish novel, anomalous, and semantically unknown (i.e., open-set) test-time examples from examples of the known classes.

Results will be submitted and evaluated through EvalAI.
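
To make the open-set classification task concrete, here is a minimal sketch of one common baseline: reject a test example as "unknown" when the classifier's maximum softmax probability falls below a threshold. This is an illustration only, not the challenge's official baseline or evaluation protocol (see the user guide for those). The function names, the threshold value of 0.5, and the use of -1 as the unknown label are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.5, unknown_label=-1):
    """Return (label, confidence) for one example.

    The label is the index of the most probable known class, or
    `unknown_label` when the top probability is below `threshold`.
    Both `threshold` and `unknown_label` are hypothetical choices;
    in practice the threshold would be tuned on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return unknown_label, probs[best]
    return best, probs[best]


# One strongly dominant score: accepted as known class 0.
confident = predict_open_set([6.0, 0.5, 0.2])

# Nearly uniform scores: top probability is low, so the example is rejected.
ambiguous = predict_open_set([1.0, 0.9, 1.1])
```

A fixed threshold is the simplest possible rejection rule; it trades off mistaking unknown examples for known classes against rejecting genuine known-class examples, so the value matters and should not be taken from this sketch.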

Program Schedule

Time (Pacific Time, UTC-7) | Session | Speaker | Title
09:30 - 09:45 | Opening remarks | Shu Kong, CMU | Open World Vision
09:45 - 10:15 | Invited talk #1 | Terry Boult, UCCS | Quo Vadis Open World Learning?
10:15 - 10:45 | Invited talk #2 | Thomas G. Dietterich, OSU | A Representation Analysis of Image Anomaly Detection
10:45 - 11:15 | Invited talk #3 | Alexei (Alyosha) Efros, UCB | Open World Must Self-Supervise
11:15 - 11:45 | Invited talk #4 | Ali Farhadi, UW | Overfitting to Conventions?
11:45 - 12:15 | Lunch break
12:15 - 12:45 | Invited talk #5 | Derek Hoiem, UIUC | Three Big Challenges for General Purpose Vision
12:45 - 13:15 | Invited talk #6 | Kate Saenko, Boston University and MIT-IBM Watson AI Lab | Open-World Recognition with Distributional Shift
13:15 - 13:45 | Invited talk #7 | Stella Yu, UCB | Bias, Variance, and Correlation For Open Long-Tailed Recognition
13:45 - 14:15 | Live panel discussion | Panelists: Stella Yu, Terry Boult, Tom Dietterich, Carl Vondrick, Abhinav Shrivastava, Yu-Xiong Wang, Shu Kong
14:15 - 14:45 | Afternoon break
14:45 - 15:15 | Invited talk #8 | Yu-Xiong Wang, UIUC | Learning to Learn More with Less
15:15 - 15:45 | Invited talk #9 | Abhinav Shrivastava, UMD | Reviving Object Discovery: Where from & Where to?
15:45 - 16:15 | Invited talk #10 | Challenge Participant: Yingwei Pan, JD AI Research
16:15 - 16:45 | Invited talk #11 | Challenge Participant: Doyup Lee, POSTECH & Kakao Brain | Team Contrastive
16:45 - 17:15 | Invited talk #12 | Challenge Participant: Decheng Gao, Hikvision Research Institute
17:15 - 17:30 | Closing remarks