Computer vision systems today fail frequently. They also fail abruptly without warning or explanation. Alleviating the former has been the primary focus of the community. In this work, we hope to draw the community’s attention to the latter, which is arguably equally problematic for real applications.
We show that a surprisingly straightforward and general approach can predict the likely accuracy (or failure) of a variety of computer vision systems – semantic segmentation, vanishing point and camera parameter estimation, and image memorability prediction – on individual input images. We also explore attribute prediction, where classifiers are typically meant to generalize to new unseen categories. Finally, we use the approach, ALERT, to improve the performance of a downstream application of attribute prediction: zero-shot learning. We show that ALERT can outperform several strong baselines for zero-shot learning on four datasets.
This is joint work with P. Zhang, J. Wang, A. Farhadi, and D. Parikh.