Object segmentations and image rankings for 15 ImageNet synsets with strong spurious cues
19k Training Images ♦ 750 quintuply validated Test Images ♦ Extensive Evaluation Suite
Each image has an object segmentation, and all training images are ranked by the strength of the spurious cues they contain. This ranking allows the selection of balanced subsets (i.e., subsets in which spurious correlations are broken).
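As a rough illustration of how a ranking could be turned into a balanced subset, here is a minimal sketch. It assumes the rankings are available as (image id, class id, spurious-cue score) triples, where a higher score means a stronger cue. That record layout, the function name balanced_subset, and the reading of "balanced" as equal numbers of weak-cue and strong-cue images per class are all assumptions made for illustration, not the format shipped in the download.

```python
from collections import defaultdict

def balanced_subset(records, k):
    """Pick up to k weakest-cue and k strongest-cue images per class.

    records: iterable of (image_id, class_id, score) triples, where a
    higher score means a stronger spurious cue (hypothetical layout).
    """
    by_class = defaultdict(list)
    for image_id, class_id, score in records:
        by_class[class_id].append((score, image_id))

    subset = {}
    for class_id, items in by_class.items():
        items.sort()  # ascending by spurious-cue score
        n = min(k, len(items) // 2)  # never pick the same image twice
        weakest = [img for _, img in items[:n]]
        strongest = [img for _, img in items[len(items) - n:]]
        subset[class_id] = weakest + strongest
    return subset

# Toy records standing in for the real rankings.
toy = [("a", 0, 0.9), ("b", 0, 0.1), ("c", 0, 0.5), ("d", 0, 0.7),
       ("e", 1, 0.2), ("f", 1, 0.8)]
picked = balanced_subset(toy, k=1)
```

Capping n at half the class size keeps the two halves disjoint, so small classes contribute fewer images rather than duplicates.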
With our richly annotated dataset and benchmark, we hope the community can begin to consider new training and evaluation paradigms for faithful image classification under suboptimal data conditions; that is, predicting for the right reasons, even when our data is riddled with spurious cues.
Our benchmark includes ablation, noise-based, and saliency analyses to assess whether models base their predictions on the object region or rely instead on spurious features. Compared to more typical data (as represented by RIVAL10), Hard ImageNet classification induces far greater reliance on spurious features.
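The idea behind an ablation analysis can be sketched with the object segmentations: overwrite either the object or the background and compare accuracy before and after. The snippet below is a simplified illustration, not the benchmark's actual protocol. The helper names ablate and accuracy_drop, the constant fill value, and the assumption that a mask is a boolean array that is True on object pixels are all hypothetical.

```python
import numpy as np

def ablate(image, mask, keep_object, fill=0.5):
    """Return a copy of image with one region overwritten by a constant.

    image: H x W x C float array.
    mask:  H x W boolean array, True on object pixels (assumed layout).
    keep_object: if True, the background is overwritten; otherwise the object is.
    """
    mask = np.asarray(mask, dtype=bool)
    region = ~mask if keep_object else mask  # pixels to overwrite
    out = np.array(image, dtype=float, copy=True)
    out[region] = fill  # applies to every channel of the selected pixels
    return out

def accuracy_drop(acc_clean, acc_ablated):
    """Absolute accuracy lost when a region is ablated."""
    return acc_clean - acc_ablated

# Toy 2x2 RGB image whose top-left pixel is the object.
image = np.ones((2, 2, 3))
mask = [[True, False], [False, False]]
no_object = ablate(image, mask, keep_object=False, fill=0.0)
no_background = ablate(image, mask, keep_object=True, fill=0.0)
```

Under this sketch, a model that loses little accuracy when the object is removed but much more when the background is removed would be leaning on spurious features.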
curl -L 'https://app.box.com/index.php?rm=box_download_shared_file&shared_name=ca7qlcfsqlfqul9rzgtuqhb2c6pm62qd&file_id=f_972129165893' -o hardImageNet.zip
unzip hardImageNet.zip
Please cite our paper if Hard ImageNet is of use to you.
@misc{moayeri2022hard,
title = {Hard ImageNet: Segmentations for Objects with Strong Spurious Cues},
author = {Moayeri, Mazda and Singla, Sahil and Feizi, Soheil},
booktitle = {openreview},
month = {June},
year = {2022},
}
Please feel free to contact us with any questions or comments about our paper or the dataset. Our email addresses are listed in the paper PDF.