RSPC_RVT

Training and Evaluation of RSPC-RVT on ImageNet

Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions,
Yong Guo, David Stutz, and Bernt Schiele. CVPR 2023.

Dependencies

Our code is built on PyTorch and the timm library. See requirements.txt for the detailed dependencies.

Dataset Preparation

Please download the clean ImageNet and ImageNet-C datasets and structure them as follows:

/PATH/TO/IMAGENET-C/
  clean/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
  corruption1/
    severity1/
      class1/
        img3.jpeg
      class2/
        img4.jpeg
    severity2/
      class1/
        img3.jpeg
      class2/
        img4.jpeg
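
A layout mistake in the tree above (a missing severity folder, or a class folder that did not get copied) is easy to miss before a long evaluation run. The following is a minimal sketch of a sanity check, not part of the repository: the function names `summarize_imagenet_c` and `check_layout` are hypothetical, and it assumes only the directory structure shown above, with `clean/` holding class folders directly and every other top-level folder being a corruption with severity subfolders.

```python
from pathlib import Path


def summarize_imagenet_c(root):
    """Return {corruption: {severity: number_of_class_dirs}} for the layout above."""
    root = Path(root)
    summary = {}
    for corruption_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        if corruption_dir.name == "clean":
            continue  # clean/ holds class folders directly, with no severity level
        severities = {}
        for severity_dir in sorted(p for p in corruption_dir.iterdir() if p.is_dir()):
            class_dirs = [p for p in severity_dir.iterdir() if p.is_dir()]
            severities[severity_dir.name] = len(class_dirs)
        summary[corruption_dir.name] = severities
    return summary


def check_layout(root):
    """Return a list of problems; an empty list means the layout looks consistent."""
    root = Path(root)
    problems = []
    clean_dir = root / "clean"
    if clean_dir.is_dir():
        expected = len([p for p in clean_dir.iterdir() if p.is_dir()])
    else:
        problems.append("missing clean/ directory")
        expected = None
    for corruption, severities in summarize_imagenet_c(root).items():
        if not severities:
            problems.append(f"{corruption}/ has no severity subdirectories")
        for severity, n_classes in severities.items():
            if n_classes == 0:
                problems.append(f"{corruption}/{severity}/ has no class folders")
            elif expected is not None and n_classes != expected:
                problems.append(
                    f"{corruption}/{severity}/ has {n_classes} class folders, "
                    f"but clean/ has {expected}"
                )
    return problems
```

Calling `check_layout("/PATH/TO/IMAGENET-C")` before evaluation and printing the returned list should surface incomplete downloads early. The check only counts class folders; it does not verify image contents.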

We also use other robustness benchmarks for evaluation, including ImageNet-A and ImageNet-P.

Results and Pre-trained Models of RSPC-RVT

| Model | IN-1K $\uparrow$ | IN-C $\downarrow$ | IN-A $\uparrow$ | IN-P $\downarrow$ | #Params | Download |
| --- | --- | --- | --- | --- | --- | --- |
| RSPC-RVT-Ti | 79.5 | 55.7 | 16.5 | 38.0 | 10.9M | model |
| RSPC-RVT-S | 82.2 | 48.4 | 27.9 | 34.3 | 23.3M | model |
| RSPC-RVT-B | 82.8 | 45.7 | 32.1 | 31.0 | 91.8M | model |

Evaluation

Evaluate RSPC-RVT-Ti on ImageNet (and optionally on ImageNet-C):

CUDA_VISIBLE_DEVICES=0 python main.py --eval --model rvt_tiny_plus \
    --data-set IMNET --data-path /PATH/TO/IMAGENET --output_dir ../experiments/test_rspc_rvt_tiny_imagenet \
    --pretrain_path ../pretrained/rspc_rvt_tiny.pth.tar --inc_path /PATH/TO/IMAGENET-C

See test_pretrained.sh for scripts that evaluate the other models.

Training

Train RSPC-RVT-Ti on ImageNet (using 4 nodes with 4 GPUs each):

python -m torch.distributed.launch --nproc_per_node=4 --nnodes=4 --node_rank=$NODE_RANK \
    --master_addr=$MASTER_ADDR --master_port=$MASTER_PORT main.py --model rvt_tiny_plus_afat \
    --data-path /PATH/TO/IMAGENET --output_dir ../experiments/exp_rspc_rvt_tiny_imagenet \
    --batch-size 128 --dist-eval --use_patch_aug
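
When adapting the command above to a different number of nodes or GPUs, it helps to know the resulting world size and effective batch size. The sketch below is plain arithmetic, not repository code. It assumes that `--batch-size` is applied per process (one process per GPU), which is not confirmed here and should be checked against main.py. The helper names are hypothetical.

```python
def world_size(nnodes, nproc_per_node):
    """Total number of training processes across all nodes."""
    return nnodes * nproc_per_node


def global_batch_size(nnodes, nproc_per_node, per_process_batch_size):
    """Effective batch size per step, ASSUMING --batch-size is per process."""
    return world_size(nnodes, nproc_per_node) * per_process_batch_size


def global_ranks(node_rank, nproc_per_node):
    """Global ranks held by one node: node_rank * nproc_per_node + local_rank."""
    start = node_rank * nproc_per_node
    return list(range(start, start + nproc_per_node))


# Settings from the command above: 4 nodes, 4 processes per node, --batch-size 128.
print(world_size(4, 4))               # 16
print(global_batch_size(4, 4, 128))   # 2048 under the per-process assumption
print(global_ranks(node_rank=2, nproc_per_node=4))  # [8, 9, 10, 11]
```

If the number of GPUs changes, the effective batch size changes with it under this assumption, so the learning rate may need to be adjusted accordingly.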

See train_script.sh for scripts that train the other models.

Citation

If you find this repository helpful, please consider citing:

@inproceedings{guo2023improving,
  title={Improving robustness of vision transformers by reducing sensitivity to patch corruptions},
  author={Guo, Yong and Stutz, David and Schiele, Bernt},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4108--4118},
  year={2023}
}