
using ViT backbone with PAWS #26

Open
islam-nassar opened this issue Oct 20, 2021 · 1 comment

@islam-nassar

Hi Mido,

Thanks for the excellent work, and thanks for sharing. I was curious whether you have tried PAWS with a ViT (transformer) backbone. I ask because your concurrent work (DINO) and others use ViT, so I was hoping you had already done that. If not, do you reckon it would be straightforward to do by adjusting the model in your code, or do you foresee bigger complications?

Cheers

@MidoAssran
Contributor

@islam-nassar

Yes, we tried it with a ViT backbone! In short, it worked out of the box with the following setup (similar to DINO):

  • model: ViT-S/16
  • batch-size: 1024
  • support-set: 6720 (960 classes, 7 imgs/class)
  • temperature: 0.1
  • sharpening: 0.25
  • me-max regularization: true
  • starting LR: 2.0e-4
  • LR: 1.0e-3
  • final LR: 1.0e-6
  • start WD: 0.04
  • final WD*: 0.4
  • projection head (same as RN50, but with GELU activations and the following dimensions): [256, 256, 256]
  • prediction head (same as RN50, but with GELU activations and the following dimensions): [256, 256] (a sketch of both heads follows this list)
  • optimizer: AdamW
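
For reference, here is a rough sketch of what those two heads could look like. This is a minimal sketch rather than the repo's exact code: it assumes the ViT-S/16 embedding dimension of 384 and the same Linear → BatchNorm → activation stacking as the RN50 heads, with GELU in place of ReLU.

```python
import torch.nn as nn

embed_dim = 384  # ViT-S/16 output dimension (assumed)

# Projection head, output widths [256, 256, 256]
projection_head = nn.Sequential(
    nn.Linear(embed_dim, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256),
)

# Prediction head, output widths [256, 256]
prediction_head = nn.Sequential(
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256),
)
```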

Evaluation: soft nearest-neighbour (NN) classification with 10% of the labels, no fine-tuning (sketched after the results):

  • 100 epochs of pre-training: 70.9% top-1 on IN1k
  • 300 epochs of pre-training: 72.3% top-1 on IN1k
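
In case it's useful, the soft nearest-neighbour evaluation is essentially a softmax-weighted vote over a labelled support set. A minimal sketch (my own variable names, and it assumes the same 0.1 temperature as during training; it's not the exact evaluation script):

```python
import torch
import torch.nn.functional as F

def soft_nn_predict(z, z_support, y_support, tau=0.1):
    """Soft nearest-neighbour classification (sketch).

    z:         (N, D) embeddings to classify
    z_support: (S, D) embeddings of the labelled support set
    y_support: (S, C) one-hot labels of the support set
    """
    z = F.normalize(z, dim=1)
    z_support = F.normalize(z_support, dim=1)
    weights = F.softmax(z @ z_support.T / tau, dim=1)  # (N, S) similarity weights
    probs = weights @ y_support                        # (N, C) soft class scores
    return probs.argmax(dim=1)
```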

*You can probably just use a constant WD value; I'm not sure the increasing schedule was that important in this experiment.
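
If you do want the increasing schedule, one simple way to implement it is a cosine ramp of the weight decay from the start value to the final value, applied to the AdamW parameter groups at every step (again a sketch, not the exact code from the repo):

```python
import math

def cosine_wd(step, total_steps, wd_start=0.04, wd_final=0.4):
    # Cosine ramp: returns wd_start at step 0 and wd_final at the last step.
    progress = step / max(1, total_steps)
    return wd_final + 0.5 * (wd_start - wd_final) * (1.0 + math.cos(math.pi * progress))

# At each training step:
# for group in optimizer.param_groups:
#     group['weight_decay'] = cosine_wd(step, total_steps)
```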

Let me know if there's some other information about the setup you need that I forgot to mention!
