
using ViT backbone with PAWS #26

Open
islam-nassar opened this issue Oct 20, 2021 · 1 comment

@islam-nassar

Hi Mido,

Thanks for the excellent work, and thanks for sharing. I was curious whether you have tried PAWS with a ViT (transformer) backbone. I ask because your concurrent work (DINO) and others use ViT, so I was hoping you had already done that. If not, do you reckon it would be straightforward to do by adjusting the model in your code, or do you foresee bigger complications?

Cheers

@MidoAssran
Contributor

@islam-nassar

Yes, we tried it with a ViT backbone! In short, it worked out of the box with the following setup (similar to DINO):

  • model: ViT-S/16
  • batch-size: 1024
  • support-set: 6720 (960 classes, 7 imgs/class)
  • temperature: 0.1
  • sharpening: 0.25
  • me-max regularization: true
  • starting LR: 2.0e-4
  • LR: 1.0e-3
  • final LR: 1.0e-6
  • start WD: 0.04
  • final WD*: 0.4
  • projection head (same as RN50, but with GELU activations and the following dimensions): [256, 256, 256]
  • prediction head (same as RN50, but with GELU activations and the following dimensions): [256, 256] (a sketch of both heads follows this list)
  • optimizer: AdamW
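
For reference, here is a rough sketch of what those two heads could look like. This is a minimal sketch rather than the repo's exact code: it assumes the ViT-S/16 embedding dimension of 384 and the same Linear → BatchNorm → activation stacking as the RN50 heads, with GELU in place of ReLU.

```python
import torch.nn as nn

embed_dim = 384  # ViT-S/16 output dimension (assumed)

# Projection head, output widths [256, 256, 256]
projection_head = nn.Sequential(
    nn.Linear(embed_dim, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256),
)

# Prediction head, output widths [256, 256]
prediction_head = nn.Sequential(
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.GELU(),
    nn.Linear(256, 256),
)
```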

Evaluation: soft nearest-neighbour (NN) classification with 10% of the labels, no fine-tuning (sketched after the results):

  • 100 epochs of pre-training: 70.9% top-1 on IN1k
  • 300 epochs of pre-training: 72.3% top-1 on IN1k
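
In case it's useful, the soft nearest-neighbour evaluation is essentially a softmax-weighted vote over a labelled support set. A minimal sketch (my own variable names, and it assumes the same 0.1 temperature as during training; it's not the exact evaluation script):

```python
import torch
import torch.nn.functional as F

def soft_nn_predict(z, z_support, y_support, tau=0.1):
    """Soft nearest-neighbour classification (sketch).

    z:         (N, D) embeddings to classify
    z_support: (S, D) embeddings of the labelled support set
    y_support: (S, C) one-hot labels of the support set
    """
    z = F.normalize(z, dim=1)
    z_support = F.normalize(z_support, dim=1)
    weights = F.softmax(z @ z_support.T / tau, dim=1)  # (N, S) similarity weights
    probs = weights @ y_support                        # (N, C) soft class scores
    return probs.argmax(dim=1)
```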

*You can probably just use a constant WD value; I'm not sure the increasing schedule was that important in this experiment.
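
If you do want the increasing schedule, one simple way to implement it is a cosine ramp of the weight decay from the start value to the final value, applied to the AdamW parameter groups at every step (again a sketch, not the exact code from the repo):

```python
import math

def cosine_wd(step, total_steps, wd_start=0.04, wd_final=0.4):
    # Cosine ramp: returns wd_start at step 0 and wd_final at the last step.
    progress = step / max(1, total_steps)
    return wd_final + 0.5 * (wd_start - wd_final) * (1.0 + math.cos(math.pi * progress))

# At each training step:
# for group in optimizer.param_groups:
#     group['weight_decay'] = cosine_wd(step, total_steps)
```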

Let me know if there's some other information about the setup you need that I forgot to mention!
