VISSL provides reference implementations of a large number of self-supervised learning approaches, along with a suite of benchmark tasks for quickly evaluating the representation quality of models trained with these approaches under a standard evaluation setup. This document lists the available self-supervised models and benchmarks them on a standard task: evaluating a linear classifier on ImageNet-1K. All models can be downloaded from the provided links.
VISSL is 100% compatible with TorchVision ResNet models. It's easy to use torchvision models in VISSL and to use VISSL models in torchvision.
All the ResNe(X)t models in VISSL can be converted to Torchvision weights. This simply involves removing the `_feature_blocks.` prefix from all the weight names. VISSL provides a convenience script for this:
```bash
python extra_scripts/convert_vissl_to_torchvision.py \
  --model_url_or_file <input_model>.pth \
  --output_dir /path/to/output/dir/ \
  --output_name <my_converted_model>.torch
```
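The renaming the script performs is essentially mechanical. As an illustration, here is a minimal sketch in plain Python (assuming the checkpoint's trunk weights use the `_feature_blocks.` prefix; the bundled script additionally handles checkpoint loading and saving):

```python
def vissl_to_torchvision_keys(state_dict):
    """Strip VISSL's trunk prefix so parameter names match torchvision ResNets."""
    prefix = "_feature_blocks."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Toy example with placeholder values standing in for weight tensors:
vissl_keys = {
    "_feature_blocks.conv1.weight": "w0",
    "_feature_blocks.layer1.0.conv1.weight": "w1",
}
print(sorted(vissl_to_torchvision_keys(vissl_keys)))
# ['conv1.weight', 'layer1.0.conv1.weight']
```

Keys that already lack the prefix (e.g. a linear head) pass through unchanged.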
All the ResNe(X)t models in Torchvision can be directly loaded in VISSL. This simply involves setting the `REMOVE_PREFIX` and `APPEND_PREFIX` options in the config file, following the instructions here. Also, see the example below for how torchvision models are loaded.
You can benchmark these Torchvision models using VISSL's benchmark suite; see the docs for how to run various benchmarks.
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Supervised | RN50 - Torchvision | ImageNet | 76.1 | model |
Supervised | RN101 - Torchvision | ImageNet | 77.21 | model |
Supervised | RN50 - Caffe2 | ImageNet | 75.88 | model |
Supervised | RN50 - Caffe2 | Places205 | 58.49 | model |
Supervised | Alexnet BVLC - Caffe2 | ImageNet | 49.54 | model |
Supervised | RN50 - VISSL - 105 epochs | ImageNet | 75.45 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Semi-supervised | RN50 | YFCC100M - ImageNet | 79.2 | model |
Semi-weakly supervised | RN50 | Public Instagram Images - ImageNet | 81.06 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Jigsaw | RN50 - 100 permutations | ImageNet-1K | 48.57 | model |
Jigsaw | RN50 - 2K permutations | ImageNet-1K | 46.73 | model |
Jigsaw | RN50 - 10K permutations | ImageNet-1K | 48.11 | model |
Jigsaw | RN50 - 2K permutations | ImageNet-22K | 44.84 | model |
Jigsaw | RN50 - Goyal'19 | ImageNet-1K | 46.58 | model |
Jigsaw | RN50 - Goyal'19 | ImageNet-22K | 53.09 | model |
Jigsaw | RN50 - Goyal'19 | YFCC100M | 51.37 | model |
Jigsaw | AlexNet - Goyal'19 | ImageNet-1K | 34.82 | model |
Jigsaw | AlexNet - Goyal'19 | ImageNet-22K | 37.5 | model |
Jigsaw | AlexNet - Goyal'19 | YFCC100M | 37.01 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Colorization | RN50 - Goyal'19 | ImageNet-1K | 40.11 | model |
Colorization | RN50 - Goyal'19 | ImageNet-22K | 49.24 | model |
Colorization | RN50 - Goyal'19 | YFCC100M | 47.46 | model |
Colorization | AlexNet - Goyal'19 | ImageNet-1K | 30.39 | model |
Colorization | AlexNet - Goyal'19 | ImageNet-22K | 36.83 | model |
Colorization | AlexNet - Goyal'19 | YFCC100M | 34.19 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
RotNet | AlexNet official | ImageNet-1K | 39.51 | model |
RotNet | RN50 - 105 epochs | ImageNet-1K | 48.2 | model |
RotNet | RN50 - 105 epochs | ImageNet-22K | 54.89 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
DeepCluster | AlexNet official | ImageNet-1K | 37.88 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
ClusterFit | RN50 - 105 epochs - 16K clusters from RotNet | ImageNet-1K | 53.63 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
NPID | RN50 - official weights | ImageNet-1K | 54.99 | model |
NPID | RN50 - 4k negatives - 200 epochs - VISSL | ImageNet-1K | 52.73 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
NPID++ | RN50 - 32k negatives - 800 epochs - cosine LR | ImageNet-1K | 56.68 | model |
NPID++ | RN50-w2 - 32k negatives - 800 epochs - cosine LR | ImageNet-1K | 62.73 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
PIRL | RN50 - 200 epochs | ImageNet-1K | 62.55 | model |
PIRL | RN50 - 800 epochs | ImageNet-1K | 64.29 | model |
NOTE: Please see projects/PIRL/README.md for more PIRL models provided by authors.
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
SimCLR | RN50 - 100 epochs | ImageNet-1K | 64.4 | model |
SimCLR | RN50 - 200 epochs | ImageNet-1K | 66.61 | model |
SimCLR | RN50 - 400 epochs | ImageNet-1K | 67.71 | model |
SimCLR | RN50 - 800 epochs | ImageNet-1K | 69.68 | model |
SimCLR | RN50 - 1000 epochs | ImageNet-1K | 68.8 | model |
SimCLR | RN50-w2 - 100 epochs | ImageNet-1K | 69.82 | model |
SimCLR | RN50-w2 - 1000 epochs | ImageNet-1K | 73.84 | model |
SimCLR | RN50-w4 - 1000 epochs | ImageNet-1K | 71.61 | model |
SimCLR | RN101 - 100 epochs | ImageNet-1K | 62.76 | model |
SimCLR | RN101 - 1000 epochs | ImageNet-1K | 71.56 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
DeepClusterV2 | RN50 - 400 epochs - 2x224 | ImageNet-1K | 70.01 | model |
DeepClusterV2 | RN50 - 400 epochs - 2x160+4x96 | ImageNet-1K | 74.32 | model |
DeepClusterV2 | RN50 - 800 epochs - 2x224+6x96 | ImageNet-1K | 75.18 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Linear evaluation results show some run-to-run variance, both when repeating the same eval and when re-running SwAV pre-training. The numbers reported below are from a single run.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
SwAV | RN50 - 100 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 71.99 | model |
SwAV | RN50 - 200 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 73.85 | model |
SwAV | RN50 - 400 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 74.81 | model |
SwAV | RN50 - 800 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 74.92 | model |
SwAV | RN50 - 200 epochs - 2x224+6x96 - 256 batch-size | ImageNet-1K | 73.07 | model |
SwAV | RN50 - 400 epochs - 2x224+6x96 - 256 batch-size | ImageNet-1K | 74.3 | model |
SwAV | RN50 - 400 epochs - 2x224 - 4096 batch-size | ImageNet-1K | 69.53 | model |
SwAV | RN50-w2 - 400 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 77.01 | model |
SwAV | RN50-w4 - 400 epochs - 2x224+6x96 - 2560 batch-size | ImageNet-1K | 77.03 | model |
NOTE: Please see projects/SwAV/README.md for more SwAV models provided by authors.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
MoCo-v2 | RN50 - 200 epochs - 256 batch-size | ImageNet-1K | 66.4 | model |
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Barlow Twins | RN50 - 300 epochs - 2048 batch-size | ImageNet-1K | 70.75 | model |
Barlow Twins | RN50 - 1000 epochs - 2048 batch-size | ImageNet-1K | 71.80 | model |
The model is obtained with this config.
Method | Model | PreTrain dataset | ImageNet k-NN acc. | URL |
---|---|---|---|---|
DINO | DeiT-S/16 - 300 epochs - 1024 batch-size | ImageNet-1K | 73.4 | model |