VISSL provides reference implementations of a large number of self-supervised learning approaches, along with a suite of benchmark tasks for quickly evaluating the representation quality of models trained with these approaches under a standard evaluation setup. This document lists the available self-supervised models and benchmarks them on a standard task: evaluating a linear classifier on ImageNet-1K. All models can be downloaded from the provided links.
VISSL is 100% compatible with TorchVision ResNet models. It's easy to use torchvision models in VISSL and to use VISSL models in torchvision.
All the ResNe(X)t models in VISSL can be converted to Torchvision weights. This simply involves removing the `_feature_blocks.` prefix from all the weight names. VISSL provides a convenience script for this:
```bash
python extra_scripts/convert_vissl_to_torchvision.py \
  --model_url_or_file <input_model>.pth \
  --output_dir /path/to/output/dir/ \
  --output_name <my_converted_model>.torch
```
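The renaming the script performs is essentially mechanical. As an illustration, here is a minimal sketch in plain Python (assuming the checkpoint's trunk weights use the `_feature_blocks.` prefix; the bundled script additionally handles checkpoint loading and saving):

```python
def vissl_to_torchvision_keys(state_dict):
    """Strip VISSL's trunk prefix so parameter names match torchvision ResNets."""
    prefix = "_feature_blocks."
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# Toy example with placeholder values standing in for weight tensors:
vissl_keys = {
    "_feature_blocks.conv1.weight": "w0",
    "_feature_blocks.layer1.0.conv1.weight": "w1",
}
print(sorted(vissl_to_torchvision_keys(vissl_keys)))
# ['conv1.weight', 'layer1.0.conv1.weight']
```

Keys that already lack the prefix (e.g. a linear head) pass through unchanged.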
All the ResNe(X)t models in Torchvision can be directly loaded in VISSL. This simply involves setting the `REMOVE_PREFIX` and `APPEND_PREFIX` options in the config file, following the instructions here. Also, see the example below for how torchvision models are loaded.
You can benchmark these Torchvision models using VISSL's benchmark suite; see the docs for how to run various benchmarks.
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Supervised | RN50 - Torchvision | ImageNet | 76.1 | model |
Supervised | RN101 - Torchvision | ImageNet | 77.21 | model |
Supervised | RN50 - Caffe2 | ImageNet | 75.88 | model |
Supervised | RN50 - Caffe2 | Places205 | 58.49 | model |
Supervised | Alexnet BVLC - Caffe2 | ImageNet | 49.54 | model |
Supervised | RN50 - VISSL - 105 epochs | ImageNet | 75.45 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Semi-supervised | RN50 | YFCC100M - ImageNet | 79.2 | model |
Semi-weakly supervised | RN50 | Public Instagram Images - ImageNet | 81.06 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Jigsaw | RN50 - 100 permutations | ImageNet-1K | 48.57 | model |
Jigsaw | RN50 - 2K permutations | ImageNet-1K | 46.73 | model |
Jigsaw | RN50 - 10K permutations | ImageNet-1K | 48.11 | model |
Jigsaw | RN50 - 2K permutations | ImageNet-22K | 44.84 | model |
Jigsaw | RN50 - Goyal'19 | ImageNet-1K | 46.58 | model |
Jigsaw | RN50 - Goyal'19 | ImageNet-22K | 53.09 | model |
Jigsaw | RN50 - Goyal'19 | YFCC100M | 51.37 | model |
Jigsaw | AlexNet - Goyal'19 | ImageNet-1K | 34.82 | model |
Jigsaw | AlexNet - Goyal'19 | ImageNet-22K | 37.5 | model |
Jigsaw | AlexNet - Goyal'19 | YFCC100M | 37.01 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Colorization | RN50 - Goyal'19 | ImageNet-1K | 40.11 | model |
Colorization | RN50 - Goyal'19 | ImageNet-22K | 49.24 | model |
Colorization | RN50 - Goyal'19 | YFCC100M | 47.46 | model |
Colorization | AlexNet - Goyal'19 | ImageNet-1K | 30.39 | model |
Colorization | AlexNet - Goyal'19 | ImageNet-22K | 36.83 | model |
Colorization | AlexNet - Goyal'19 | YFCC100M | 34.19 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
RotNet | AlexNet official | ImageNet-1K | 39.51 | model |
RotNet | RN50 - 105 epochs | ImageNet-1K | 48.2 | model |
RotNet | RN50 - 105 epochs | ImageNet-22K | 54.89 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
DeepCluster | AlexNet official | ImageNet-1K | 37.88 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
ClusterFit | RN50 - 105 epochs - 16K clusters from RotNet | ImageNet-1K | 53.63 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
NPID | RN50 - official weights | ImageNet-1K | 54.99 | model |
NPID | RN50 - 4k negatives - 200 epochs - VISSL | ImageNet-1K | 52.73 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
NPID++ | RN50 - 32k negatives - 800 epochs - cosine LR | ImageNet-1K | 56.68 | model |
NPID++ | RN50-w2 - 32k negatives - 800 epochs - cosine LR | ImageNet-1K | 62.73 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
PIRL | RN50 - 200 epochs | ImageNet-1K | 62.55 | model |
PIRL | RN50 - 800 epochs | ImageNet-1K | 64.29 | model |
NOTE: Please see projects/PIRL/README.md for more PIRL models provided by authors.
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
SimCLR | RN50 - 100 epochs | ImageNet-1K | 64.4 | model |
SimCLR | RN50 - 200 epochs | ImageNet-1K | 66.61 | model |
SimCLR | RN50 - 400 epochs | ImageNet-1K | 67.71 | model |
SimCLR | RN50 - 800 epochs | ImageNet-1K | 69.68 | model |
SimCLR | RN50 - 1000 epochs | ImageNet-1K | 68.8 | model |
SimCLR | RN50-w2 - 100 epochs | ImageNet-1K | 69.82 | model |
SimCLR | RN50-w2 - 1000 epochs | ImageNet-1K | 73.84 | model |
SimCLR | RN50-w4 - 1000 epochs | ImageNet-1K | 71.61 | model |
SimCLR | RN101 - 100 epochs | ImageNet-1K | 62.76 | model |
SimCLR | RN101 - 1000 epochs | ImageNet-1K | 71.56 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
DeepClusterV2 | RN50 - 400 epochs - 2x224 | ImageNet-1K | 70.01 | model |
DeepClusterV2 | RN50 - 400 epochs - 2x160+4x96 | ImageNet-1K | 74.32 | model |
DeepClusterV2 | RN50 - 800 epochs - 2x224+6x96 | ImageNet-1K | 75.18 | model |
To reproduce the numbers below, the experiment configuration for each model is provided in JSON format here.
Linear evaluation results show some run-to-run variance, both when repeating the same eval and when re-running SwAV pre-training. The numbers reported below are from a single run.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
SwAV | RN50 - 100 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 71.99 | model |
SwAV | RN50 - 200 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 73.85 | model |
SwAV | RN50 - 400 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 74.81 | model |
SwAV | RN50 - 800 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 74.92 | model |
SwAV | RN50 - 200 epochs - 2x224+6x96 - 256 batch-size | ImageNet-1K | 73.07 | model |
SwAV | RN50 - 400 epochs - 2x224+6x96 - 256 batch-size | ImageNet-1K | 74.3 | model |
SwAV | RN50 - 400 epochs - 2x224 - 4096 batch-size | ImageNet-1K | 69.53 | model |
SwAV | RN50-w2 - 400 epochs - 2x224+6x96 - 4096 batch-size | ImageNet-1K | 77.01 | model |
SwAV | RN50-w4 - 400 epochs - 2x224+6x96 - 2560 batch-size | ImageNet-1K | 77.03 | model |
NOTE: Please see projects/SwAV/README.md for more SwAV models provided by authors.
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
MoCo-v2 | RN50 - 200 epochs - 256 batch-size | ImageNet-1K | 66.4 | model |
Method | Model | PreTrain dataset | ImageNet top-1 acc. | URL |
---|---|---|---|---|
Barlow Twins | RN50 - 300 epochs - 2048 batch-size | ImageNet-1K | 70.75 | model |
Barlow Twins | RN50 - 1000 epochs - 2048 batch-size | ImageNet-1K | 71.80 | model |
The model is obtained with this config.
Method | Model | PreTrain dataset | ImageNet k-NN acc. | URL |
---|---|---|---|---|
DINO | DeiT-S/16 - 300 epochs - 1024 batch-size | ImageNet-1K | 73.4 | model |