[Feature] Support ViT backbone and release DINO new baselines (#138)
* add 5scale r50 dino

* refine model zoo

* add swin-large 36ep dino config

* add dino_swin_large_5scale_36ep results

* add dino_vitdet_large config

* refine vitdet config

* add dino-swin-5scale-12ep results

* fix demo bug

* refine h-detr config

* refine ViTDet DINO config

* refine vitdet dino config

* add 100ep vitdet-dino and refine readme

* add vitdet-dino large 12ep

* refine weight links

* add vitdet-large-dino results

* delete useless config

* add vitdet large dino links

* add warmup scheduler for vitdet-large

* refine configs

* update README

* refine docs

Co-authored-by: ntianhe ren <[email protected]>
rentainhe and ntianhe ren authored Dec 3, 2022
1 parent 43dbe84 commit ef300d9
Showing 15 changed files with 515 additions and 19 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -41,7 +41,7 @@ detrex is an open-source toolbox that provides state-of-the-art Transformer-base

- **Modular Design.** detrex decomposes the Transformer-based detection framework into various components which help users easily build their own customized models.

- **State-of-the-art Methods.** detrex provides a series of Transformer-based detection algorithms, including [DINO](https://arxiv.org/abs/2203.03605) which reached the SOTA of DETR-like models with **63.3mAP**!
- **State-of-the-art Methods.** detrex provides a series of Transformer-based detection algorithms, including [DINO](https://arxiv.org/abs/2203.03605) which reached the SOTA of DETR-like models with **63.3AP**!

- **Easy to Use.** detrex is designed to be **light-weight** and easy for users to use:
- [LazyConfig System](https://detectron2.readthedocs.io/en/latest/tutorials/lazyconfigs.html) for more flexible syntax and cleaner config files.
9 changes: 9 additions & 0 deletions configs/common/data/constants.py
@@ -0,0 +1,9 @@
constants = dict(
    imagenet_rgb256_mean=[123.675, 116.28, 103.53],
    imagenet_rgb256_std=[58.395, 57.12, 57.375],
    imagenet_bgr256_mean=[103.530, 116.280, 123.675],
    # When using pre-trained models in Detectron1 or any MSRA models,
    # std has been absorbed into its conv1 weights, so the std needs to be set 1.
    # Otherwise, you can use [57.375, 57.120, 58.395] (ImageNet std)
    imagenet_bgr256_std=[1.0, 1.0, 1.0],
)
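
As a rough illustration of how these constants are typically consumed, the sketch below normalizes a single pixel per channel as `(value - mean) / std`. The `normalize_pixel` helper is hypothetical and not part of detrex. It only shows that the BGR mean is the RGB mean in reversed channel order, and that a std of 1.0 leaves the mean-subtracted value unscaled.

```python
# Values copied from configs/common/data/constants.py above.
constants = dict(
    imagenet_rgb256_mean=[123.675, 116.28, 103.53],
    imagenet_rgb256_std=[58.395, 57.12, 57.375],
    imagenet_bgr256_mean=[103.530, 116.280, 123.675],
    imagenet_bgr256_std=[1.0, 1.0, 1.0],
)


def normalize_pixel(pixel, mean, std):
    """Hypothetical helper: per-channel (value - mean) / std for one pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# The BGR mean is the RGB mean with the channel order reversed.
bgr_is_reversed_rgb = (
    constants["imagenet_bgr256_mean"] == constants["imagenet_rgb256_mean"][::-1]
)

# A pixel equal to the RGB mean normalizes to zero in every channel.
rgb_pixel = [123.675, 116.28, 103.53]
rgb_out = normalize_pixel(
    rgb_pixel, constants["imagenet_rgb256_mean"], constants["imagenet_rgb256_std"]
)

# The same pixel in BGR order, normalized with the BGR constants (std = 1.0).
bgr_pixel = rgb_pixel[::-1]
bgr_out = normalize_pixel(
    bgr_pixel, constants["imagenet_bgr256_mean"], constants["imagenet_bgr256_std"]
)
```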
1 change: 0 additions & 1 deletion demo/demo.py
@@ -19,7 +19,6 @@
from detectron2.utils.logger import setup_logger



# constants
WINDOW_NAME = "COCO detections"

77 changes: 77 additions & 0 deletions docs/source/tutorials/Download_Pretrained_Weights.md
@@ -284,3 +284,80 @@ model.backbone = L(SwinTransformer)(
train.init_checkpoint = "/path/to/swin_large_patch4_window12_384_22kto1k.pth"
```
</details>


## ViTDet
The download links below are taken from the [official implementation](https://github.com/facebookresearch/mae#fine-tuning-with-pre-trained-checkpoints) of MAE.

<table class="docutils"><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom"></th>
<th valign="bottom">ViT-Base</th>
<th valign="bottom">ViT-Large</th>
<th valign="bottom">ViT-Huge</th>
<tr><td align="left"> Pretrained Checkpoint </td>
<td align="center"> <a href="https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth">download</a> </td>
<td align="center"> <a href="https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth">download</a> </td>
<td align="center"> <a href="https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_huge.pth">download</a> </td>
</tr>
</tbody></table>
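
The three links in the table share one naming pattern, so a small helper can build them from the model size. This is only a sketch: `mae_checkpoint_url` and `download_mae_checkpoint` are hypothetical names, not detrex or MAE APIs, and the download function performs a real network request when called.

```python
import urllib.request

# Base URL taken from the download links in the table above.
MAE_BASE_URL = "https://dl.fbaipublicfiles.com/mae/pretrain"


def mae_checkpoint_url(size):
    """Hypothetical helper: build the MAE checkpoint URL for a ViT size."""
    if size not in ("base", "large", "huge"):
        raise ValueError(f"unknown ViT size: {size!r}")
    return f"{MAE_BASE_URL}/mae_pretrain_vit_{size}.pth"


def download_mae_checkpoint(size, dest):
    """Download the checkpoint to `dest` (requires network access)."""
    urllib.request.urlretrieve(mae_checkpoint_url(size), dest)
    return dest
```

For example, `download_mae_checkpoint("base", "mae_pretrain_vit_base.pth")` would save the ViT-Base weights to the current directory, and that path can then be used as `train.init_checkpoint`.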

<details open>
<summary> <b> Using ViTDet Backbone in Config </b> </summary>

```python
from functools import partial

import torch.nn as nn
from detectron2.config import LazyCall as L
from detectron2.layers import ShapeSpec
from detectron2.modeling import ViT, SimpleFeaturePyramid
from detectron2.modeling.backbone.fpn import LastLevelMaxPool

from .dino_r50 import model


# ViT Base Hyper-params
embed_dim, depth, num_heads, dp = 768, 12, 12, 0.1

# Creates Simple Feature Pyramid from ViT backbone
model.backbone = L(SimpleFeaturePyramid)(
    net=L(ViT)(  # Single-scale ViT backbone
        img_size=1024,
        patch_size=16,
        embed_dim=embed_dim,
        depth=depth,
        num_heads=num_heads,
        drop_path_rate=dp,
        window_size=14,
        mlp_ratio=4,
        qkv_bias=True,
        norm_layer=partial(nn.LayerNorm, eps=1e-6),
        window_block_indexes=[
            # 2, 5, 8, 11 for global attention
            0,
            1,
            3,
            4,
            6,
            7,
            9,
            10,
        ],
        residual_block_indexes=[],
        use_rel_pos=True,
        out_feature="last_feat",
    ),
    in_feature="${.net.out_feature}",
    out_channels=256,
    scale_factors=(2.0, 1.0, 0.5),  # (4.0, 2.0, 1.0, 0.5) in ViTDet
    top_block=L(LastLevelMaxPool)(),
    norm="LN",
    square_pad=1024,
)

# setup init checkpoint path
# (assumes `train` is already defined in this config, as in the Swin example above)
train.init_checkpoint = "/path/to/mae_pretrain_vit_base.pth"
```
</details>
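
The `window_block_indexes` list in the config above is every block except 2, 5, 8 and 11, which use global attention. A minimal sketch of how such a list could be generated is shown here. `window_block_indexes` as a function is a hypothetical helper, and it assumes the global-attention blocks are evenly spaced with the last block of each group being the global one; check the actual config for any other depth before relying on it.

```python
def window_block_indexes(depth, num_global=4):
    """Hypothetical helper: indexes of blocks that use windowed attention.

    Assumes `num_global` global-attention blocks, evenly spaced, each one
    placed as the last block of its group.
    """
    if depth % num_global != 0:
        raise ValueError("depth must be divisible by num_global")
    group = depth // num_global
    return [i for i in range(depth) if (i + 1) % group != 0]


# ViT-Base from the config above: depth 12, global attention at 2, 5, 8, 11.
vit_base_windows = window_block_indexes(12)
```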

Please refer to the [DINO](https://github.com/IDEA-Research/detrex/tree/main/projects/dino) project for more details on using the ViT backbone.
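
The "4scale" in the ViTDet config names can be related to `scale_factors` and `top_block` in the config above with a little arithmetic. The sketch below is an assumption-laden illustration, not detectron2 code: `pyramid_strides` is a hypothetical helper that assumes each scale factor rescales the single ViT feature map (stride equal to `patch_size`) and that the top block appends one extra level at twice the coarsest stride.

```python
def pyramid_strides(patch_size, scale_factors, num_top_levels=1):
    """Hypothetical helper: output strides of a simple feature pyramid.

    Assumes the backbone emits one feature map with stride `patch_size`,
    each scale factor s yields a level with stride patch_size / s, and each
    top-block level doubles the coarsest stride.
    """
    strides = [int(patch_size / s) for s in scale_factors]
    for _ in range(num_top_levels):
        strides.append(strides[-1] * 2)
    return strides


# Config above: three scale factors plus one max-pool level gives 4 scales.
dino_strides = pyramid_strides(16, (2.0, 1.0, 0.5))

# Scale factors quoted in the config comment for the original ViTDet setting.
vitdet_strides = pyramid_strides(16, (4.0, 2.0, 1.0, 0.5))
```

Under these assumptions the DINO config yields strides 8, 16, 32 and 64, while the ViTDet scale factors would add a stride-4 level on top of those.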
78 changes: 67 additions & 11 deletions docs/source/tutorials/Model_Zoo.md
@@ -138,93 +138,149 @@ Here we provides our pretrained baselines with **detrex**. And more pretrained w
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">Download</th>
<!-- TABLE BODY -->
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_12ep.py"> DINO-R50-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_12ep.py"> DINO-R50-4scale </a> </td>
<td align="center">R50</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">49.2</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.0/dino_r50_4scale_12ep_49_2AP.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_12ep_300dn.py"> DINO-R50-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_5scale_12ep.py"> DINO-R50-5scale </a> </td>
<td align="center">R50</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">49.6</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_r50_5scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_12ep_300dn.py"> DINO-R50-4scale </a> </td>
<td align="center">R50</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">300</td>
<td align="center">49.5</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.0/dino_r50_4scale_12ep_300dn.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_24ep.py"> DINO-R50-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r50_4scale_24ep.py"> DINO-R50-4scale </a> </td>
<td align="center">R50</td>
<td align="center">IN1k</td>
<td align="center">24</td>
<td align="center">100</td>
<td align="center">50.6</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_r50_4scale_24ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r101_4scale_12ep.py"> DINO-R101-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_r101_4scale_12ep.py"> DINO-R101-4scale </a> </td>
<td align="center">R101</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">50.0</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_r101_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_tiny_224_4scale_12ep.py"> DINO-Swin-T-224-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_tiny_224_4scale_12ep.py"> DINO-Swin-T-224-4scale </a> </td>
<td align="center">Swin-Tiny-224</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">51.3</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_tiny_224_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_tiny_224_4scale_12ep.py"> DINO-Swin-T-224-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_tiny_224_4scale_12ep.py"> DINO-Swin-T-224-4scale </a> </td>
<td align="center">Swin-Tiny-224</td>
<td align="center">IN22k to IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">52.5</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_tiny_224_22kto1k_finetune_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_small_224_4scale_12ep.py"> DINO-Swin-S-224-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_small_224_4scale_12ep.py"> DINO-Swin-S-224-4scale </a> </td>
<td align="center">Swin-Small-224</td>
<td align="center">IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">53.0</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_small_224_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_base_384_4scale_12ep.py"> DINO-Swin-B-384-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_base_384_4scale_12ep.py"> DINO-Swin-B-384-4scale </a> </td>
<td align="center">Swin-Base-384</td>
<td align="center">IN22k to IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">55.8</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_base_384_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_224_4scale_12ep.py"> DINO-Swin-L-224-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_224_4scale_12ep.py"> DINO-Swin-L-224-4scale </a> </td>
<td align="center">Swin-Large-224</td>
<td align="center">IN22k to IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">56.9</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_large_224_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_4scale_12ep.py"> DINO-Swin-L-384-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_4scale_12ep.py"> DINO-Swin-L-384-4scale </a> </td>
<td align="center">Swin-Large-384</td>
<td align="center">IN22k to IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">56.9</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.1/dino_swin_large_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_4scale_36ep.py"> DINO-Swin-L-384-4Scale </a> </td>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_5scale_12ep.py"> DINO-Swin-L-384-5scale </a> </td>
<td align="center">Swin-Large-384</td>
<td align="center">IN22k to IN1k</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">57.5</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_swin_large_384_5scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_4scale_36ep.py"> DINO-Swin-L-384-4scale </a> </td>
<td align="center">Swin-Large-384</td>
<td align="center">IN22k to IN1k</td>
<td align="center">36</td>
<td align="center">100</td>
<td align="center">58.1</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.0/dino_swin_large_384_4scale_36ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_swin_large_384_5scale_36ep.py"> DINO-Swin-L-384-5scale </a> </td>
<td align="center">Swin-Large-384</td>
<td align="center">IN22k to IN1k</td>
<td align="center">36</td>
<td align="center">100</td>
<td align="center">58.5</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_swin_large_384_5scale_36ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_vitdet_base_4scale_12ep.py"> DINO-ViTDet-Base-4scale </a> </td>
<td align="center">ViT</td>
<td align="center">IN1k, MAE</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">50.2</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_vitdet_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_vitdet_base_4scale_50ep.py"> DINO-ViTDet-Base-4scale </a> </td>
<td align="center">ViT</td>
<td align="center">IN1k, MAE</td>
<td align="center">50</td>
<td align="center">100</td>
<td align="center">55.0</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_vitdet_base_4scale_50ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_vitdet_large_4scale_12ep.py"> DINO-ViTDet-Large-4scale </a> </td>
<td align="center">ViT</td>
<td align="center">IN1k, MAE</td>
<td align="center">12</td>
<td align="center">100</td>
<td align="center">52.9</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_vitdet_large_4scale_12ep.pth"> model </a></td>
</tr>
<tr><td align="left"> <a href="https://github.com/IDEA-Research/detrex/blob/main/projects/dino/configs/dino_vitdet_large_4scale_50ep.py"> DINO-ViTDet-Large-4scale </a> </td>
<td align="center">ViT</td>
<td align="center">IN1k, MAE</td>
<td align="center">50</td>
<td align="center">100</td>
<td align="center">57.5</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/dino_vitdet_large_4scale_50ep.pth"> model </a></td>
</tr>
</tbody></table>
