Update EVA-01 & EVA-02 benchmark (#280)
* refine README

* update deformable-detr weights

* update deformable-detr weights

* update readme

* refine README

* refine README

* refine README

* update file name

* update readme

* update readme

* update 1536 vitdet-b eva

* update weight links
rentainhe authored Jul 16, 2023
1 parent 64135ca commit e23ebb1
Showing 5 changed files with 44 additions and 7 deletions.
8 changes: 8 additions & 0 deletions projects/deformable_detr/README.md
@@ -21,6 +21,14 @@ Here we provide the pretrained `Deformable-DETR` weights based on detrex.
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: deformable_detr_r50_50ep -->
<tr><td align="left"><a href="configs/deformable_detr_r50_50ep.py">Deformable-DETR-R50</a></td>
<td align="center">R-50</td>
<td align="center">IN1k</td>
<td align="center">50</td>
<td align="center">44.9</td>
<td align="center"> <a href="https://github.com/IDEA-Research/detrex-storage/releases/download/v0.4.0/deformable_detr_r50_50ep_backbone_1e-5_class_weight_2.0.pth">model</a></td>
</tr>
<!-- ROW: deformable_detr_r50_with_box_refinement_50ep -->
<tr><td align="left"><a href="configs/deformable_detr_r50_with_box_refinement_50ep.py">Deformable-DETR-R50 + Box-Refinement</a></td>
<td align="center">R-50</td>
2 changes: 1 addition & 1 deletion projects/dino/README.md
@@ -468,7 +468,7 @@ python projects/dino/train_net.py --config-file /path/to/config.py --num-gpus 8
<td align="center">12</td>
<td align="center">100</td>
<td align="center">59.1</td>
-<td align="center"> <a href="https://huggingface.co/IDEA-CVR/detrex/resolve/main/dino_eva_01_o365_finetune_detr_like_augmentation_4scale_12ep.pth">huggingface</a></td>
+<td align="center"> <a href="https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_01_o365_finetune_detr_like_augmentation_4scale_12ep.pth">huggingface</a></td>
</tr>
</tbody></table>

39 changes: 34 additions & 5 deletions projects/dino_eva/README.md
@@ -7,32 +7,59 @@ We implement [DINO](https://arxiv.org/abs/2203.03605) with [EVA](https://github.

## Table of Contents
- [Pretrained Models](#pretrained-models)
- [EVA-01](#eva-01)
- [EVA-02](#eva-02)
- [Training](#training)
- [Evaluation](#evaluation)
- [Citation](#citing-dino-and-eva)

## Pretrained Models
Here's the model card for the `dino-eva` models; all the pretrained weights can be downloaded from [Hugging Face](https://huggingface.co/IDEA-CVR/DINO-EVA/tree/main).

### EVA-01

<div align="center">

| Name | init. model weight | LSJ crop size | epoch | AP box | config | download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| `dino-eva-01` | `eva_o365` | `1280x1280` | 12 | 62.1 | [config](./configs/dino-eva-01/dino_eva_01_1280_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_01_o365_finetune_1280_lsj_augmentation_4scale_12ep.pth) |

</div>

- All `dino-eva-01` models were trained with the original [tools/train_net.py](https://github.com/IDEA-Research/detrex/blob/main/tools/train_net.py), which sets a `1e-4` learning rate for the backbone.
- We also release a pretrained `dino-eva-01` model trained with `DETR-like augmentation` in the original [DINO project](https://github.com/IDEA-Research/detrex/tree/main/projects/dino#pretrained-dino-with-eva-backbone).
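The `LSJ crop size` column refers to large-scale jitter augmentation: the image is resized by a random scale factor and then cropped or padded to a fixed square. A minimal sketch of the size arithmetic, assuming the common `0.1`–`2.0` scale range (the exact range used by the detrex configs may differ):

```python
import random

def lsj_size(h, w, crop=1280, scale_range=(0.1, 2.0), rng=None):
    """Large-scale jitter, size arithmetic only: resize by a random
    factor, then crop/pad the result to a fixed crop x crop square."""
    rng = rng or random.Random()
    s = rng.uniform(*scale_range)
    resized = (int(round(h * s)), int(round(w * s)))
    return resized, (crop, crop)

# A typical COCO image, jittered before the 1280x1280 crop.
resized, final = lsj_size(800, 1333, crop=1280, rng=random.Random(0))
```

Whatever the sampled scale, the network always sees a fixed `crop x crop` input, which is what the table's crop-size column records.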

### EVA-02
<div align="center">

| Name | init. model weight | LSJ crop size | epoch | AP box | config | download |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| `dino-eva-02` | `eva02_L_m38m_to_o365` | `1536x1536` | 12 | 61.6 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_l_8attn_1536_lrd0p8_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/detrex/resolve/main/dino_eva_02_o365_backbone_finetune_vitdet_l_8attn_lsj_1536_4scale_12ep.pth) |
-| `dino-eva-02` | `eva02_L_pt_m38m_p14to16` | `1024x1024` | 12 | 58.9 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_b_4attn_1024_lrd0p7_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/detrex/resolve/main/dino_eva_02_m38m_pretrain_vitdet_l_4attn_1024_lrd0p8_4scale_12ep.pth) |
+| `dino-eva-02-B` | `eva02_B_pt_in21k_p14to16` | `1024x1024` | 12 | 55.8 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_b_4attn_1024_lrd0p7_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_02_in21k_pretrain_vitdet_b_4attn_1024_lrd0p7_4scale_12ep.pth) |
+| `dino-eva-02-B` | `eva02_B_pt_in21k_p14to16` | `1536x1536` | 12 | 58.1 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_b_6attn_win32_1536_lrd0p7_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_02_in21k_pretrain_vitdet_b_6attn_win32_1536_lrd0p7_4scale_12ep.pth) |
+| `dino-eva-02-L` | `eva02_L_pt_m38m_p14to16` | `1024x1024` | 12 | 58.9 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_l_4attn_1024_lrd0p8_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_02_m38m_pretrain_vitdet_l_4attn_1024_lrd0p8_4scale_12ep.pth) |
+| `dino-eva-02-L` | `eva02_L_m38m_to_o365` | `1536x1536` | 12 | 61.6 | [config](./configs/dino-eva-02/dino_eva_02_vitdet_l_8attn_1536_lrd0p8_4scale_12ep.py) | [Huggingface](https://huggingface.co/IDEA-CVR/DINO-EVA/resolve/main/dino_eva_02_o365_backbone_finetune_vitdet_l_8attn_lsj_1536_4scale_12ep.pth) |

</div>

- For the `o365`-pretrained EVA models, we only load the backbone weights.
- All the pretrained EVA weights can be downloaded from [here](https://github.com/baaivision/EVA).
-- All `1x settings (12 epochs)` models were trained using the hacked [train_net.py](./train_net.py)
+- `EVA-02-L` models were trained with the hacked [train_net.py](./train_net.py), which uses a `2e-4` learning rate for the backbone.
+- `EVA-02-B` models were trained with the original [tools/train_net.py](https://github.com/IDEA-Research/detrex/blob/main/tools/train_net.py), which uses a `1e-4` learning rate for the backbone; we observed this to be more stable.
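The `lrd0p7`/`lrd0p8` suffixes in the config names denote layer-wise learning-rate decay, where earlier backbone blocks get a geometrically smaller rate than later ones. A minimal sketch of the multiplier schedule (the exact grouping of ViT blocks in the detrex configs may differ):

```python
def layerwise_lrs(base_lr, num_layers, decay):
    """Per-layer learning rates under layer-wise decay:
    layer i (0 = earliest block) gets base_lr * decay**(num_layers - i),
    so only the final layer trains at the full base_lr."""
    return [base_lr * decay ** (num_layers - i) for i in range(num_layers + 1)]

# e.g. a 24-block ViT-L backbone with decay 0.8 and a 2e-4 base rate
lrs = layerwise_lrs(2e-4, 24, 0.8)
```

Lower rates on early blocks keep the pretrained low-level features close to their initialization while the detection-specific layers adapt faster.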

## Training
-For `1x settings (12 epochs)`, we trained them using the hacked [train_net.py](./train_net.py), here's the training scripts:
+For `EVA-02-L` models, we trained them using the hacked [train_net.py](./train_net.py); here is the training script:
```bash
cd detrex
python projects/dino_eva/train_net.py --config-file projects/dino_eva/configs/path/to/config.py --num-gpus 8
```

For `EVA-02-B` models, we trained them using the original [train_net.py](https://github.com/IDEA-Research/detrex/blob/main/tools/train_net.py):

```bash
cd detrex
python tools/train_net.py --config-file projects/dino_eva/configs/path/to/config.py train.init_checkpoint=/path/to/model_checkpoint
```

All configs can be trained with:
```bash
cd detrex
@@ -60,11 +87,13 @@ If you find our work helpful for your research, please consider citing the following:
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

```BibTex
@article{EVA,
title={EVA: Exploring the Limits of Masked Visual Representation Learning at Scale},
author={Fang, Yuxin and Wang, Wen and Xie, Binhui and Sun, Quan and Wu, Ledell and Wang, Xinggang and Huang, Tiejun and Wang, Xinlong and Cao, Yue},
journal={arXiv preprint arXiv:2211.07636},
year={2022}
}
```
@@ -20,7 +20,7 @@
model.backbone.net.depth = 12
model.backbone.net.num_heads = 12
model.backbone.net.mlp_ratio = 4*2/3
-model.backbone.net.use_act_checkpoint = False
+model.backbone.net.use_act_checkpoint = True
model.backbone.net.drop_path_rate = 0.1
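The `use_act_checkpoint` flag enables activation (gradient) checkpointing, which discards intermediate activations in the forward pass and recomputes them during backward to save memory. A minimal sketch with `torch.utils.checkpoint`, independent of the detrex backbone:

```python
import torch
from torch.utils.checkpoint import checkpoint

layer = torch.nn.Linear(8, 8)
x = torch.randn(2, 8, requires_grad=True)

# Plain forward: activations are kept alive for the backward pass.
y_plain = layer(x)

# Checkpointed forward: activations are recomputed during backward,
# trading extra compute for lower peak memory. Outputs are identical.
y_ckpt = checkpoint(layer, x, use_reentrant=False)

same = torch.allclose(y_plain, y_ckpt)
```

At the 1536x1536 LSJ resolution used here, checkpointing the ViT blocks is what makes the larger crop fit in GPU memory.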


