[Feature] Support ConvNeXt #1216

Merged · 36 commits · Mar 4, 2022

Commits
b66ab67
upload original backbone and configs
MengzhangLI Jan 18, 2022
abceb28
ConvNext Refactor
MengzhangLI Jan 19, 2022
6c48587
ConvNext Refactor
MengzhangLI Jan 19, 2022
7260a4f
convnext customization refactor with mmseg style
MengzhangLI Jan 20, 2022
46a529a
convnext customization refactor with mmseg style
MengzhangLI Jan 20, 2022
fd73adc
add ade20k_640x640.py
MengzhangLI Jan 20, 2022
8cb0748
upload files for training
MengzhangLI Jan 21, 2022
e0892c6
delete dist_optimizer_hook and remove layer_decay_optimizer_constructor
MengzhangLI Jan 22, 2022
9289262
check max(out_indices) < num_stages
MengzhangLI Jan 22, 2022
2d998a3
add unittest
MengzhangLI Jan 22, 2022
adcdd2c
fix lint error
MengzhangLI Jan 22, 2022
01c0d93
use MMClassification backbone
MengzhangLI Feb 10, 2022
6a5592a
fix bugs in base_1k
MengzhangLI Feb 13, 2022
bc78936
Merge branch 'master' of https://github.com/open-mmlab/mmsegmentation…
MengzhangLI Feb 14, 2022
54837c2
add mmcls in requirements/mminstall.txt
MengzhangLI Feb 14, 2022
575312b
add mmcls in requirements/mminstall.txt
MengzhangLI Feb 14, 2022
f9d4068
Merge branch 'master' of https://github.com/open-mmlab/mmsegmentation…
MengzhangLI Feb 16, 2022
bb4f35a
fix drop_path_rate and layer_scale_init_value
MengzhangLI Feb 24, 2022
a7efb3b
use logger.info instead of print
MengzhangLI Feb 24, 2022
2358388
add mmcls in runtime.txt
MengzhangLI Feb 24, 2022
a931ac4
fix f string && delete
MengzhangLI Feb 24, 2022
1cdf2a2
add doctring in LearningRateDecayOptimizerConstructor and fix mmcls v…
MengzhangLI Feb 25, 2022
75ea6b6
fix typo in LearningRateDecayOptimizerConstructor
MengzhangLI Feb 25, 2022
482bff2
use ConvNext models in unit test for LearningRateDecayOptimizerConstr…
MengzhangLI Mar 1, 2022
e8db7fe
add unit test
MengzhangLI Mar 2, 2022
b424568
fix typo
MengzhangLI Mar 2, 2022
d2a3ea4
fix typo
MengzhangLI Mar 2, 2022
e671ee3
add layer_wise and fix redundant backbone.downsample_norm in it
MengzhangLI Mar 2, 2022
755dc7e
fix unit test
MengzhangLI Mar 3, 2022
09be63b
give a ground truth lr_scale and weight_decay
MengzhangLI Mar 3, 2022
3c8436a
upload models and readme
MengzhangLI Mar 3, 2022
614c352
delete 'backbone.stem_norm' and 'backbone.downsample_norm' in get_num…
MengzhangLI Mar 4, 2022
c39ff5f
fix unit test and use mmcls url
MengzhangLI Mar 4, 2022
dd6e669
Merge branch 'master' of https://github.com/open-mmlab/mmsegmentation…
MengzhangLI Mar 4, 2022
2845ba7
update md2yml.py and metafile
MengzhangLI Mar 4, 2022
b77b971
fix typo
MengzhangLI Mar 4, 2022
5 changes: 3 additions & 2 deletions .dev/md2yml.py
@@ -151,8 +151,9 @@ def parse_md(md_file):
                     model_name = fn[:-3]
                     fps = els[fps_id] if els[fps_id] != '-' and els[
                         fps_id] != '' else -1
-                    mem = els[mem_id] if els[mem_id] != '-' and els[
-                        mem_id] != '' else -1
+                    mem = els[mem_id].split(
+                        '\\'
+                    )[0] if els[mem_id] != '-' and els[mem_id] != '' else -1
                     crop_size = els[crop_size_id].split('x')
                     assert len(crop_size) == 2
                     model = {
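The patched branch strips the escaped footnote marker that some README memory cells carry (e.g. the `26.16\*` entry in the ConvNeXt table below). A minimal illustration of the new parsing, with `cells` as hypothetical input:

```python
# Minimal illustration of the patched mem parsing in md2yml.py: cells like
# '26.16\\*' (memory measured with cudnn_benchmark=True) keep only the number.
cells = ['4.23', '26.16\\*', '-', '']  # hypothetical table cells

def parse_mem(cell):
    # mirrors: els[mem_id].split('\\')[0] if the cell is non-empty and not '-'
    return cell.split('\\')[0] if cell != '-' and cell != '' else -1

print([parse_mem(c) for c in cells])  # ['4.23', '26.16', -1, -1]
```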
1 change: 1 addition & 0 deletions README.md
@@ -120,6 +120,7 @@ Supported methods:
 - [x] [DPT (ArXiv'2021)](configs/dpt)
 - [x] [Segmenter (ICCV'2021)](configs/segmenter)
 - [x] [SegFormer (NeurIPS'2021)](configs/segformer)
+- [x] [ConvNeXt (ArXiv'2022)](configs/convnext)
 
 Supported datasets:
 
1 change: 1 addition & 0 deletions README_zh-CN.md
@@ -119,6 +119,7 @@ MMSegmentation is an open source semantic segmentation toolbox based on PyTorch. It is a part of the O
 - [x] [DPT (ArXiv'2021)](configs/dpt)
 - [x] [Segmenter (ICCV'2021)](configs/segmenter)
 - [x] [SegFormer (NeurIPS'2021)](configs/segformer)
+- [x] [ConvNeXt (ArXiv'2022)](configs/convnext)
 
 Supported datasets:
 
54 changes: 54 additions & 0 deletions configs/_base_/datasets/ade20k_640x640.py
@@ -0,0 +1,54 @@
# dataset settings
dataset_type = 'ADE20KDataset'
data_root = 'data/ade/ADEChallengeData2016'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (640, 640)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2560, 640), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2560, 640),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/training',
        ann_dir='annotations/training',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=test_pipeline))
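As a usage sketch (standard mmseg v0.x / mmcv 1.x APIs; assumes ADE20K has been unpacked under `data/ade/ADEChallengeData2016` and the script runs from the repository root), this file materializes into a dataset like so:

```python
# Sketch: build the ADE20K training set from this dataset config.
from mmcv import Config
from mmseg.datasets import build_dataset

cfg = Config.fromfile('configs/_base_/datasets/ade20k_640x640.py')
train_set = build_dataset(cfg.data.train)
sample = train_set[0]  # runs train_pipeline; yields 'img' and 'gt_semantic_seg'
print(len(train_set), sample['img'].data.shape)  # crops are 3 x 640 x 640
```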
44 changes: 44 additions & 0 deletions configs/_base_/models/upernet_convnext.py
@@ -0,0 +1,44 @@
norm_cfg = dict(type='SyncBN', requires_grad=True)
custom_imports = dict(imports='mmcls.models', allow_failed_imports=False)
checkpoint_file = './pretrain/convnext-base_3rdparty_32xb128-noema_in1k_20220301-2a0ee547.pth'  # noqa
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='mmcls.ConvNeXt',
        arch='base',
        out_indices=[0, 1, 2, 3],
        drop_path_rate=0.4,
        layer_scale_init_value=1.0,
        gap_before_final_norm=False,
        init_cfg=dict(
            type='Pretrained', checkpoint=checkpoint_file,
            prefix='backbone.')),
    decode_head=dict(
        type='UPerHead',
        in_channels=[128, 256, 512, 1024],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=384,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
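A hedged sketch of how this base file is consumed (mmcv 1.x `Config.fromfile` imports `custom_imports` automatically, which is what registers `mmcls.ConvNeXt`; the relative checkpoint path only resolves if the pretrained weight has been placed under `./pretrain/`):

```python
# Sketch: instantiate the EncoderDecoder defined above (mmseg v0.x APIs).
from mmcv import Config
from mmseg.models import build_segmentor

cfg = Config.fromfile('configs/_base_/models/upernet_convnext.py')
model = build_segmentor(cfg.model)
model.init_weights()  # pulls the mmcls checkpoint through init_cfg, if present
```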
71 changes: 71 additions & 0 deletions configs/convnext/README.md
@@ -0,0 +1,71 @@
# ConvNeXt

[A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545)

## Introduction

<!-- [BACKBONE] -->

<a href="https://github.com/facebookresearch/ConvNeXt">Official Repo</a>

<a href="https://github.com/open-mmlab/mmclassification/blob/v0.20.1/mmcls/models/backbones/convnext.py#L133">Code Snippet</a>

## Abstract

<!-- [ABSTRACT] -->

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions. In this work, we reexamine the design spaces and test the limits of what a pure ConvNet can achieve. We gradually "modernize" a standard ResNet toward the design of a vision Transformer, and discover several key components that contribute to the performance difference along the way. The outcome of this exploration is a family of pure ConvNet models dubbed ConvNeXt. Constructed entirely from standard ConvNet modules, ConvNeXts compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while maintaining the simplicity and efficiency of standard ConvNets.

<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/8370623/148624004-e9581042-ea4d-4e10-b3bd-42c92b02053b.png" width="90%"/>
</div>

```bibtex
@article{liu2022convnet,
title={A ConvNet for the 2020s},
author={Liu, Zhuang and Mao, Hanzi and Wu, Chao-Yuan and Feichtenhofer, Christoph and Darrell, Trevor and Xie, Saining},
journal={arXiv preprint arXiv:2201.03545},
year={2022}
}
```

### Usage

- To use this backbone, [MMClassification](https://github.com/open-mmlab/mmclassification) needs to be installed first, as it provides the ConvNeXt implementation along with abundant backbones for downstream tasks.

```shell
pip install 'mmcls>=0.20.1'
```
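
With MMClassification installed, the `custom_imports` entry in `configs/_base_/models/upernet_convnext.py` registers the `mmcls` backbones for mmseg to use. A minimal sanity check, assuming the mmcls v0.20.x `ConvNeXt` API shown in the base config:

```python
# Sketch: build the ConvNeXt-B backbone directly from mmcls and check that
# it returns the four pyramid feature maps UPerHead expects.
import torch
from mmcls.models import build_backbone

backbone = build_backbone(
    dict(
        type='ConvNeXt',  # 'mmcls.ConvNeXt' when built through mmseg's registry
        arch='base',
        out_indices=[0, 1, 2, 3],
        drop_path_rate=0.4,
        layer_scale_init_value=1.0,
        gap_before_final_norm=False))  # keep spatial maps for segmentation

feats = backbone(torch.randn(1, 3, 512, 512))
# Strides 4/8/16/32 with channels 128/256/512/1024 for the 'base' arch.
print([tuple(f.shape) for f in feats])
```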

### Pre-trained Models

The pre-trained models on ImageNet-1k or ImageNet-21k are used for fine-tuning on downstream tasks.

| Model | Training Data | Params(M) | Flops(G) | Download |
|:--------------:|:-------------:|:---------:|:--------:|:--------:|
| ConvNeXt-T\* | ImageNet-1k | 28.59 | 4.46 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-tiny_3rdparty_32xb128-noema_in1k_20220301-795e9634.pth) |
| ConvNeXt-S\* | ImageNet-1k | 50.22 | 8.69 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-small_3rdparty_32xb128-noema_in1k_20220301-303e75e3.pth) |
| ConvNeXt-B\* | ImageNet-1k | 88.59 | 15.36 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-base_3rdparty_32xb128-noema_in1k_20220301-2a0ee547.pth) |
| ConvNeXt-B\* | ImageNet-21k | 88.59 | 15.36 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-base_3rdparty_in21k_20220301-262fd037.pth) |
| ConvNeXt-L\* | ImageNet-21k | 197.77 | 34.37 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-large_3rdparty_in21k_20220301-e6e0ea0a.pth) |
| ConvNeXt-XL\* | ImageNet-21k | 350.20 | 60.93 | [model](https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-xlarge_3rdparty_in21k_20220301-08aa5ddc.pth) |

*Models with \* are converted from the [official repo](https://github.com/facebookresearch/ConvNeXt/tree/main/semantic_segmentation#results-and-fine-tuned-models).*
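
The base model config in this PR points `checkpoint_file` at `./pretrain/`; a hedged sketch for fetching one of the weights above into that location (the URL is the ConvNeXt-B ImageNet-1k entry from the table):

```python
# Sketch: download the ImageNet-1k ConvNeXt-B weights to the path that
# configs/_base_/models/upernet_convnext.py expects.
import os
import torch

url = ('https://download.openmmlab.com/mmclassification/v0/convnext/downstream/'
       'convnext-base_3rdparty_32xb128-noema_in1k_20220301-2a0ee547.pth')
os.makedirs('pretrain', exist_ok=True)
torch.hub.download_url_to_file(url, os.path.join('pretrain', os.path.basename(url)))
```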

## Results and models

### ADE20K

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ---------- | ------- | -------- | --- | --- | -------------- | ----- |
| UperNet | ConvNeXt-T | 512x512 | 160000 | 4.23 | 19.90 | 46.11 | 46.62 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k/upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k/upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553.log.json) |
| UperNet | ConvNeXt-S | 512x512 | 160000 | 5.16 | 15.18 | 48.56 | 49.02 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_small_fp16_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_small_fp16_512x512_160k_ade20k/upernet_convnext_small_fp16_512x512_160k_ade20k_20220227_131208-1b1e394f.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_small_fp16_512x512_160k_ade20k/upernet_convnext_small_fp16_512x512_160k_ade20k_20220227_131208.log.json) |
| UperNet | ConvNeXt-B | 512x512 | 160000 | 6.33 | 14.41 | 48.71 | 49.54 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k/upernet_convnext_base_fp16_512x512_160k_ade20k_20220227_181227-02a24fc6.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k/upernet_convnext_base_fp16_512x512_160k_ade20k_20220227_181227.log.json) |
| UperNet | ConvNeXt-B | 640x640 | 160000 | 8.53 | 10.88 | 52.13 | 52.66 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_base_fp16_640x640_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_640x640_160k_ade20k/upernet_convnext_base_fp16_640x640_160k_ade20k_20220227_182859-9280e39b.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_640x640_160k_ade20k/upernet_convnext_base_fp16_640x640_160k_ade20k_20220227_182859.log.json) |
| UperNet | ConvNeXt-L | 640x640 | 160000 | 12.08 | 7.69 | 53.16 | 53.38 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_large_fp16_640x640_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_large_fp16_640x640_160k_ade20k/upernet_convnext_large_fp16_640x640_160k_ade20k_20220226_040532-e57aa54d.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_large_fp16_640x640_160k_ade20k/upernet_convnext_large_fp16_640x640_160k_ade20k_20220226_040532.log.json) |
| UperNet | ConvNeXt-XL | 640x640 | 160000 | 26.16\* | 6.33 | 53.58 | 54.11 | [config](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext/upernet_convnext_xlarge_fp16_640x640_160k_ade20k.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_xlarge_fp16_640x640_160k_ade20k/upernet_convnext_xlarge_fp16_640x640_160k_ade20k_20220226_080344-95fc38c2.pth) &#124; [log](https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_xlarge_fp16_640x640_160k_ade20k/upernet_convnext_xlarge_fp16_640x640_160k_ade20k_20220226_080344.log.json) |

Note:

- `Mem (GB)` with \* is collected when `cudnn_benchmark=True`.
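
For inference with the released checkpoints, a usage sketch with the standard mmseg v0.x high-level API (the checkpoint URL is the ConvNeXt-T row above; `demo.png` is a placeholder image path):

```python
# Sketch: run single-image inference with a released ConvNeXt checkpoint.
from mmseg.apis import inference_segmentor, init_segmentor

config = 'configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py'
checkpoint = ('https://download.openmmlab.com/mmsegmentation/v0.5/convnext/'
              'upernet_convnext_tiny_fp16_512x512_160k_ade20k/'
              'upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth')
model = init_segmentor(config, checkpoint, device='cuda:0')
result = inference_segmentor(model, 'demo.png')  # list with one HxW array of class ids
```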
149 changes: 149 additions & 0 deletions configs/convnext/convnext.yml
@@ -0,0 +1,149 @@
Collections:
- Name: convnext
  Metadata:
    Training Data:
    - ADE20K
  Paper:
    URL: https://arxiv.org/abs/2201.03545
    Title: A ConvNet for the 2020s
  README: configs/convnext/README.md
  Code:
    URL: https://github.com/open-mmlab/mmclassification/blob/v0.20.1/mmcls/models/backbones/convnext.py#L133
    Version: v0.20.1
  Converted From:
    Code: https://github.com/facebookresearch/ConvNeXt
Models:
- Name: upernet_convnext_tiny_fp16_512x512_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-T
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 50.25
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (512,512)
    Training Memory (GB): 4.23
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 46.11
      mIoU(ms+flip): 46.62
  Config: configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k/upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth
- Name: upernet_convnext_small_fp16_512x512_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-S
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 65.88
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (512,512)
    Training Memory (GB): 5.16
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 48.56
      mIoU(ms+flip): 49.02
  Config: configs/convnext/upernet_convnext_small_fp16_512x512_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_small_fp16_512x512_160k_ade20k/upernet_convnext_small_fp16_512x512_160k_ade20k_20220227_131208-1b1e394f.pth
- Name: upernet_convnext_base_fp16_512x512_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-B
    crop size: (512,512)
    lr schd: 160000
    inference time (ms/im):
    - value: 69.4
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (512,512)
    Training Memory (GB): 6.33
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 48.71
      mIoU(ms+flip): 49.54
  Config: configs/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k/upernet_convnext_base_fp16_512x512_160k_ade20k_20220227_181227-02a24fc6.pth
- Name: upernet_convnext_base_fp16_640x640_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-B
    crop size: (640,640)
    lr schd: 160000
    inference time (ms/im):
    - value: 91.91
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (640,640)
    Training Memory (GB): 8.53
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 52.13
      mIoU(ms+flip): 52.66
  Config: configs/convnext/upernet_convnext_base_fp16_640x640_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_base_fp16_640x640_160k_ade20k/upernet_convnext_base_fp16_640x640_160k_ade20k_20220227_182859-9280e39b.pth
- Name: upernet_convnext_large_fp16_640x640_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-L
    crop size: (640,640)
    lr schd: 160000
    inference time (ms/im):
    - value: 130.04
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (640,640)
    Training Memory (GB): 12.08
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 53.16
      mIoU(ms+flip): 53.38
  Config: configs/convnext/upernet_convnext_large_fp16_640x640_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_large_fp16_640x640_160k_ade20k/upernet_convnext_large_fp16_640x640_160k_ade20k_20220226_040532-e57aa54d.pth
- Name: upernet_convnext_xlarge_fp16_640x640_160k_ade20k
  In Collection: convnext
  Metadata:
    backbone: ConvNeXt-XL
    crop size: (640,640)
    lr schd: 160000
    inference time (ms/im):
    - value: 157.98
      hardware: V100
      backend: PyTorch
      batch size: 1
      mode: FP16
      resolution: (640,640)
    Training Memory (GB): 26.16
  Results:
  - Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 53.58
      mIoU(ms+flip): 54.11
  Config: configs/convnext/upernet_convnext_xlarge_fp16_640x640_160k_ade20k.py
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/convnext/upernet_convnext_xlarge_fp16_640x640_160k_ade20k/upernet_convnext_xlarge_fp16_640x640_160k_ade20k_20220226_080344-95fc38c2.pth
40 changes: 40 additions & 0 deletions configs/convnext/upernet_convnext_base_fp16_512x512_160k_ade20k.py
@@ -0,0 +1,40 @@
_base_ = [
    '../_base_/models/upernet_convnext.py', '../_base_/datasets/ade20k.py',
    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
]
crop_size = (512, 512)
model = dict(
    decode_head=dict(in_channels=[128, 256, 512, 1024], num_classes=150),
    auxiliary_head=dict(in_channels=512, num_classes=150),
    test_cfg=dict(mode='slide', crop_size=crop_size, stride=(341, 341)),
)

optimizer = dict(
    constructor='LearningRateDecayOptimizerConstructor',
    _delete_=True,
    type='AdamW',
    lr=0.0001,
    betas=(0.9, 0.999),
    weight_decay=0.05,
    paramwise_cfg={
        'decay_rate': 0.9,
        'decay_type': 'stage_wise',
        'num_layers': 12
    })

lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-6,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)

# By default, models are trained on 8 GPUs with 2 images per GPU
data = dict(samples_per_gpu=2)
# fp16 settings
optimizer_config = dict(type='Fp16OptimizerHook', loss_scale='dynamic')
# fp16 placeholder
fp16 = dict()
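
The `LearningRateDecayOptimizerConstructor` used above assigns each parameter group a geometrically decayed learning-rate multiplier, so earlier backbone stages move more slowly than the heads. A self-contained illustration of that idea under the `decay_rate=0.9, stage_wise` setting; the exact stage-to-group mapping lives in the constructor added by this PR, and the exponent convention here is an assumption for illustration:

```python
# Illustrative only: stage-wise LR decay gives group i the multiplier
# decay_rate ** (max_id - i), so the stem gets the smallest LR and the
# decode/auxiliary heads keep the base LR (multiplier 1.0).
base_lr = 0.0001
decay_rate = 0.9
max_id = 5  # hypothetical ids: stem=0, four ConvNeXt stages=1..4, heads=5

for group_id, name in enumerate(
        ['stem', 'stage1', 'stage2', 'stage3', 'stage4', 'heads']):
    scale = decay_rate ** (max_id - group_id)
    print(f'{name:7s} lr_scale={scale:.3f} lr={base_lr * scale:.2e}')
```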