This is a PyTorch implementation of Light-Ham combined with the VAN backbone. The code is based on MMSegmentation. More details can be found in Visual Attention Network.
```bibtex
@inproceedings{ham,
    title={Is Attention Better Than Matrix Decomposition?},
    author={Zhengyang Geng and Meng-Hao Guo and Hongxu Chen and Xia Li and Ke Wei and Zhouchen Lin},
    booktitle={International Conference on Learning Representations},
    year={2021},
}

@article{guo2022visual,
    title={Visual Attention Network},
    author={Guo, Meng-Hao and Lu, Cheng-Ze and Liu, Zheng-Ning and Cheng, Ming-Ming and Hu, Shi-Min},
    journal={arXiv preprint arXiv:2202.09741},
    year={2022}
}
```
Notes: Pre-trained models can be found in Visual Attention Network for Classification.
| Method | Backbone | Iters | mIoU | Params | FLOPs | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Light-Ham-D256 | VAN-Tiny | 160K | 40.9 | 4.2M | 6.5G | config | Google Drive |
| Light-Ham | VAN-Tiny | 160K | 42.3 | 4.9M | 11.3G | config | Google Drive |
| Light-Ham-D256 | VAN-Small | 160K | 45.2 | 13.8M | 15.8G | config | Google Drive |
| Light-Ham | VAN-Small | 160K | 45.7 | 14.7M | 21.4G | config | Google Drive |
| Light-Ham | VAN-Base | 160K | 49.6 | 27.4M | 34.4G | config | Google Drive |
| Light-Ham | VAN-Large | 160K | 51.0 | 45.6M | 55.0G | config | Google Drive |
| Light-Ham | VAN-Huge | 160K | 51.5 | 61.1M | 71.8G | config | Google Drive |
| - | - | - | - | - | - | - | - |
| Segformer | VAN-Base | 160K | 48.4 | 29.3M | 68.6G | - | - |
| Segformer | VAN-Large | 160K | 50.3 | 47.5M | 89.2G | - | - |
| - | - | - | - | - | - | - | - |
| HamNet | VAN-Tiny-OS8 | 160K | 41.5 | 11.9M | 50.8G | config | Google Drive |
| HamNet | VAN-Small-OS8 | 160K | 45.1 | 24.2M | 100.6G | config | Google Drive |
| HamNet | VAN-Base-OS8 | 160K | 48.7 | 36.9M | 153.6G | config | Google Drive |
| HamNet | VAN-Large-OS8 | 160K | 50.2 | 55.1M | 227.7G | config | Google Drive |
Notes: mIoU is reported with multi-scale validation, following Swin-Transformer. FLOPs are measured at an input size of 512 x 512 using torchprofile (recommended: it gives accurate, automatic MACs/FLOPs statistics).
Install MMSegmentation and download the ADE20K dataset by following the MMSegmentation guidelines.
We use 8 GPUs for training by default. Run:
```bash
bash dist_train.sh /path/to/config 8
```
To evaluate the model, run:
```bash
bash dist_test.sh /path/to/config /path/to/checkpoint_file 8 --out results.pkl --eval mIoU --aug-test
```
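If you want to inspect the dumped predictions afterwards, here is a minimal sketch. It assumes `results.pkl` follows the usual MMSegmentation convention of a pickled list holding one H x W array of predicted class indices per test image; adjust it if your MMSegmentation version stores a different structure.

```python
import pickle

import numpy as np

# Assumption: results.pkl is a pickled list with one H x W array of ADE20K class
# indices per test image (the usual output of MMSegmentation's --out flag).
with open("results.pkl", "rb") as f:
    results = pickle.load(f)

print(f"number of test images: {len(results)}")

first = np.asarray(results[0])
print(f"prediction shape: {first.shape}")
print(f"first few predicted class indices: {np.unique(first)[:10]}")
```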
Install torchprofile using:
```bash
pip install torchprofile
```
To calculate FLOPs for a model, run:
```bash
bash tools/flops.sh /path/to/checkpoint_file --shape 512 512
```
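The FLOPs in the table above are measured with torchprofile at 512 x 512 (see the note under the table). If you prefer counting MACs directly in Python, a minimal sketch looks like the following; the tiny `nn.Sequential` network is only a stand-in for illustration, not the actual Light-Ham segmentor.

```python
import torch
from torch import nn
from torchprofile import profile_macs

# Stand-in network used only to illustrate the call; replace it with the model you
# actually want to profile (e.g. a segmentor built from one of the configs above).
model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 150, kernel_size=1),  # 150 = number of ADE20K classes
).eval()

# Dummy input matching the 512 x 512 resolution used for the FLOPs in the table.
inputs = torch.randn(1, 3, 512, 512)

with torch.no_grad():
    macs = profile_macs(model, inputs)  # torchprofile reports multiply-accumulate ops

print(f"MACs: {macs / 1e9:.2f} G")  # 1 MAC is commonly counted as 2 FLOPs
```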