Add HRViT #1730

lorinczszabolcs · 2022-07-01T11:10:07Z

Describe the feature

Add the model described in "Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation", a new vision transformer backbone design for semantic segmentation. It has a multi-branch high-resolution (HR) architecture with enhanced multi-scale representability, and it surpasses the state-of-the-art MiT and CSWin backbones with an average improvement of +1.78 mIoU, 28% fewer parameters, and 21% fewer FLOPs on ADE20K and Cityscapes.

Motivation

HRViT is a recent model that combines the features of HRNet and ViT, achieving good performance while reducing parameters and FLOPs.

Related resources

Official code can be found here.

Additional context
Their implementation already uses mmseg and mmcv, so it should be quite straightforward to add support for it.

MengzhangLI · 2022-07-01T12:33:01Z

Hi, thanks for your issue. It is really great work. We have noticed it, but due to a lack of developers we do not have a clear schedule for supporting it.

If you are willing to support it, a PR is always welcome, and we would review it as soon as possible because PRs from the community are high priority for our repo.

Best,

lorinczszabolcs · 2022-07-01T13:18:49Z

Sure, I will try in the following days. I will probably need some help along the way, but hopefully we will manage to do it :).

Best,
Szabi

MengzhangLI · 2022-07-01T17:28:54Z

OK, feel free to contact us if you run into any problems with your PR.

Best,

lorinczszabolcs · 2022-07-02T15:37:17Z

Hi,

I have a first version of the implementation, but I am unsure whether it works, how to test it properly, or what kind of tests should be written. Should I create a pull request and continue the discussion there? Do I need to refer to this issue in that pull request somehow? Thanks for all the help, I am looking forward to your feedback.

Best regards,
Szabi

xiexinch · 2022-07-04T07:45:13Z

Hi @lorinczszabolcs, sorry for the late reply.
When you have finished your draft version, you can create a pull request to this repository and link this issue; we'll review it ASAP.
You can test your code by loading the weights provided by the authors. In general, the keys of the weights need to be converted first. Then you can run an evaluation; if the results match those reported in the paper, your code is correct.
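
The key conversion step mentioned above could look roughly like the sketch below. It is only an illustration: the prefixes in `PREFIX_MAP` and the helper names `convert_keys` and `report_mismatch` are hypothetical, since the real key names depend on the official HRViT checkpoint and on how the backbone is written in the PR.

```python
from collections import OrderedDict

# Hypothetical prefix mapping (assumption, not taken from the HRViT code):
# the real names depend on the official checkpoint and the mmseg backbone.
PREFIX_MAP = {
    "module.": "",             # strip a wrapper prefix, if present
    "features.": "backbone.",  # hypothetical rename to the new module name
}


def convert_keys(state_dict, prefix_map=PREFIX_MAP):
    """Return a new state dict with key prefixes rewritten.

    Rules are applied in the order given by prefix_map, so a key can be
    rewritten by more than one rule. Values are passed through unchanged.
    """
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if new_key.startswith(old):
                new_key = new + new_key[len(old):]
        converted[new_key] = value
    return converted


def report_mismatch(converted, expected_keys):
    """List keys the model expects but lacks, and keys it does not expect."""
    missing = sorted(set(expected_keys) - set(converted))
    unexpected = sorted(set(converted) - set(expected_keys))
    return missing, unexpected
```

In practice the input would come from `torch.load(...)` and `expected_keys` from `model.state_dict().keys()`. Empty `missing` and `unexpected` lists suggest the converted weights line up with the model, after which an evaluation run can be compared with the numbers in the paper.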

lorinczszabolcs · 2022-07-04T07:48:44Z

Hi @xiexinch !

Ok, I will create a pull request soon.

Unfortunately the authors didn't provide pretrained weights for now. Would you have the resources to train the networks from scratch and evaluate them that way, maybe also providing pretrained weights after that?

mm-assistant bot assigned xiexinch Jul 1, 2022

xiexinch added the labels "Planned feature", "Community help wanted", and "Algorithm" (Improvement or addition of new algorithm model) Jul 4, 2022

lorinczszabolcs linked a pull request Jul 4, 2022 that will close this issue

[Feature] Add HRViT (CVPR'2022) #1736

Open
