Add HRViT #1730
Comments
Hi, thanks for opening this issue. It is really great work, and we have noticed it, but due to a shortage of developers we do not have a clear schedule for supporting it. If you are willing to add support yourself, a PR is always welcome, and we will review it as soon as possible, because community PRs are a high priority for our repo. Best,
Sure, I will try in the coming days. I will probably need some help along the way, but hopefully we will manage to do it :). Best,
OK, feel free to contact us if you run into any problems with your PR. Best,
Hi, I have a first version of the implementation, but I am unsure whether it works, how to test it properly, and what kind of tests should be written. Should I create a pull request and continue the discussion there? Do I need to reference this issue in that pull request somehow? Thanks for all the help, I am looking forward to your feedback. Best regards,
Hi @lorinczszabolcs, sorry for the late reply.
Hi @xiexinch! OK, I will create a pull request soon. Unfortunately, the authors have not provided pretrained weights yet. Would you have the resources to train the networks from scratch and evaluate them that way, and perhaps also provide pretrained weights afterwards?
Describe the feature
Add the model described in "Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation", a new vision transformer backbone design for semantic segmentation. It has a multi-branch high-resolution (HR) architecture with enhanced multi-scale representability, and the paper reports that it surpasses the state-of-the-art MiT and CSWin backbones with an average improvement of +1.78 mIoU, 28% fewer parameters, and 21% fewer FLOPs on ADE20K and Cityscapes.
Motivation
HRViT is a recent model that combines the features of HRNet and ViT, achieving good performance while reducing parameters and FLOPs.
Related resources
Official code can be found here.
Additional context
Their implementation already uses mmseg and mmcv, so it should be quite straightforward to add support for it.
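To illustrate why this should be straightforward: mmseg-style codebases register backbones by name and then build them from config dicts. Below is a minimal, self-contained sketch of that pattern. It does not import mmseg or mmcv; the `Registry` class is a simplified stand-in written for this example, and the `HRViT` placeholder class and its `num_branches` and `drop_path_rate` arguments are hypothetical, not taken from the official code.

```python
class Registry:
    """Simplified stand-in for a name -> class registry (not the mmcv class)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        # Used as a class decorator: stores the class under its own name.
        def decorator(cls):
            if cls.__name__ in self._modules:
                raise KeyError(f"{cls.__name__} is already registered in {self.name}")
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        # Copy the config so the caller's dict is not mutated by pop().
        args = dict(cfg)
        cls = self._modules[args.pop("type")]
        return cls(**args)


BACKBONES = Registry("backbone")


@BACKBONES.register_module()
class HRViT:
    """Placeholder backbone; a real one would be a torch.nn.Module."""

    def __init__(self, num_branches=4, drop_path_rate=0.0):
        # Hypothetical constructor arguments, for illustration only.
        self.num_branches = num_branches
        self.drop_path_rate = drop_path_rate


# A config file would then select the backbone by its registered name.
backbone_cfg = dict(type="HRViT", num_branches=4, drop_path_rate=0.1)
backbone = BACKBONES.build(backbone_cfg)
```

Under this assumption, adding support mostly means placing the backbone class where the registry can see it and writing config files that refer to it by name.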