Discussing PWC Section #1
Mark (@mhamilton723):
Hello, congrats on the release of your fantastic work. I love the fact that you can use language to prompt the segmentation, and we appreciate you citing and comparing against STEGO!

I wanted to quickly reach out about how you want to collectively manage the Papers with Code section on unsupervised segmentation. Because CLIP is trained with image-language pairs and you use this to generate the attention maps, I think this might fall under weakly supervised methods, such as either of these:

https://paperswithcode.com/task/weakly-supervised-object-localization
https://paperswithcode.com/task/weakly-supervised-semantic-segmentation

Let me know what you think about this proposal; I'm happy to discuss it further. Congrats again on making your work public!

Best,
Mark

Comments

@hq-deng:
I have the same confusion as you. It seems that this work avoids the confusion with the unsupervised learning setup because it is framed as zero-shot adaptation, yet the experiments show comparisons with unsupervised segmentation methods.

Gyungin (@noelshin):
Hi both, thank you for your input. First, to clarify a couple of points raised by @hq-deng's comment:

@mhamilton723, we did not consider our work to be weakly supervised because we are not training for segmentation on images with class labels (in the same way that PiCIE does not refer to itself as weakly supervised). On the other hand, we recognise that there is a spectrum of supervision from zero supervision up to fully supervised, and that by using CLIP we are not at the "zero" end of that spectrum. As such, a precise name would be useful to avoid confusion. In response to your question, we thought that perhaps "Unsupervised Semantic Segmentation with Language-Image Pre-training" could be a better fit for the task setting considered by ReCo (and the DenseCLIP baseline we consider in our paper). If this name seems appropriate to you both (feedback is highly welcome - it would be good for us to get the right name), we will create a branch for the task on Papers with Code.

Gyungin

@hq-deng:
Hello @noelshin, thanks for your comprehensive answer. That is a novel and interesting concept of segmentation. Although this approach is difficult to define, you are bravely exploring it. Congratulations on your groundbreaking work.

Mark (@mhamilton723):
Hey @noelshin, thanks for the detailed reply. I think it might be a good idea to split this leaderboard out into one that uses supervised pre-training, as you suggested. In some sense, text labels provide even more supervision than classes or tags, which is why I originally suggested weakly supervised methods. Thanks for being flexible and understanding on this topic :)