Thank you for your excellent work on open-vocabulary segmentation; I'm very interested in this project. However, I am confused about the inductive open-vocabulary segmentation setting. In "A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future", "inductive" is defined as "training images do not contain any unseen objects even if they are unannotated", which means that both the pixels and the text of unseen objects are forbidden during training. In this work, however, "inductive" means "the names of unseen classes in inference are unavailable while training", and I can't find the code that filters out the unseen pixels. To put my question as an example: if "Human" is a seen class and "Dog" is an unseen class, can an image containing both a man and a dog be used for training?
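To make the two readings concrete, here is a minimal sketch of how I understand them. It is not taken from this repository: the function names, the class ids, and the ignore value of 255 are all my own assumptions for illustration. The first function implements the survey's stricter reading (drop any image with unseen-class pixels); the second implements the looser reading (keep the image but hide the unseen-class annotations).

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" label value; not confirmed by the repo


def keep_image_strict(label_map, unseen_ids):
    """Strict reading: keep an image only if it has no unseen-class pixels."""
    return not np.isin(label_map, list(unseen_ids)).any()


def mask_unseen_pixels(label_map, unseen_ids, ignore_index=IGNORE_INDEX):
    """Looser reading: keep the image, but relabel unseen-class pixels as ignored."""
    masked = label_map.copy()
    masked[np.isin(label_map, list(unseen_ids))] = ignore_index
    return masked


# Hypothetical ids for the example above: 0 = "Human" (seen), 1 = "Dog" (unseen).
label_map = np.array([[0, 0, 1],
                      [0, 1, 1]])
unseen_ids = {1}

print(keep_image_strict(label_map, unseen_ids))   # image would be dropped
print(mask_unseen_pixels(label_map, unseen_ids))  # dog pixels become 255
```

Under the strict reading, the man-and-dog image is removed from the training set entirely; under the looser one, it stays and only the dog pixels are excluded from supervision.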
Thanks in advance and hope for your reply!
cuttle-fish-my changed the title from "Specification request about the definintion of inductive settings" to "Specification request about the definintion of inductive setting" on Sep 20, 2023