Inquiry About Training and Inference in RCG Model from Your Recent Publication #17

Open
Yuan1z0825 opened this issue Dec 19, 2023 · 2 comments

Comments

@Yuan1z0825

I recently read your fascinating paper titled "Self-conditioned Image Generation via Generating Representations" and have a question regarding the training and inference processes of the RCG model, particularly about the image masking strategy.
In the paper, it's mentioned that during training, the pixel generator is trained with partially masked images. However, during inference, images are fully masked. I am curious about how this difference in masking (partial during training and full during inference) affects the model's performance and its ability to reconstruct images.
Your insights into this aspect of the RCG model would be greatly appreciated, as it would deepen my understanding of your novel approach.

@LTH14
Owner

LTH14 commented Dec 19, 2023

Thanks for your interest! During training, the masking ratio is randomly selected from the range 50% to 100%, so training covers both the fully masked and the partially masked scenarios. During inference we use a multi-step parallel decoding strategy: generation starts from a 100% masked image, and the masked tokens are filled in gradually over several steps until none remain. You might refer to the MaskGIT and MAGE papers for more detailed illustrations of the parallel decoding strategy.
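
For readers following the thread, here is a minimal, self-contained sketch of the two ideas in the reply above. It is not the RCG code. The uniform sampling of the masking ratio, the cosine unmasking schedule (borrowed from the MaskGIT description), the `MASK` sentinel, and the `toy_predictor` stand-in for the pixel generator are all assumptions made for illustration.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a masked token position


def sample_training_mask(num_tokens, rng, lo=0.5, hi=1.0):
    """Draw a masking ratio in [lo, hi] and mask that share of positions.

    Uniform sampling is an assumption; the reply only says the ratio is
    randomly selected from 50% to 100%.
    """
    ratio = rng.uniform(lo, hi)
    num_masked = math.ceil(ratio * num_tokens)
    masked = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked for i in range(num_tokens)]


def toy_predictor(tokens, rng, vocab_size=16):
    """Hypothetical stand-in for the generator: one (token, confidence) pair
    per position. A real model would condition on the unmasked tokens."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def parallel_decode(num_tokens, num_steps, rng, predictor=toy_predictor):
    """Start from a fully masked sequence and fill it in over several steps."""
    tokens = [MASK] * num_tokens
    masked_counts = [num_tokens]  # masked tokens remaining after each step
    for step in range(1, num_steps + 1):
        masked_positions = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked_positions:
            break
        # Cosine schedule (assumed): how many tokens stay masked after this step.
        keep_masked = math.floor(
            num_tokens * math.cos(math.pi / 2 * step / num_steps)
        )
        if step == num_steps:
            keep_masked = 0  # the last step must reveal everything
        # Always reveal at least one token per step.
        keep_masked = min(keep_masked, len(masked_positions) - 1)

        preds = predictor(tokens, rng)
        # Reveal the most confident predictions; the rest stay masked.
        by_confidence = sorted(
            masked_positions, key=lambda i: preds[i][1], reverse=True
        )
        for i in by_confidence[: len(masked_positions) - keep_masked]:
            tokens[i] = preds[i][0]
        masked_counts.append(keep_masked)
    return tokens, masked_counts


if __name__ == "__main__":
    rng = random.Random(0)
    mask = sample_training_mask(256, rng)
    print("masked during training:", sum(mask), "of", len(mask))
    tokens, counts = parallel_decode(256, 8, rng)
    print("masked tokens remaining per step:", counts)
```

Because the sampled training ratio can reach 100%, the fully masked starting point of `parallel_decode` falls inside the range seen in training, and every later step sees a partially masked input, which the 50% to 100% range also covers for the early steps. MaskGIT additionally samples tokens and perturbs confidences with noise; this sketch omits that for brevity.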

@Yuan1z0825
Author

Thank you for your thoughtful answers to my questions. I will carefully look into the work on MaskGIT and MAGE.

2 participants