Inquiry about the Fig-6 #4

Open
yuhaoliu7456 opened this issue Dec 8, 2023 · 9 comments

@yuhaoliu7456

Does anyone know how to generate the visual results in Figure 6? I see that SSL representations are extracted from the image samples, but the authors don't seem to describe how these features are combined with the randomly generated noise in the RDM.

@LTH14
Owner

LTH14 commented Dec 8, 2023

Thanks for your interest. For Figure 6 we don't add noise to the extracted representation -- the SSL representation extracted from the pre-trained encoder is directly fed into the pixel generator to generate the images. In the "GT Representation Reconstruction" section of this Jupyter notebook, we provide code for this functionality. If you are interested in how to add random noise during training and unconditional generation, you can check the DDPM and DDIM code here.
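
A minimal sketch of the flow described above, where `ssl_encoder` and `pixel_generator` are placeholder handles for the pre-trained MoCo v3 encoder and the MAGE pixel generator (the actual loading code and call signatures are in the linked notebook):

```python
import torch

def reconstruct_from_gt_representation(image, ssl_encoder, pixel_generator):
    """Encode a GT image with the frozen SSL encoder, then condition the
    pixel generator on that representation -- no RDM sampling, no added noise.

    `ssl_encoder` and `pixel_generator` are placeholders for the pre-trained
    MoCo v3 encoder and the MAGE pixel generator; see the notebook for the
    real loading code and call signatures.
    """
    with torch.no_grad():
        rep = ssl_encoder(image)      # SSL representation of the GT image
        return pixel_generator(rep)   # generate pixels conditioned on it
```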

@yuhaoliu7456
Author

Thanks for your reply.

@mapengsen

I'm sorry, I don't quite understand what you mean. Did you input the GT image, use SSL(GT image) as the condition for MAGE, and then obtain the different results by repeatedly changing the random seed?
Thanks a lot.
@LTH14

@mapengsen

Can you tell me a little bit about how Figure 7 is done?
Since I see that RCG has only one condition input, how can I interpolate between the two images? Thank you very much for your reply!

@LTH14
Owner

LTH14 commented Dec 23, 2023

@mapengsen Thanks for your interest. For Figure 6, we extract the representation from the GT image and generate image pixels conditioned on this representation. You can refer to the provided visualization notebook for more implementation details. For Figure 7, please refer to issue #20.
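
One way such an interpolation could be realized is to blend the SSL representations of the two images and decode each intermediate point, so the generator still receives only a single condition per step. The sketch below assumes simple linear blending and reuses the placeholder handles from the sketch above; the repository's actual scheme is discussed in issue #20 and may differ:

```python
import torch

def interpolate_between_images(img_a, img_b, ssl_encoder, pixel_generator, steps=8):
    """Sketch of one possible interpolation: linearly blend the SSL
    representations of two images and generate an image from each
    intermediate representation. This is an assumption about how such a
    figure could be produced, not the repository's confirmed implementation
    (see issue #20 for that).
    """
    with torch.no_grad():
        rep_a = ssl_encoder(img_a)
        rep_b = ssl_encoder(img_b)
        frames = []
        for alpha in torch.linspace(0.0, 1.0, steps):
            rep = (1.0 - alpha) * rep_a + alpha * rep_b   # single condition per step
            frames.append(pixel_generator(rep))
        return frames
```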

@mapengsen

Thank you very much! I understand now.

@whisper-11

Thank you for your exceptional work! Could you please clarify if the Representation Reconstruction function depicted in Figure 6 also applies to images that are not part of the ImageNet dataset?
Thank you very much! @LTH14

@LTH14
Owner

LTH14 commented Dec 27, 2023

Thanks for your interest! The provided MoCo v3 and MAGE checkpoints are both trained on ImageNet, so they should give reasonable results on natural images that are not contained in ImageNet. However, if an image is too far from the ImageNet distribution, reconstruction quality can degrade.
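
For images outside ImageNet, the main practical step is to preprocess them the way the checkpoints expect. The sketch below uses the standard torchvision ImageNet transforms; the exact resolution and normalization are assumptions and should be checked against the repository's own eval transforms:

```python
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: 256x256 center crop plus ImageNet mean/std
# normalization. Verify against the repository's own transforms before use.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "my_photo.jpg" is an example path; the resulting tensor can then be fed
# to the SSL encoder the same way an ImageNet sample would be.
img = preprocess(Image.open("my_photo.jpg").convert("RGB")).unsqueeze(0)
```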

@whisper-11

Thanks for your reply!
