Bad generations from generate.py #120
Comments
The training dataset has to be drastically increased to get decent results; see the results of afiaka87 a few issues below: #86 (comment)
I see. Maybe I didn't describe the issue clearly. My problem is more about the mismatched results between images generated during training and those generated by `generate.py`.
@louis2889184 I have noticed similar behavior: reconstructions (even ones directly from the training set) tend to be quite blurry and abstract. I'm out of my depth on that, but I assume it is due to the transformer being forced to recreate the phrase verbatim while allowing the image output to deviate a bit. Hopefully someone more knowledgeable than I am can chime in on why this happens. As for the "mismatch" between generated images: a) it won't be deterministic. Run generation many times and see if you can find one or two that are similar.
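For reference, a minimal sketch of re-running generation several times for the same caption (assuming a trained `dalle` instance and a pre-tokenized `text` tensor; `generate_images` and `filter_thres` are from this repo's README):

```python
from torchvision.utils import save_image

# sampling is stochastic, so the same caption gives a different image each call;
# generate a handful and look for one or two that resemble the training sample
for i in range(8):
    images = dalle.generate_images(text, filter_thres = 0.9)
    save_image(images, f'sample_{i}.png', normalize = True)
```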
@afiaka87 Thanks for your help. I have three different texts corresponding to the same image, and I randomly pick one when loading the data. I'll try to make the image and text one-to-one, and see if the mismatch goes away.
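A minimal sketch of what that one-to-one switch could look like (the class and field names here are hypothetical, not from this repo):

```python
import random
from torch.utils.data import Dataset

class ImageCaptionsDataset(Dataset):
    # entries: list of (image_tensor, [caption_1, caption_2, caption_3])
    def __init__(self, entries, one_to_one=False):
        self.entries = entries
        self.one_to_one = one_to_one

    def __len__(self):
        return len(self.entries)

    def __getitem__(self, idx):
        image, captions = self.entries[idx]
        # random.choice gives a one-to-many image/text mapping;
        # fixing captions[0] makes the pairing strictly one-to-one
        caption = captions[0] if self.one_to_one else random.choice(captions)
        return image, caption
```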
@lucidrains I haven't yet reproduced this exact issue, but I have discussed it extensively on the discord and am at least having trouble understanding why @louis2889184 generations are in fact decidedly worse than the reconstructions from training even though they're providing the exact same text. Can you clarify on why that might happen? To be clear - I've only used the generate code a few times. |
@louis2889184 let me know here once you have tried re-running with lower depth/heads and reversible turned off.
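A sketch of what that configuration might look like, using the `DALLE` constructor from this repo's README (the specific numbers are illustrative, and `vae` is assumed to be an already-trained DiscreteVAE):

```python
from dalle_pytorch import DALLE

dalle = DALLE(
    dim = 512,
    vae = vae,               # trained DiscreteVAE
    num_text_tokens = 10000,
    text_seq_len = 256,
    depth = 4,               # lower depth
    heads = 8,               # fewer heads
    dim_head = 64,
    reversible = False       # reversibility turned off
)
```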
@afiaka87 It turns out it's because the way we save images is different in `train_dalle.py` and `generate.py`. In `train_dalle.py` the images are normalized before being saved, whereas in `generate.py` they are not, so `save_image` simply clamps the pixel values.
Hence, lots of pixels that are originally smaller than 0 become 0, which is why a big part of each image is black. Here are some results with the config I use, along with the outputs of `generate.py`. After I add `normalize=True`, it looks much better. BTW, the [...]
Edit: The results are quite weird, and I cannot even align the images with the text. We can see some beds and lamps in the images, so the quality is higher than without masks. It seems that the [...]
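To make the clamping concrete, a small sketch with torchvision's `save_image` (assuming `images` comes from `dalle.generate_images(...)` with values roughly in [-1, 1]):

```python
from torchvision.utils import save_image

# save_image maps values to [0, 255] and clamps, so without normalization
# every negative pixel becomes 0 and shows up as a black region
save_image(images, 'clamped.png')

# with normalize=True the tensor is first rescaled to [0, 1],
# matching how the training-time samples are saved
save_image(images, 'normalized.png', normalize=True)
```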
This is great work and an obvious opportunity to submit a pull request if you'd like. @louis2889184
A simple inline edit with the GitHub UI should make this a fairly easy fix.
The PR is made! Thanks, @afiaka87.
Thanks for the repo.
I trained DALLE on the Visual Genome dataset. During training, one of the generations is shown below: [...]
But when I generate an image with `generate.py`, the generated images are nonsense, even though I use text that also appeared in the training phase. The scripts I use are [...] and [...]. The results of both scripts are similar, and the generations are: [...]
I have checked that the model weights are loaded normally. Any thoughts on this issue?
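For anyone debugging the same thing, one way to rule out a silent partial load (assuming the checkpoint layout from this repo's `train_dalle.py`, where the state dict sits under a `weights` key; adjust if your save format differs):

```python
import torch

state = torch.load('dalle.pt', map_location='cpu')
weights = state['weights'] if 'weights' in state else state

# strict=True raises on any missing or unexpected keys,
# so a partially loaded model cannot fail silently
dalle.load_state_dict(weights, strict=True)
```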