Thanks for the excellent work. I am a bit confused about Fig 3(b). In the figure, both the original image and the representation are sent to the pixel generator. Is it OK to exclude the original image and feed in only the representation?
Thanks for your interest. Please note that Fig 3(b) illustrates the pixel generator's training phase. Most current generative frameworks, such as MAGE and LDM, either partially mask the original image or add noise to it, and train the model to reconstruct the original image. In Fig 3(b) we take MAGE as an example: it first tokenizes the image into image tokens and then masks some of those tokens. The original image is therefore needed as input during training. It is not needed during generation, however: generation starts from a 100% masked image (MAGE) or from Gaussian noise (LDM/ADM), conditioned only on the representation.
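To make the training/generation difference concrete, here is a minimal sketch of how the generator's input could be built in each phase for the MAGE-style case. It is not the repository's code: `MASK`, `make_training_input`, `make_generation_input`, the 0.75 mask ratio, and the stand-in token and representation values are all hypothetical and chosen only for illustration.

```python
import random

MASK = -1  # hypothetical id for a masked token (real token ids are >= 0 here)


def make_training_input(image_tokens, representation, mask_ratio, rng):
    """Training: mask a random subset of the original image's tokens.

    The original tokens are needed both to build the partially masked
    input and to serve as the reconstruction target.
    """
    n = len(image_tokens)
    masked_positions = set(rng.sample(range(n), round(mask_ratio * n)))
    masked = [MASK if i in masked_positions else tok
              for i, tok in enumerate(image_tokens)]
    return {"tokens": masked, "condition": representation}, list(image_tokens)


def make_generation_input(num_tokens, representation):
    """Generation: start from a 100% masked sequence.

    No original image is involved; only the representation conditions
    the generator.
    """
    return {"tokens": [MASK] * num_tokens, "condition": representation}


rng = random.Random(0)
image_tokens = [rng.randrange(1024) for _ in range(16)]  # stand-in for tokenizer output
representation = [0.0] * 8                               # stand-in for the representation

train_input, train_target = make_training_input(image_tokens, representation, 0.75, rng)
gen_input = make_generation_input(len(image_tokens), representation)
```

In both phases the generator receives the representation as its condition. Only the training input is derived from the original image.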