Why enhanced_scores is multiplied to hidden_states #22
Comments
Thanks for your question. Yes, the enhance score calculated in the Enhance Block is a scalar that multiplies the output of the temporal attention block. The enhanced temporal attention output is then added to hidden_states through the residual connection in the DiT. More details can be found in our blog: https://oahzxl.github.io/Enhance_A_Video/
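To make the distinction discussed in this thread concrete, here is a minimal sketch of the two readings: scaling the attention output and then adding it to hidden_states as a residual, versus scaling hidden_states directly. The function names and toy values are hypothetical and are not taken from the repository; which reading matches the code depends on what `hidden_states` holds at the line in question, which this thread does not settle.

```python
def scale_then_residual(hidden_states, attn_output, enhance_score):
    """Reading 1: hidden + score * attn_out (element-wise on flat lists)."""
    return [h + enhance_score * a for h, a in zip(hidden_states, attn_output)]


def scale_directly(hidden_states, enhance_score):
    """Reading 2: score * hidden, i.e. the whole tensor is rescaled."""
    return [enhance_score * h for h in hidden_states]


# Toy values, chosen only for illustration.
hidden = [1.0, -2.0, 0.5]
attn_out = [0.2, 0.4, -0.1]
score = 1.5

residual_result = scale_then_residual(hidden, attn_out, score)
direct_result = scale_directly(hidden, score)
print(residual_result)
print(direct_result)
```

In the first form only the attention contribution is amplified and the skip path is untouched; in the second, everything carried by hidden_states is amplified.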
Sorry, I could not find the part of your code that adds the enhanced temporal attention output to hidden_states through a residual connection. Enhance-A-Video/enhance_a_video/models/cogvideox.py Lines 144 to 145 in f9c31be
In models/cogvideox.py line 145, hidden_states is multiplied directly by enhance_scores. The latter is computed by multiplying the mean of the temporal attention map with its diagonal excluded (a scalar) by a preset enhance_weight. Enhance-A-Video/enhance_a_video/enhance.py Lines 23 to 30 in f9c31be
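For readers following along, the score computation described above can be sketched roughly as follows. This is a plain-Python illustration under the assumptions stated in this comment (softmax over frames, mean of the off-diagonal entries, multiplied by a preset weight); it is not the code from enhance.py, and the helper names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def off_diagonal_mean(attn):
    """Mean of a square frame-by-frame attention map, diagonal excluded."""
    n = len(attn)
    total = sum(attn[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * n - n)


def enhance_score(logits, enhance_weight):
    """Scalar score: enhance_weight times the off-diagonal attention mean."""
    attn = [softmax(row) for row in logits]
    return enhance_weight * off_diagonal_mean(attn)


# Uniform logits: every frame attends equally to all frames.
uniform_logits = [[0.0] * 3 for _ in range(3)]
uniform_score = enhance_score(uniform_logits, enhance_weight=4.0)

# Strongly diagonal logits: each frame attends almost only to itself.
diagonal_logits = [[50.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
diagonal_score = enhance_score(diagonal_logits, enhance_weight=4.0)
print(uniform_score, diagonal_score)
```

The score grows when frames attend more to each other and shrinks toward zero when each frame attends mostly to itself, which is why it can be read as a measure of cross-frame (temporal) attention.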
Could you please point me to the exact code snippets that implement the residual connection?
Hi, thanks for your awesome work! I have some questions about the implementation details.
It seems enhanced_scores is a scalar that measures the temporal consistency of the current layer. So hidden_states with higher temporal consistency are scaled up as a whole? I'm not sure whether this is reasonable.
However, in your blog post, the output of the enhance module (and temporal attention) is added to hidden_states, which is somewhat confusing.
Is my understanding correct? Could you provide more details about the implementation, and maybe some ablation results?