modify modules for multialpha #1
base: master
Conversation
output = {}
tr_keys = []
for i in self.transform.keys():
    tr_keys.append(int(eval(i)[0]))
I guess we don't need int() here.
Unfortunately, I cannot comment on non-PR lines within this PR.
se3-transformer-public/equivariant_attention/modules.py Lines 391 to 435 in 0f7e307
I think we can simplify it: torch.randn(m).view(1, m) -> torch.randn(1, m, 1), and get rid of .expand_as(v), [..., 0], .unsqueeze(-1), and .view(*v.shape). What do you think?
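For illustration, a minimal sketch of the broadcasting equivalence being suggested (the shapes and the exact chain of calls here are assumptions, not copied from lines 391 to 435):

import torch

batch, m, c = 4, 8, 5                 # hypothetical shapes
v = torch.randn(batch, m, c)
w = torch.randn(m)

# one reading of the current pattern: view, expand, index, unsqueeze, view
out_old = (w.view(1, m).expand_as(v[..., 0]).unsqueeze(-1) * v).view(*v.shape)

# suggested: shape the weight as (1, m, 1) and let broadcasting do the rest
out_new = w.view(1, m, 1) * v

assert torch.allclose(out_old, out_new)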
msg = msg.view(msg.shape[0], -1, 2*d_out+1)

return {f'out{d_out}': msg.view(msg.shape[0], -1, 2*d_out+1)}
We don't need to do the same tensor reshaping one more time.
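In other words, something like this for the quoted lines; the same change applies to the similar block below that returns the out{d_out},{dv_in},{dv_out} key:

msg = msg.view(msg.shape[0], -1, 2*d_out + 1)
return {f'out{d_out}': msg}  # msg is already reshaped; no second view() needed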
msg = msg + torch.matmul(edge, src)  # sum over all d_in => prob need to keep this separate, not sum up
msg = msg.view(msg.shape[0], -1, 2*d_out+1)

return {f'out{d_out},{dv_in},{dv_out}': msg.view(msg.shape[0], -1, 2*d_out+1)}
We don't need to do the same tensor reshaping one more time.
se3-transformer-public/equivariant_attention/modules.py Lines 938 to 1019 in 0f7e307
I think the dot products should be divided by np.sqrt(self.f_key.n_features / self.n_heads). The same thing applies to GMABSE3_qkv.
What do you want to divide it by?
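For reference, a minimal sketch of the scaling proposed above, in the spirit of standard scaled dot-product attention (the shapes and the way the dot products are formed are assumptions, not the repository's code):

import numpy as np
import torch

n_edges, n_heads, head_dim = 32, 4, 16       # hypothetical shapes
q = torch.randn(n_edges, n_heads, head_dim)  # queries per head
k = torch.randn(n_edges, n_heads, head_dim)  # keys per head

e = (q * k).sum(dim=-1)                      # unscaled dot products

# suggested: divide by sqrt(d_k), where d_k = self.f_key.n_features / self.n_heads
e_scaled = e / np.sqrt(head_dim)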
se3-transformer-public/equivariant_attention/modules.py Lines 46 to 50 in e7960dd
se3-transformer-public/equivariant_attention/modules.py Lines 79 to 83 in e7960dd
I think cloned_d.requires_grad_() is redundant.
se3-transformer-public/equivariant_attention/modules.py Lines 576 to 588 in e7960dd
I don't understand why they (partially) call it batch normalization and why they need this function. In essence, it is layer normalization.
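For context, a minimal illustration of the difference (not the repository's implementation): layer normalization normalizes each sample over its feature dimension, while batch normalization normalizes each feature over the batch dimension.

import torch

x = torch.randn(8, 16)  # (batch, features), hypothetical

layer_norm = (x - x.mean(dim=-1, keepdim=True)) / (x.std(dim=-1, keepdim=True) + 1e-5)
batch_norm = (x - x.mean(dim=0, keepdim=True)) / (x.std(dim=0, keepdim=True) + 1e-5)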
def forward(self, features, **kwargs):
    output = {}
    tr_keys = []
    for i in self.transform.keys():
self.transform.keys() is equal to ({d_out},{dv_in},{dv_out}) keys. It is not equal to d_outs. Should we maybe use self.f_out.degrees instead of tr_keys, or tr_keys = {}?
I think it's good. tr_keys only contains d_out, via int(eval(i)[0]). OK, I saw degrees. Let me change it.
I tested. f_out.degrees and tr_keys contain the same elements.
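So the loop could look roughly like this (the body is omitted; self.f_out.degrees is assumed to enumerate the output degrees, as in the discussion above):

def forward(self, features, **kwargs):
    output = {}
    # iterate over output degrees directly instead of parsing the
    # '({d_out},{dv_in},{dv_out})' keys of self.transform
    for d_out in self.f_out.degrees:
        ...
    return output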
self.transform = nn.ParameterDict()
for m_out, d_out in self.f_out.structure:
    for mv_in, dv_in in self.fv_in.structure:
        for mv_out, dv_out in self.fv_out.structure:
Should we use self.fv_in.degrees and self.fv_out.degrees?
We can change it, but is it necessary? They are similar to this, except we don't need mv_in and mv_out, right?
se3-transformer-public/equivariant_attention/modules.py Lines 391 to 406 in e7960dd
We don't use num_layers.
.view(*v.shape) is not necessary.
se3-transformer-public/equivariant_attention/modules.py Lines 465 to 468 in e7960dd
We don't need int() here.
se3-transformer-public/equivariant_attention/modules.py Lines 492 to 494 in e7960dd
Do we need to clamp here?
Agree, they have the same shape.
No, I don't think we need to.