modify modules for multialpha #1

Open · wants to merge 11 commits into master
Conversation

akichinguyen (Owner)

No description provided.

output = {}
tr_keys = []
for i in self.transform.keys():
    tr_keys.append(int(eval(i)[0]))
@sergeikotelnikov (Collaborator) · Jul 14, 2021

I guess we don't need int() here.
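
As a minimal illustration (assuming the keys of self.transform are strings of the form "({d_out},{dv_in},{dv_out})", as discussed further down in this thread), eval() already yields Python ints for such strings, so the int() cast is a no-op:

key = "(1, 0, 2)"          # hypothetical key of the assumed form
first = eval(key)[0]       # eval() gives the tuple (1, 0, 2); its elements are already ints
assert isinstance(first, int)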

@sergeikotelnikov (Collaborator) commented Jul 15, 2021

Unfortunately, I cannot comment on non-PR lines within this PR (isaacs/github#284).

class GNormBias(nn.Module):
    """Norm-based SE(3)-equivariant nonlinearity with only learned biases."""

    def __init__(self, fiber, nonlin=nn.ReLU(inplace=True),
                 num_layers: int = 0):
        """Initializer.

        Args:
            fiber: Fiber() of feature multiplicities and types
            nonlin: nonlinearity to use everywhere
            num_layers: non-negative number of linear layers in fnc
        """
        super().__init__()
        self.fiber = fiber
        self.nonlin = nonlin
        self.num_layers = num_layers

        # Regularization for computing phase: gradients explode otherwise
        self.eps = 1e-12

        # Norm mappings: 1 per feature type
        self.bias = nn.ParameterDict()
        for m, d in self.fiber.structure:
            self.bias[str(d)] = nn.Parameter(torch.randn(m).view(1, m))

    def __repr__(self):
        return f"GNormTFN()"

    def forward(self, features, **kwargs):
        output = {}
        for k, v in features.items():
            # Compute the norms and normalized features
            # v shape: [..., m, 2*k+1]
            norm = v.norm(2, -1, keepdim=True).clamp_min(self.eps).expand_as(v)
            phase = v / norm

            # Transform on norms
            # transformed = self.transform[str(k)](norm[..., 0]).unsqueeze(-1)
            transformed = self.nonlin(norm[..., 0] + self.bias[str(k)])

            # Nonlinearity on norm
            output[k] = (transformed.unsqueeze(-1) * phase).view(*v.shape)
        return output

I think we can simplify it: change torch.randn(m).view(1, m) to torch.randn(1, m, 1), and get rid of .expand_as(v), [..., 0], .unsqueeze(-1), and .view(*v.shape). What do you think?
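
For illustration, a minimal sketch of the simplified forward under those changes (assuming v has shape [N, m, 2*k+1] and the bias is stored as nn.Parameter(torch.randn(1, m, 1))):

def forward(self, features, **kwargs):
    output = {}
    for k, v in features.items():
        norm = v.norm(2, -1, keepdim=True).clamp_min(self.eps)  # [N, m, 1]
        phase = v / norm
        # bias of shape [1, m, 1] broadcasts against the norm
        transformed = self.nonlin(norm + self.bias[str(k)])     # [N, m, 1]
        output[k] = transformed * phase                         # broadcasts back to [N, m, 2*k+1]
    return output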

Comment on lines 663 to 665
msg = msg.view(msg.shape[0], -1, 2*d_out+1)

return {f'out{d_out}': msg.view(msg.shape[0], -1, 2*d_out+1)}
@sergeikotelnikov (Collaborator) · Jul 15, 2021

We don't need to do the same tensor reshaping one more time.
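
For illustration, a sketch of the deduplicated version, where the view is applied only once:

msg = msg.view(msg.shape[0], -1, 2*d_out + 1)
return {f'out{d_out}': msg}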

msg = msg + torch.matmul(edge, src) #sum over all d_in => prob need to keep this separate, not sum up
msg = msg.view(msg.shape[0], -1, 2*d_out+1)

return {f'out{d_out},{dv_in},{dv_out}': msg.view(msg.shape[0], -1, 2*d_out+1)}
@sergeikotelnikov (Collaborator)

We don't need to do the same tensor reshaping one more time.

@sergeikotelnikov (Collaborator) commented Jul 15, 2021

class GMABSE3(nn.Module):
    """An SE(3)-equivariant multi-headed self-attention module for DGL graphs."""

    def __init__(self, f_value: Fiber, f_key: Fiber, n_heads: int):
        """SE(3)-equivariant MAB (multi-headed attention block) layer.

        Args:
            f_value: Fiber() object for value-embeddings
            f_key: Fiber() object for key-embeddings
            n_heads: number of heads
        """
        super().__init__()
        self.f_value = f_value
        self.f_key = f_key
        self.n_heads = n_heads
        self.new_dgl = version.parse(dgl.__version__) > version.parse('0.4.4')

    def __repr__(self):
        return f'GMABSE3(n_heads={self.n_heads}, structure={self.f_value})'

    def udf_u_mul_e(self, d_out):
        """Compute the weighted sum for a single output feature type.

        This function is set up as a User Defined Function in DGL.

        Args:
            d_out: output feature type
        Returns:
            edge -> node function handle
        """
        def fnc(edges):
            # Neighbor -> center messages
            attn = edges.data['a']
            value = edges.data[f'v{d_out}']

            # Apply attention weights
            msg = attn.unsqueeze(-1).unsqueeze(-1) * value

            return {'m': msg}
        return fnc

    @profile
    def forward(self, v, k: Dict = None, q: Dict = None, G=None, **kwargs):
        """Forward pass of the linear layer

        Args:
            G: minibatch of (homo)graphs
            v: dict of value edge-features
            k: dict of key edge-features
            q: dict of query node-features
        Returns:
            tensor with new features [B, n_points, n_features_out]
        """
        with G.local_scope():
            # Add node features to local graph scope
            ## We use the stacked tensor representation for attention
            for m, d in self.f_value.structure:
                G.edata[f'v{d}'] = v[f'{d}'].view(-1, self.n_heads, m // self.n_heads, 2*d+1)  # keep vector shape for different type
            G.edata['k'] = fiber2head(k, self.n_heads, self.f_key, squeeze=True)  # [edges, heads, channels](?)  # concat all types into 1 vector
            G.ndata['q'] = fiber2head(q, self.n_heads, self.f_key, squeeze=True)  # [nodes, heads, channels](?)

            # Compute attention weights
            ## Inner product between (key) neighborhood and (query) center
            G.apply_edges(fn.e_dot_v('k', 'q', 'e'))

            ## Apply softmax
            e = G.edata.pop('e')
            if self.new_dgl:
                # in dgl 5.3, e has an extra dimension compared to dgl 4.3
                # in the following, we get rid of this by reshaping
                n_edges = G.edata['k'].shape[0]
                e = e.view([n_edges, self.n_heads])
            e = e / np.sqrt(self.f_key.n_features)
            G.edata['a'] = edge_softmax(G, e)

            # Perform attention-weighted message-passing
            for d in self.f_value.degrees:
                G.update_all(self.udf_u_mul_e(d), fn.sum('m', f'out{d}'))

            output = {}
            for m, d in self.f_value.structure:
                output[f'{d}'] = G.ndata[f'out{d}'].view(-1, m, 2*d+1)

            return output

I think the dot products should be divided by np.sqrt(self.f_key.n_features / self.n_heads).
The same thing applies to GMABSE3_qkv.
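
For illustration, a sketch of the proposed change in the forward above (assuming the keys are split evenly across the heads, so each head works with f_key.n_features // n_heads channels):

# scale by the per-head key dimension rather than the total key dimension
e = e / np.sqrt(self.f_key.n_features / self.n_heads)
G.edata['a'] = edge_softmax(G, e)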

@akichinguyen (Owner, Author) commented Jul 16, 2021 via email

What do you want to divide it by?

@sergeikotelnikov (Collaborator) commented Jul 16, 2021

np.sqrt(self.f_key.n_features / self.n_heads)

@sergeikotelnikov (Collaborator) commented Feb 13, 2022

cloned_d = torch.clone(G.edata['d'])
if G.edata['d'].requires_grad:
    cloned_d.requires_grad_()
    log_gradient_norm(cloned_d, 'Basis computation flow')

cloned_d = torch.clone(G.edata['d'])
if G.edata['d'].requires_grad:
    cloned_d.requires_grad_()
    log_gradient_norm(cloned_d, 'Neural networks flow')

I think cloned_d.requires_grad_() is redundant.
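
A minimal check of that point: torch.clone preserves requires_grad of its input, so the extra requires_grad_() call has no effect here.

import torch

d = torch.randn(4, 3, requires_grad=True)
cloned_d = torch.clone(d)
assert cloned_d.requires_grad  # already True without calling requires_grad_()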

@sergeikotelnikov (Collaborator)

class BN(nn.Module):
    """SE(3)-equivariant batch/layer normalization"""

    def __init__(self, m):
        """SE(3)-equivariant batch/layer normalization

        Args:
            m: int for number of output channels
        """
        super().__init__()
        self.bn = nn.LayerNorm(m)

    def forward(self, x):
        return self.bn(x)

I don't understand why they (partially) call it batch normalization and why they need this function. In essence, it is layer normalization.
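
A small example of that point: nn.LayerNorm(m) normalizes each sample over its last (channel) dimension, independently of the rest of the batch, so this wrapper behaves as layer normalization rather than batch normalization.

import torch
import torch.nn as nn

x = torch.randn(8, 16)        # [batch, channels]
ln = nn.LayerNorm(16)
out = ln(x)
print(out.mean(dim=-1))       # ~0 per sample, regardless of the other samples in the batch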

def forward(self, features, **kwargs):
    output = {}
    tr_keys = []
    for i in self.transform.keys():
@sergeikotelnikov (Collaborator) · Feb 17, 2022

The number of self.transform.keys() equals the number of ({d_out},{dv_in},{dv_out}) combinations, not the number of d_outs. Should we maybe use self.f_out.degrees instead of tr_keys, or use tr_keys = {}?

@akichinguyen (Owner, Author) · Feb 28, 2022

I think it's fine: tr_keys only contains d_out values via int(eval(i)[0]). OK, I see degrees now. Let me change it.

@akichinguyen (Owner, Author)

I tested. f_out.degrees and tr_keys contain the same elements.
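
For reference, a sketch of that check (hypothetical, assuming the key format "({d_out},{dv_in},{dv_out})"): the degrees recovered from the transform keys should match f_out.degrees as a set.

tr_keys = [int(eval(i)[0]) for i in self.transform.keys()]
assert set(tr_keys) == set(self.f_out.degrees)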

self.transform = nn.ParameterDict()
for m_out, d_out in self.f_out.structure:
    for mv_in, dv_in in self.fv_in.structure:
        for mv_out, dv_out in self.fv_out.structure:
@sergeikotelnikov (Collaborator)

Should we use self.fv_in.degrees and self.fv_out.degrees?

@akichinguyen (Owner, Author)

We can change it, but is it necessary? They are equivalent to this, except that we don't need mv_in and mv_out, right?

@sergeikotelnikov (Collaborator)

class GNormBias(nn.Module):
    """Norm-based SE(3)-equivariant nonlinearity with only learned biases."""

    def __init__(self, fiber, nonlin=nn.ReLU(inplace=True),
                 num_layers: int = 0):
        """Initializer.

        Args:
            fiber: Fiber() of feature multiplicities and types
            nonlin: nonlinearity to use everywhere
            num_layers: non-negative number of linear layers in fnc
        """
        super().__init__()
        self.fiber = fiber
        self.nonlin = nonlin
        self.num_layers = num_layers

We don't use num_layers.

@sergeikotelnikov (Collaborator)

output[k] = (transformed.unsqueeze(-1) * phase).view(*v.shape)

.view(*v.shape) is not necessary.

@sergeikotelnikov (Collaborator)

cur_inpt = m_in * m_in
net = []
for i in range(1, self.num_layers):
    net.append(nn.LayerNorm(int(cur_inpt)))

We don't need int() here.
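
A minimal illustration (assuming m_in is an int): the product is already an int, so nn.LayerNorm accepts it without the cast.

import torch.nn as nn

m_in = 4
cur_inpt = m_in * m_in          # already an int
layer = nn.LayerNorm(cur_inpt)  # no int() needed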

@sergeikotelnikov (Collaborator)

sign = scalars.sign()
scalars = scalars.abs_().clamp_min(self.eps)
scalars *= sign

Do we need to clamp here?

@akichinguyen (Owner, Author)

output[k] = (transformed.unsqueeze(-1) * phase).view(*v.shape)

.view(*v.shape) is not necessary.

Agree, they have the same shape.

@akichinguyen (Owner, Author)

sign = scalars.sign()
scalars = scalars.abs_().clamp_min(self.eps)
scalars *= sign

Do we need to clamp here?

No, I don't think we need it.
