Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] add experiment of training torsion energies #163

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

lilyminium
Copy link
Collaborator

Do not merge: this is only a very partial, bare-bones experiment to see how much work it would be to use NAGL to a) train to non-atom properties (not so bad) and b) incorporate geometric information (more work...) and c) incorporate another trained parameter (quite a lot). My general conclusion is that b) and c) is likely not worth porting to the overall library, although open to anything as usual.

All the maths is very ad-hoc and absolutely needs double-checking and testing, especially the geometry functions. Everything currently is only implemented with the DGL backend.

cc @BenCree -- there's an example notebook here that runs through a very short example of what I think you and Danny were describing to me in our call, if you're interested. Happy to chat more if you want to discuss!

P.S. the Dataset code was a huge pain and very repetitive -- it probably needs refactoring.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@BenCree
Copy link

BenCree commented Nov 25, 2024

This is amazing!! cheers. I'll dive into it now.

@BenCree
Copy link

BenCree commented Dec 10, 2024

Hi Lily, thanks again for doing this - it's been a massive help.

I just wanted to ask a verify/ask a (nooby) question about how the c_ij's are calculated for a given molecule.

My current understanding is that for ComputeSCoefficients() a molecule and a tensor of c_ij values are required (e.g. for ClCCCl there are 9, one for each dihedral).

How exactly are the c_ij values calculated? Is it a unique representation of the dihedral using e.g. all the local environments of the atoms in the dihedral?

Initially I thought maybe _SymmetricPoolingLayer(PoolingLayer)'s _get_pooled_representations was doing it, but that seems to do the whole molecule. Is there a function somewhere that takes 4 indices of a dihedral and generates a c_ij value for it?

This is probably a very obvious question, so apologies!

I've just started validating all the geometry/maths, and otherwise the notebook you provided looks like it's doing exactly what we want. Cheers.

@lilyminium
Copy link
Collaborator Author

Hi Ben, sorry for the delay, I was on leave. You're totally right _SymmetricPoolingLayer(PoolingLayer)'s _get_pooled_representations takes an entire molecule and generates representations for each dihedral term. This representation gets passed through the dense feed-forward readout module layers which learns the c_ij tensor that arrives as input to ComputeSCoefficients. If you wanted to pass in a molecule and get out the c_ij tensor, I think you would need to run something like (totally off-the-cuff and untested):

from openff.nagl.molecule._dgl.molecule import DGLMolecule
 
all_molecule_dihedrals = sorted(molecule.propers) # I think this will be a list of tuples of ints
dihedral_index = all_molecule_dihedrals.index((0, 1, 2, 3))  # where 0, 1, 2, 3 is the atom indices of the dihedral
dglmol = DGLMolecule.from_openff_config(
            molecule,
            model.config,
            model=model,
        )
c_ij_tensor = model._forward_unpostprocessed(dglmol)["alpha"]  # skips the post-processing layer

Is it a unique representation of the dihedral using e.g. all the local environments of the atoms in the dihedral?

Hopefully yes, the representations pooled in _get_pooled_representations pool the representation of each atom in the dihedral post-convolution, so should incorporate the local atom environment.

@lilyminium lilyminium mentioned this pull request Dec 16, 2024
@lilyminium lilyminium added the wontfix This will not be worked on label Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants