open an sdf file #154
Comments
We use Chemfiles for the file reading and residues are not currently supported there for SDF (https://chemfiles.org/chemfiles/latest/formats.html#list-of-supported-formats), so the setup fails. If you want to read in data from the PDB you can use PDB format, see the example at https://juliamolsim.github.io/Molly.jl/stable/docs/#Simulating-a-protein, though currently atom names must exactly match the residue templates. This is a topic of future work.
And with a PDB file, how can I make the atom names match exactly while also indicating that multiple hydrogen atoms are connected to the same molecule? For example, if we follow the last example from https://www.biostat.jhsph.edu/~iruczins/teaching/260.655/links/pdbformat.pdf we see that certain hydrogen atoms will be named as "1HG1". I take it that's not supported, and I should manually place the atoms and fix up the topology? Is there yet another format that is better supported? Do hydrogen atoms follow a different convention in Molly.jl?
The residue template in the force field xml file defines the atom names (which currently must be matched exactly) and the connectivity between them. For example for alanine in
The PDB file would have to have ALA as the residue name and every atom name appearing exactly once for a residue to match this template. Future work will allow flexibility in residue names and atom names by searching for the best template. At the minute you are probably best off removing all hydrogens and adding them back with OpenMM, which will give atom names consistent with OpenMM XML force field files. All residues will need a template available, though that is the case with most software.
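To see which residues would fail the exact-name match described above, a rough check along these lines may help. It is only a sketch in Python using the standard library: `FF_XML` is a hypothetical, trimmed template with placeholder atom types, `pdb_line` and `find_mismatches` are illustrative helpers (not part of Molly.jl or OpenMM), and the PDB parsing assumes fixed-column ATOM/HETATM records.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, trimmed template in the OpenMM force field XML layout.
# The type values are placeholders, not real force field atom types.
FF_XML = """
<ForceField>
 <Residues>
  <Residue name="ALA">
   <Atom name="N" type="t0"/>
   <Atom name="H" type="t1"/>
   <Atom name="CA" type="t2"/>
   <Atom name="HA" type="t3"/>
   <Atom name="CB" type="t4"/>
   <Atom name="HB1" type="t5"/>
   <Atom name="HB2" type="t5"/>
   <Atom name="HB3" type="t5"/>
   <Atom name="C" type="t6"/>
   <Atom name="O" type="t7"/>
  </Residue>
 </Residues>
</ForceField>
"""


def template_atom_names(xml_text):
    """Map each residue template name to the list of its atom names."""
    root = ET.fromstring(xml_text)
    return {
        res.get("name"): [atom.get("name") for atom in res.findall("Atom")]
        for res in root.iter("Residue")
    }


def pdb_residues(pdb_text):
    """Group atom names by (residue name, chain, residue number)."""
    residues = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed PDB columns: name 13-16, resName 18-20, chain 22, resSeq 23-26.
            key = (line[17:20].strip(), line[21], line[22:26].strip())
            residues.setdefault(key, []).append(line[12:16].strip())
    return residues


def find_mismatches(pdb_text, xml_text):
    """Report residues whose atom names do not match a template exactly."""
    templates = template_atom_names(xml_text)
    problems = {}
    for key, names in pdb_residues(pdb_text).items():
        expected = templates.get(key[0])
        if expected is None:
            problems[key] = "no template"
            continue
        missing = sorted(set(expected) - set(names))
        extra = sorted(set(names) - set(expected))
        duplicated = sorted(n for n, c in Counter(names).items() if c > 1)
        if missing or extra or duplicated:
            problems[key] = {
                "missing": missing,
                "extra": extra,
                "duplicated": duplicated,
            }
    return problems


def pdb_line(serial, name, resname, chain, resseq):
    """Build a simplified ATOM record (coordinates omitted) for the demo."""
    return f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


# Demo residue using old-style hydrogen names such as "1HB".
demo_names = ["N", "H", "CA", "HA", "CB", "1HB", "2HB", "3HB", "C", "O"]
demo_pdb = "\n".join(
    pdb_line(i + 1, name, "ALA", "A", 1) for i, name in enumerate(demo_names)
)
problems = find_mismatches(demo_pdb, FF_XML)
print(problems)
```

In the demo the old-style names "1HB", "2HB" and "3HB" are reported as extra and "HB1", "HB2" and "HB3" as missing, which is the kind of mismatch that removing and re-adding hydrogens with OpenMM is meant to avoid.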
I think I correctly implemented alanine for use with ff99
However, when I create the system and look at the interactions: then I only find
Yes there should, I think this is a problem with Chemfiles only reading bonds for non-standard residues when there is a CONECT record in the PDB file. I can add a mention to the docs, but in general non-standard residues (including ACE/NME terminal caps) aren't tested yet. See https://github.com/noeblassel/SINEQSummerSchool2023/blob/main/notebooks/dipeptide_nowater.pdb for an alanine dipeptide that reads in okay:
using Molly
ff_dir = joinpath(dirname(pathof(Molly)), "..", "data", "force_fields")
ff = MolecularForceField(joinpath.(ff_dir, ["ff99SBildn.xml", "tip3p_standard.xml"])...)
sys = System("dipeptide_nowater.pdb", ff; rename_terminal_res=false)
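Since bonds for non-standard residues apparently depend on CONECT records, a quick scan of the PDB file can flag residues that are likely to come out without bonds. The following is a minimal Python sketch under stated assumptions: `STANDARD_RESIDUES` is my own guess at what counts as "standard" (the 20 common amino acids, which may differ from the list Chemfiles uses), and `residues_without_conect`, `atom_line` and `conect_line` are illustrative helpers, not part of any library.

```python
# Assumption: "standard" means the 20 common amino acids.
# The list that Chemfiles actually uses may differ.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def residues_without_conect(pdb_text):
    """Names of non-standard residues with an atom absent from all CONECT records."""
    atom_residue = {}  # atom serial -> residue name
    bonded = set()     # atom serials mentioned in any CONECT record
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed PDB columns: serial 7-11, resName 18-20.
            atom_residue[int(line[6:11])] = line[17:20].strip()
        elif line.startswith("CONECT"):
            # Up to five serial fields of width 5, starting at column 7.
            for start in range(6, 31, 5):
                field = line[start:start + 5].strip()
                if field:
                    bonded.add(int(field))
    flagged = {
        resname
        for serial, resname in atom_residue.items()
        if resname not in STANDARD_RESIDUES and serial not in bonded
    }
    return sorted(flagged)


def atom_line(serial, name, resname, chain, resseq):
    """Build a simplified ATOM record (coordinates omitted) for the demo."""
    return f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


def conect_line(*serials):
    """Build a CONECT record from atom serial numbers."""
    return "CONECT" + "".join(f"{s:>5}" for s in serials)


# Demo: the ACE cap has a CONECT record, the NME cap does not.
demo_pdb = "\n".join([
    atom_line(1, "CH3", "ACE", "A", 1),
    atom_line(2, "C", "ACE", "A", 1),
    atom_line(3, "N", "ALA", "A", 2),
    atom_line(4, "N", "NME", "A", 3),
    atom_line(5, "CH3", "NME", "A", 3),
    conect_line(1, 2),
])
flagged = residues_without_conect(demo_pdb)
print(flagged)
```

This only checks that each atom appears in some CONECT record. It does not verify that the listed bonds are complete or chemically sensible.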
One more issue:
prints 1128.45113918646 kJ mol⁻¹. The Python equivalent
prints Quantity(value=547.6336059570312, unit=kilojoule/mole). Other quantities are also incorrect, so presumably my file is still being read in incorrectly? Or was a different convention used between OpenMM and Molly?
It seems okay, the equivalent OpenMM call would be:
import openmm
from openmm import app, unit
forcefield = app.ForceField('amber/ff14SB.xml')
f = app.PDBFile('big_alanine.pdb')
system = forcefield.createSystem(
    f.getTopology(),
    nonbondedMethod=app.CutoffPeriodic,
    nonbondedCutoff=1*unit.nanometer,
    constraints=None,
    rigidWater=False,
    switchDistance=None,
    useDispersionCorrection=False,
)
integrator = openmm.LangevinMiddleIntegrator(310 * unit.kelvin, 1.0 / unit.picosecond, 1 * unit.femtosecond)
sim = app.Simulation(f.getTopology(), system, integrator)
sim.context.setPositions(f.getPositions())
state = sim.context.getState(getEnergy=True)
state.getPotentialEnergy()
Which is much better, there is still a 3E-4 kJ/mol discrepancy. Note you need a line like
in the PDB file for
That clarifies some stuff! I'm really struggling with basic things but from testing it looks like Molly is at least a factor of 10 faster than OpenMM (for my admittedly weird use case).
Cool, I'd be interested to hear roughly what you are doing if you are able to share, we might be able to add more features and docs to help. Btw make sure you are on Molly.jl v0.18.2 or later as that version had a significant performance improvement for periodic boundary conditions.
The plan is to wrap it up in a paper (though the draft is progressing slowly), so Molly will definitely be cited! We really only need a way to evaluate energies and manipulate positions. Molly is nice because, unlike Python packages that bind to optimized C code, I can directly profile and manipulate the system structure. At the moment I'm only interested in all-to-all, but there are two other projects I want to try Molly on: one that requires mixing periodic boundary conditions in one direction with open boundary conditions in another, and another where I'll be using conventional periodic boundary conditions. Thanks a lot for the hard work that went into the package! It was a bit rough to get started simulating my first molecule, but after that it's really only been smooth sailing!
It is not clear to me how I can generally set up a working system. As a simple example, it is not possible to set up a system starting from an sdf file:
System("water.sdf",ff)
straight up fails.
Is there somewhere an example where I start with a practical system (say a compound from https://www.rcsb.org ) and set up an MD simulation?