-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Brainstorm simulation API proposal #19
Comments
Pinging @peastman, who may have some useful suggestions. Some initial thoughts:
I'm not sure how I feel about adding the |
Tagging @peastman again. Would love to get your input here now, rather than when it's too late to change the design. |
Why would you want to subclass ForceField? I'm not sure what that would be intended to solve. For the integration and attempting moves, what about an API similar to the simulated tempering script: https://simtk.org/svn/pyopenmm/trunk/simtk/pyopenmm/extras/simulatedtempering.py. It wraps a Simulation object, then provides its own step() method. In this case, you could have a class that takes the same arguments as the Simulation constructor (system, integrator, etc.). It would then automatically generate the NCMC integrator and combine it with the user provided one in a CustomCompoundIntegrator. And when you called step() on this object, it would automatically switch back and forth between the two integrators to perform the MC moves as needed. |
The other option is to make changes to
We could do this, but is this the best approach? We could do all of that within |
Oh I see---it's not even in OpenMM proper. |
Right. I'm questioning whether constant pH belongs in a core class like Simulation, or whether it should be more self contained. It's still a very new technique, and in fact there are lots of different techniques for doing constant pH, and the techniques we use five years from now may be very different from the ones we use today. So instead of assuming one particular method, and deeply integrating it into Simulation, I'd rather keep it self contained. In the same way that different Integrators are self contained. That way if we later implement a different method that requires a very different API, we won't be stuck with an obsolete interface. |
That would argue for making it a |
Both Vijay and I are very enthusiastic to make it as easy as possible for people to use this technique generally. I definitely agree that we want some flexibility in the choice of options, but there could be multiple ways to accommodate that. |
That would also be an option, but it might require some really ugly hacks. It would mean one force would implement |
There could be other possible designs that don't relegate constant pH to a third-class scheme. What about adding a plugin architecture to |
Could you elaborate? What sort of design did you have in mind? |
I don't have a concrete design in mind, but from the user end, this could be something like pdb = app.PDBFile('input.pdb') # preprocessed with right residue names to indicate constph?
forcefield = app.ForceField('amber99cph.xml', 'tip3p.xml', 'ligandcph.xml')
system = forcefield.createSystem(pdb.topology, nonbondedMethod=app.PME,nonbondedCutoff=1.0*unit.nanometers, constraints=app.HBonds, rigidWater=True, ewaldErrorTolerance=0.0005)
integrator = mm.LangevinIntegrator(300*unit.kelvin, 1.0/unit.picoseconds, 2.0*unit.femtoseconds)
constph = app.ConstantPHPlugin(pH=7.4, interval=50)
simulation = app.Simulation(pdb.topology, system, platform, properties, plugins=[constph])
simulation.context.setPositions(pdb.positions) |
For that to work, I see two points at which the plugin would have to be called. First, it would get the opportunity to preprocess the arguments passed to Simulation, so it could replace the integrator. And second, it would have the ability to interrupt integration periodically and run its own code. The latter function would be very similar to a reporter, and in many cases would literally be a reporter. This also sounds like a possibility worth pursuing. |
This might be a naive question but, if this were to go on FAH, how would we handle the constantly changing topologies? Would we have to input a topology with virtual sites at every possible protonation site so that we can recover our trajectories at the end? |
@cxhernandez in the current setup, protons aren't removed from the topology, and the initial topology should have all possible protons included. During simulation, they're switched on and off by changing to and from dummy atoms. We change their non-bonded parameters, although bonded interactions are maintained so they don't drift away. |
👍 thanks for the clarification @bas-rustenburg! |
An important question is whether this should be implemented in Python or C++. If it's in Python, we wouldn't be able to use it in FAH, so if we wanted constant pH there, we would have to implement it separately as part of the core (but that could be a simplified version with fewer options than the more general OpenMM version). If it's in C++, it's impossible to handle parameter updating in any sort of general way. It would need hardcoded support for every Force subclass it could work with. If we decide to go with C++, what about implementing it as an Integrator rather than a Force? This would be less brittle and require fewer hacks. It could be similar to a CompoundIntegrator, in that you would create two other Integrators, one for it to use in ordinary integration, and one for it to use when attempting MC steps. Then you would just call step() on it as usual, and periodically it would throw in an MC step. |
@bas-rustenburg is working on adding code to ForceField and extending the ffxml format for these titratible forms of residues, as well as setting up a workflow for both titratible residues and small molecules. They would be fully protonated, as he suggests, and the parameters would change dynamically. The Topology would remain constant. @peastman: Is there a way to circumvent the ugly hacks through a clever choice of API? There wouldn't be an easy way to get around the need to update other forces, unfortunately. In some respects, this is similar to how MonteCarloBarostat can make changes that modify the potential, but I realize it is also quite different if we are modifying force parameters directly. I wonder if we could cook up a simpler scheme based on modifying context parameters. We may also have to come up with extensions to CustomNonbondedForce that allow us to swap in different parameters for residues based on context parameters. |
I don't think it's practical to do this with global parameters. First, you'd be restricted to only Forces that use global parameters, so no NonbondedForce. That's ok for implicit solvent, since you could do that with only custom forces, but PME would be out. But even assuming you wanted to use only CustomNonbondedForce and CustomGBForce, what would you do? There would be possibly dozens of parameters, each controlling the state of one site. And then you would have to write a huge expression for the energy, involving all of those dozens of global parameters. It would be insanely inefficient.
What about my suggestion of doing it through an integrator? That would be a lot less hacky, since it could work entirely by calling public methods on the Forces at times when they expected those methods to be called. |
Apologies for not responding to this one earlier---I hadn't seen it when I posted my last comment.
I think we need to find a way to get the fundamental capabilities into the C++ layer in order for this to be easily supported on FAH.
I'm still trying to think of whether there is some way to make this general while not requiring extensions to all
This is a really interesting idea. Were you thinking that this integrator would be specific to constant pH, or that it would be somehow more general/programmable? Could it be combined with The algorithm would roughly go something like this:
There are currently more efficient ways to do this that involve duplicating the titratable residues and using a small set of global parameters to select the titration state of each one. In the |
Out of curiosity, @peastman, why would this be much cleaner as an |
It doesn't require changes to the Forces, but it does require the integrator to have hardcoded support for every type of force it can modify. It needs to have one block of code for updating a NonbondedForce, a different block for a GBSAOBCForce, and so on.
It would be specific to constant pH. All of the logic you outlined above would be built into it.
Modifying box vectors is one of the things Forces are explicitly allowed to do inside updateContextState(). And the code to support that is entirely contained within Context. Modifying other Force objects is not something they're allowed to do, and there are lots of subtle ways in which Force classes might implicitly assume they wouldn't get modified then. That's not to say it couldn't be made it work, but we'd have to go through every force one by one (in fact, every platform's implementation of every force) and verify that it wasn't doing anything this would break. Even worse, we're not just modifying Forces inside updateContextState(). We're running integration, which will of course involve its own calls to updateContextState(). So that will be getting called recursively, which could be really painful to deal with. Implementing it as an Integrator doesn't have any of these problems. It just switches between different Integrators in exactly the way that CompoundIntegrator already does. |
I don't see how that can be simpler than what I described. You're going to need one such parameter for every site, right? |
Yes, but you do only need two global parameters per energy expression. The energy expressions don't need to involve all the global parameters as you suggested ("And then you would have to write a huge expression for the energy, involving all of those dozens of global parameters."). I think the integrator approach is better for trying to make this a first-class feature, though. I'm thinking about what the API should look like. |
In thinking about an API, how about this? The user would interact with a Python class modeller = Modeller(pdb.topology)
forcefield = ForceField('amber99sbildn.xml', 'amber99sbildn-constph.xml', 'tip3p.xml')
modeller.addHydrogens(forcefield=forcefield, pH=7) # needs to be aware that fully-protonated titratable forms of residues are used
system = forcefield.createSystem(topology, nonbondedMethod=PME)
my_integrator = LangevinIntegrator(temperature, collision_rate, timestep)
ph_integrator = MonteCarloTitration(topology, system, forcefield, pH=7, switching_steps=50)
# Calibrate constant-pH reference energies (takes a while, creates/destroys Contexts)
ph_integrator.calibrate(tol=0.01)
# Create a CompoundIntegrator where we run 50 steps of Langevin followed by a constant-pH titration update
integrator = CompoundIntegrator()
integrator.addIntegrator(my_integrator)
integrator.addIntegrator(ph_integrator)
integrator.setSchedule([50, 1]) # new method to allow automated integrator step sequence to be selected
# Create context.
context = Context(system, integrator)
# Run simulation
integrator.step(5000) By adding a method such as Before diving into the details of what a
(Edit: Added |
I think that design tries to put too much of the logic into Python. That won't work for FAH. And there's a lot more to constant pH than just switching back and forth between two integrators on a regular schedule. I'm thinking more along these lines: integrator1 = LangevinIntegrator(300.0, 1.0, 0.002) # the main integrator
integrator2 = VerletIntegrator(0.002) # used during MC steps
integrator = ConstantPHIntegrator(integrator1, integrator2, 100, 50) # attempt MC move every 100 steps, and each one involves 50 steps with integrator2 When you call
|
It will work just fine for FAH. I was just describing what this would look like from the user perspective who would be setting things up in Python. We haven't reached the point of discussing the C++ level integrator yet. The C++ level integrator would look almost exactly like the proposal here. If the user makes use of the high-level Python code to set things up, it hides much of the complexity and automates much of what is needed to set up the C++ level integrator with the appropriate titration states defined in the |
Why would we do that if the alternation between integrators is already the job of |
I'm presuming here that serializing |
Like I said, there's way more to this than just alternating between integrators. Its job includes:
That's why we need a special Integrator class, not just a trivial CompoundIntegrator. |
Yes, it has to do all of those things. It should do all of those things inside a I am specifically saying we should not use your proposed API for integrator = ConstantPHIntegrator(integrator1, integrator2, 100, 50) because
|
We still need to let them supply their own integrator. What if the system involves Drude particles? That needs a special integrator. What if they've constructed a CustomIntegrator that does things in a nonstandard way (such as freezing certain degrees of freedom)? They need that. What if they want to use an MTSIntegrator to do it more efficiently? We should allow that. There are many ways of doing constant energy integration, and good reasons people might use a different class than VerletIntegrator to do it. And a very important point I've said a few times: MC steps should appear as instantaneous changes between time steps. Performing an MC step should not consume a call to
Nothing stopping them from doing it. This integrator can easily be nested inside a CompoundIntegrator, or the "main integrator" passed as the first argument could be a CompoundIntegrator. |
Here's a proposed API for the C++ layer:
|
We can't support other classes than velocity Verlet for NCMC because we don't have a generalizable way to compute the nonequilibrium work built into the other integrators. Unless we build in an automated way to compute the differential path action for any integrator allowed in OpenMM, we can't let people specify their own integrators.
They can't. These will not be supported. This is not just an issue of doing constant energy integration. This is an issue of computing the appropriate nonequilibrium work for arbitrary integrators, which is hard. We generalized this as much as possible in this paper, which you're welcome to read if you want to understand what we're up against here. It's possible there's a simple, more general route than we've realized to compute differential path actions for arbitrary integrators---which would be totally awesome---but it's generally difficult to do this for all but the simplest of integrators. We could work this out and build it into some of the popular integrators in OpenMM---and this would be useful for all sorts of things---but this would be a major undertaking.
Why? Can't
It is weird to think of nesting nested integrators. That could lead to all sorts of non-obvious problems. |
It would still consume a call to
Like what? I don't see any problems with it. It's not even as if the user was having to switch the constant pH integrator between different child integrators. We would simply take the "main integrator" and wrap it in a layer that adds constant pH functionality to it. Then they could use that integrator directly, or stick it inside a CompoundIntegrator exactly like any other integrator. |
Apologies for the delayed reply---spent all day yesterday on a plane.
Could we extend the
I'm still trying to think of odd corner cases here, but the biggest reason to not do this is that---at least in the current design---it just isn't necessary to add that ind of complexity. However: I've been thinking about your desire to support more diversity in integrators, and wonder if we could do this by requiring that the NCMC integrator be a I'll modify my API proposals to account for this: At the Python app level, the user would interact this way: modeller = Modeller(pdb.topology)
forcefield = ForceField('amber99sbildn.xml', 'amber99sbildn-constph.xml', 'tip3p.xml')
modeller.addHydrogens(forcefield=forcefield, pH=7) # needs to be aware that fully-protonated titratable forms of residues are used
system = forcefield.createSystem(topology, nonbondedMethod=PME)
my_integrator = LangevinIntegrator(temperature, collision_rate, timestep)
ph_integrator = MonteCarloTitration(topology, system, forcefield, pH=7, switching_steps=50)
# Calibrate constant-pH reference energies (takes a while, creates/destroys Contexts)
ph_integrator.calibrate(tol=0.01)
# Create a CompoundIntegrator where we run 50 steps of Langevin followed by a constant-pH titration update
integrator = CompoundIntegrator()
integrator.addIntegrator(my_integrator)
integrator.addIntegrator(ph_integrator)
integrator.setSchedule([50, 1]) # new method to allow automated integrator step sequence to be selected
# Create context.
context = Context(system, integrator)
# Run simulation
integrator.step(5000) Under the hood, this would configure the C++ integrator using this API (written here using the Python wrappers for the API just for convenience): # Create the integrator (which by default uses instantaneous Monte Carlo)
integrator = simtk.openmm.MonteCarloTitrationIntegrator(temperature)
# Tell integrator to use NCMC.
# Setting nsteps = 0 switches back to using instantaneous MC
# The provided integrator must be a `CustomIntegrator` that provides a `work` global variable
integrator.setSwitching(nsteps, ncmc_integrator)
# Configure which Force terms should be updated
integrator.updateNonbondedForce(True) # integrator will update NonbondedForce during trials
integrator.updateCustomGBForce(True) # integrator will update CustomGBForce during trials
# Add definition of first titratable group of atoms
titratration_group_index = integrator.addTitratableGroup(atom_indices)
titration_state_index = integrator.addTitrationState(titration_group_index, log_solvent_population1)
integrator.setNonbondedForceParameters(titration_group_index, titration_state_index, particle_index, charge, sigma, epsilon)
integrator.setCustomGBForceParameters(titration_group_index, titration_state_index, params)
...
# There are also equivalent `get` and `set` methods for all of these options.
# lots and lots of additional parameters and titration states are defined in real cases
# Deal with MC statistics
[nattempted, naccepted] = integrator.getStatistics() # get list of number of attempted and accepted titration steps for each site
integrator.resetStatistics() # reset statistics
# Other getters and setters are needed to update temperature in the integrator and in the simulation |
Sorry for not responding sooner! Anyway, I still don't understand your objection to having the constant pH integrator be a top level one rather than something that needs to go into a CompoundIntegrator. All of this code: ph_integrator = MonteCarloTitration(topology, system, forcefield, pH=7, switching_steps=50)
integrator = CompoundIntegrator()
integrator.addIntegrator(my_integrator)
integrator.addIntegrator(ph_integrator)
integrator.setSchedule([50, 1]) # new method to allow automated integrator step sequence to be selected can be simplified to just this: integrator = MonteCarloTitrationIntegrator(topology, system, forcefield, pH=7, switching_steps=50, integrator=my_integrator) For the C++ API, I was thinking of something more generic that will be easier to extend in the future. Something like this: integrator.setStateParameter(force, titration_group_index, titration_state_index, particle_index, "charge", charge) The implementation would need hardcoded support for NonbondedForce, CustomNonbondedForce, etc., but that doesn't need to be in the API. Also, it's important to explicitly tell it which Force object to modify, since there might be multiple CustomNonbondedForces. |
I'm just worried this will be limiting by requiring that I wrap another integrator in case I want to do something more complex. For example, if I am trying to do some other form of NCMC as well, I might get into trouble if the constant-pH moves can't be temporarily disabled. What about allowing a
|
That would be for the case where you aren't doing NCMC, so the moves should just be attempted as a direct jump in parameters? Yes, that's about what I was figuring on. |
So we can discuss it here:
The text was updated successfully, but these errors were encountered: