Update docs source
fjclark committed May 10, 2024
1 parent 3fdf47a commit 86615ce
Showing 7 changed files with 240 additions and 3 deletions.
22 changes: 22 additions & 0 deletions docs/cli_tutorials.rst
@@ -0,0 +1,22 @@
CLI Tutorials
==============

maize-biosimspace provides a large number of nodes for performing common operations such as parameterisation, minimisation, equilibration, production molecular
dynamics, and alchemical free energy calculations. Nodes are generally engine-specific as maize requires that nodes have a list of ``required_callables`` whose presence
in the environment is checked before the node is run. To see a list of all available nodes, type ``bss_`` and hit tab to list them. To see the options for each, pass
the ``-h`` flag, e.g.

.. code-block:: bash

    bss_parameterise -h

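
If tab completion is not available (for example inside a script), the same listing can be approximated by scanning ``PATH`` for executables whose names start with ``bss_``. The following is a minimal sketch using only the Python standard library; the helper name ``find_commands`` is hypothetical and is not part of maize-biosimspace.

```python
import os
from pathlib import Path
from typing import Optional


def find_commands(prefix: str, search_path: Optional[str] = None) -> list[str]:
    """Return the sorted names of executables on the search path that start with ``prefix``."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    names: set[str] = set()
    for directory in search_path.split(os.pathsep):
        if not directory or not Path(directory).is_dir():
            continue
        try:
            entries = list(Path(directory).iterdir())
        except OSError:
            # Skip directories we are not allowed to read
            continue
        for entry in entries:
            if (
                entry.name.startswith(prefix)
                and entry.is_file()
                and os.access(entry, os.X_OK)
            ):
                names.add(entry.name)
    return sorted(names)


if __name__ == "__main__":
    # Roughly what typing ``bss_`` and hitting tab would show
    for name in find_commands("bss_"):
        print(name)
```

Each printed name can then be called with ``-h`` to see its options.
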
These tutorials give specific examples of using BioSimSpace maize nodes to run production molecular dynamics, to create equilibrated systems starting from unparameterised
input structures, and to run absolute binding free energy calculations starting from a protein PDB file and an SDF file containing multiple ligands.

.. toctree::
    :maxdepth: 1

    tutorial_cli_production_md
    tutorial_cli_system_preparation
    tutorial_cli_abfe

7 changes: 4 additions & 3 deletions docs/index.rst
@@ -17,10 +17,11 @@ maize-biosimspace

 .. toctree::
    :hidden:
-   :maxdepth: 1
-   :caption: Examples
+   :maxdepth: 2
+   :caption: Tutorials

-   production-md
+   cli_tutorials
+   python_tutorials

 .. toctree::
    :hidden:
13 changes: 13 additions & 0 deletions docs/python_tutorials.rst
@@ -0,0 +1,13 @@
Python Tutorials
================

Nodes and subgraphs from maize-biosimspace can be combined into more complex
workflows through Python, and can benefit from other nodes available through
maize-contrib. You can also use pre-made maize-biosimspace workflows through Python
to gain fuller control over all the options (compared to the CLI) and to write
more reusable scripts.

.. toctree::
    :maxdepth: 1

    tutorial_python_production

55 changes: 55 additions & 0 deletions docs/tutorial_cli_abfe.rst
@@ -0,0 +1,55 @@
Absolute Binding Free Energy Calculations
=========================================

Here, we'll run a quick absolute binding free energy calculation for benzene bound to T4 lysozyme. For this, we'll
use the ``bss_abfe_multi_isomer`` workflow through its CLI, which requires only an SDF file containing all required
ligands and a PDB file of the protein. Check the options with:

.. code-block:: bash

    bss_abfe_multi_isomer -h

Copy over the required input:

.. code-block:: bash

    mkdir bss_abfe_example
    cd bss_abfe_example
    cp ../tests/data/benzene.sdf .
    cp ../tests/data/t4l.pdb .

Now, let's run a very short (but still fairly expensive)
ABFE calculation with 2 replicates:

.. code-block:: bash

    bss_abfe_multi_isomer --lig_sdfs_file benzene.sdf \
                          --protein_pdb t4l.pdb \
                          --ligand_force_field gaff2 \
                          --protein_force_field ff14SB \
                          --abfe_timestep 4 \
                          --abfe_n_replicates 2 \
                          --abfe_runtime 0.1 \
                          --abfe_runtime_generate_boresch_restraint 0.1 \
                          --prep_runtime_restrained_npt 0.05 \
                          --prep_runtime_unrestrained_npt 0.05 \
                          --abfe_estimator TI \
                          --results_file_name abfe_out

The ``abfe_out`` file should show results around -4 kcal/mol.

Running through the command line with this many arguments is unwieldy,
and some options aren't available through the CLI (for example, the lambda
spacing). It's likely better to write a short Python script that uses the
pre-made workflow directly: import the workflow factory, customise the
options, and run:

.. code-block:: python

    from maize.graphs.exs.biosimspace.afe import getabfe_multi_isomer_workflow

    workflow = getabfe_multi_isomer_workflow()
    # Set workflow options...
    # Run
    workflow.execute()

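
If you would rather keep using the CLI, the long argument list can at least be kept in one readable place by assembling it in Python. This is a minimal sketch rather than anything provided by maize-biosimspace: ``build_command`` and ``run`` are hypothetical helpers, and the option names simply mirror the flags used in the CLI call above, with the protein file set to the ``t4l.pdb`` copied earlier.

```python
import shlex
import subprocess


def build_command(executable: str, options: dict[str, object]) -> list[str]:
    """Turn {"name": value} pairs into ["executable", "--name", "value", ...]."""
    command = [executable]
    for name, value in options.items():
        command.extend([f"--{name}", str(value)])
    return command


def run(command: list[str], dry_run: bool = True) -> None:
    """Print the command and, unless dry_run is set, execute it."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


# Flags taken from the CLI example in this tutorial
abfe_options = {
    "lig_sdfs_file": "benzene.sdf",
    "protein_pdb": "t4l.pdb",
    "ligand_force_field": "gaff2",
    "protein_force_field": "ff14SB",
    "abfe_timestep": 4,
    "abfe_n_replicates": 2,
    "abfe_runtime": 0.1,
    "abfe_runtime_generate_boresch_restraint": 0.1,
    "prep_runtime_restrained_npt": 0.05,
    "prep_runtime_unrestrained_npt": 0.05,
    "abfe_estimator": "TI",
    "results_file_name": "abfe_out",
}

command = build_command("bss_abfe_multi_isomer", abfe_options)

# Dry run by default: only prints the command. Pass dry_run=False to launch
# the (expensive) calculation.
run(command)
```

Keeping the options in a dictionary makes it easy to change a single value, such as the number of replicates, without retyping the whole command.
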
30 changes: 30 additions & 0 deletions docs/tutorial_cli_production_md.rst
@@ -0,0 +1,30 @@
Production Molecular Dynamics
=============================

Here, we'll run production molecular dynamics on a protein-ligand complex. To check the available production CLIs, type ``bss_production`` and hit tab:

.. code-block:: bash

    bss_production_gromacs     bss_production_pmemd       bss_production_pmemd_cuda  bss_production_sander      bss_production_somd

We'll pick GROMACS. To check the available arguments and defaults, run:

.. code-block:: bash

    bss_production_gromacs -h

We'll run a quick 0.1 ns of production molecular dynamics on the protein-ligand complex included with maize-biosimspace for testing. We'll specify the output
name to be ``gmx_md_out`` and we'll save all of the intermediate files (including input scripts, logs, and trajectory files) to a subdirectory in the current
working directory by specifying ``--dump_to .``:

.. code-block:: bash

    mkdir gmx_md_example
    cd gmx_md_example
    cp ../tests/data/complex.* .
    bss_production_gromacs --inp complex.prm7 complex.rst7 --runtime 0.1 --save_name gmx_md_out --dump_to .

You should now have the final coordinate file, ``gmx_md_out.rst7``, a copy of the input topology file ``gmx_md_out.prm7``, and a sub-directory containing all of the
intermediate files. Note that despite running through GROMACS, we were able to pass in AMBER files as input. This is because BioSimSpace automatically converts
between file formats (using Sire under the hood).
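
A quick way to confirm that the run produced the files described above is to check for them explicitly. The following is a minimal sketch: ``missing_outputs`` is a hypothetical helper, and the file names are simply the save name from this tutorial combined with the ``.rst7`` and ``.prm7`` suffixes mentioned above.

```python
from pathlib import Path


def missing_outputs(
    save_name: str,
    directory: Path = Path("."),
    suffixes: tuple[str, ...] = (".rst7", ".prm7"),
) -> list[Path]:
    """Return the expected output files that do not exist in ``directory``."""
    expected = [directory / f"{save_name}{suffix}" for suffix in suffixes]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_outputs("gmx_md_out")
    if missing:
        print("Missing output files:", ", ".join(str(path) for path in missing))
    else:
        print("All expected output files are present.")
```
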

53 changes: 53 additions & 0 deletions docs/tutorial_cli_system_preparation.rst
@@ -0,0 +1,53 @@
System Preparation
==================

Often, we have structure files for a protein and/or ligand and we would like to parameterise, solvate, minimise, heat, and equilibrate them to obtain systems
suitable for production molecular dynamics simulations or free energy calculations. While individual nodes are provided for all of these steps,
maize-biosimspace also provides CLIs for two complete system preparation workflows: ``bss_system_prep_free``, which is designed to prepare a ligand in a box
of water (but can also be used for an apo protein), and ``bss_system_prep_bound``, which is designed to set up protein-ligand complexes.

Here, we'll prepare the complex of T4 lysozyme L99A bound to benzene, a very common test system for absolute binding free energy calculations. First, we'll copy
over the required input:

.. code-block:: bash

    mkdir sysprep_bound_example
    cd sysprep_bound_example
    cp ../tests/data/benzene.sdf .
    cp ../tests/data/t4l.pdb .

.. tip::

    This PDB file has been sanitised and will work with BioSimSpace first time, but often PDB files will require some tweaking before they are accepted by ``tleap`` (which
    BioSimSpace uses behind the scenes). The recommended workflow is:

    * Clean your unsanitised PDB file using pdb4amber, e.g. ``pdb4amber -i protein.pdb -o protein_sanitised.pdb``
    * Attempt to run the workflow below
    * If the workflow raises an error, attempt parameterisation directly with ``tleap`` to get more detailed error messages. E.g., type ``tleap``, then

      .. code-block:: bash

          source leaprc.protein.ff14SB
          source leaprc.water.tip3p
          # Loading an unsanitised pdb will likely raise an error
          prot = loadpdb protein_sanitised.pdb
          saveamberparm prot protein.parm7 protein.rst7
          savepdb prot protein_fully_sanitised.pdb

    * If the above fails, this is often due to residue or atom names which do not match the templates. Read the errors to find out which residues or atoms are causing the issues, then check the expected names in the library which was loaded by ``source leaprc.protein.ff14SB``, e.g. ``cat $AMBERHOME/dat/leap/lib/amino12.lib``. Rename the offending atoms or residues and repeat the above step.

    BioSimSpace is very fussy about parameterisation and will fail if tleap raises any warnings. To get round this, run the tleap script above and use the
    ``protein_fully_sanitised.pdb`` file as your input, which will not raise any errors.
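
Before going through ``tleap``, it can help to get a quick overview of which residue names in a PDB file are likely to cause template mismatches. The sketch below reads the residue name from columns 18 to 20 of ``ATOM`` and ``HETATM`` records and counts atoms whose residue name is not in a known set. Both the helper ``count_unknown_residues`` and the ``KNOWN_RESIDUES`` set are illustrative assumptions: the set only contains the 20 standard amino acid names, so names that the force field libraries do define (for example protonation-state variants) must be added by hand.

```python
from collections import Counter
from typing import Iterable

# Assumed, illustrative set: the 20 standard amino acid names only.
# Extend it with the residue names defined in the libraries you load in tleap.
KNOWN_RESIDUES = frozenset(
    "ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE "
    "LEU LYS MET PHE PRO SER THR TRP TYR VAL".split()
)


def count_unknown_residues(
    pdb_lines: Iterable[str], known: frozenset = KNOWN_RESIDUES
) -> Counter:
    """Count atoms per residue name that is not in ``known``."""
    counts: Counter = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Residue name occupies columns 18-20 (1-indexed) in the PDB format
            resname = line[17:20].strip()
            if resname not in known:
                counts[resname] += 1
    return counts


if __name__ == "__main__":
    example = [
        "ATOM      1  N   MET A   1",
        "ATOM      2  CA  HSD A   2",
        "HETATM    3  C1  BNZ A 200",
    ]
    for resname, n_atoms in count_unknown_residues(example).items():
        print(f"{resname}: {n_atoms} atom(s) with a residue name outside the known set")
```

A flagged name is not necessarily an error (ligands and waters will show up too); it is only a pointer to where to look first when ``tleap`` complains.
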

To run system preparation for our protein-ligand complex, we'll use the ``bss_system_prep_bound`` CLI, saving the output system to ``t4l_benzene_complex_equilibrated``
and using the gaff2 and ff14SB force fields. There are a large number of other parameters which can be modified (see ``bss_system_prep_bound -h``) but we'll run the
defaults for now.

.. note::

    Make sure that you have access to a GPU locally, or have configured maize to submit to a queue with GPU access (see :doc:`configuration`).

.. code-block:: bash

    bss_system_prep_bound --inp benzene.sdf --protein_pdb t4l.pdb --ligand_force_field gaff2 --protein_force_field ff14SB --save_name t4l_benzene_complex_equilibrated

63 changes: 63 additions & 0 deletions docs/tutorial_python_production.rst
@@ -0,0 +1,63 @@
Creating a Workflow with Production MD
=======================================

Here, we'll look at a basic example of a custom workflow which
uses ``pmemd.cuda`` to run some production MD. In reality, you
would want to add some extra steps before or afterwards:

.. code-block:: python

    """Run production Molecular Dynamics using PMEMD.CUDA through BioSimSpace."""

    from pathlib import Path

    from maize.core.workflow import Workflow
    from maize.steps.exs.biosimspace import ProductionPmemdCuda
    from maize.steps.io import LoadData, Return
    from maize.utilities.execution import JobResourceConfig

    # Build the graph
    flow = Workflow(name="Prod_BSS_AMBER_Test", cleanup_temp=False, level="debug")

    # Add the nodes
    load_sys = flow.add(LoadData[list[Path]])
    prod_pmemd = flow.add(
        ProductionPmemdCuda,
        name="Production_Amber",
        parameters={
            "runtime": 1.0,  # ns
        },
    )
    retu = flow.add(Return[list[Path]])

    # Set parameters
    load_sys.data.set(
        [
            Path(
                "< path to complex.prm7>"  # CHANGEME
            ),
            Path(
                "< path to complex.rst7>"  # CHANGEME
            ),
        ]
    )

    # Connect the nodes
    flow.connect(load_sys.out, prod_pmemd.inp)
    flow.connect(prod_pmemd.out, retu.inp)

    # Check and run!
    flow.check()
    flow.visualize()
    flow.execute()

    mols = retu.get()

    # Load a BioSimSpace system from the returned paths
    import BioSimSpace as BSS

    sys = BSS.IO.readMolecules([str(mols[0]), str(mols[1])])
    print(40 * "#")
    print(sys)
    # In reality, you would do something here...
