Merge pull request #23 from chemosim-lab/review1
### Added
- Improved the documentation on how to properly restrict interactions to ignore the protein backbone (Issue #22), how to fix the empty dataframe issue when no bond information is present in the PDB file (Issue #15), how to save the LigNetwork diagram (Issue #21), and some clarifications on using `fp.generate`

### Fixed
- Mixing residue type with interaction type in the interactive legend of the LigNetwork would incorrectly display/hide some residues on the canvas
- MOL2 files starting with a comment (`#`) would lead to an error
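
To illustrate the MOL2 fix listed above, here is a minimal sketch of how a multi-molecule MOL2 file can be split into per-ligand blocks while ignoring leading comment lines. This is not ProLIF's actual implementation: the function name `split_mol2_blocks` and the sample text are hypothetical, and the only assumption about the format is that each molecule starts with a `@<TRIPOS>MOLECULE` record.

```python
def split_mol2_blocks(text):
    """Yield one string per molecule found in a MOL2 file's text.

    Anything before the first @<TRIPOS>MOLECULE record (for example
    header comments starting with '#') is skipped. Comment lines located
    between two molecules stay attached to the preceding block.
    """
    block = []
    for line in text.splitlines(keepends=True):
        if line.startswith("@<TRIPOS>MOLECULE"):
            # a new molecule starts: emit the previous one, if any
            if block:
                yield "".join(block)
            block = [line]
        elif block:
            # only collect lines once the first record has been seen
            block.append(line)
    if block:
        yield "".join(block)


# hypothetical file content starting with comments
sample = (
    "# written by a docking program\n"
    "@<TRIPOS>MOLECULE\nligand_1\n"
    "@<TRIPOS>MOLECULE\nligand_2\n"
)
for mol2_block in split_mol2_blocks(sample):
    print(mol2_block.splitlines()[1])
```

Each block could then be handed to a MOL2 parser of your choice; that step is left out here because it depends on the toolkit in use.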
cbouy authored Aug 2, 2021
2 parents 3de6aa6 + 7d5cd92 commit 88c1f01
Showing 10 changed files with 228 additions and 93 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
Expand Up @@ -6,10 +6,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### Added
- Improved the documentation on how to properly restrict interactions to ignore the
protein backbone (Issue #22), how to fix the empty dataframe issue when no bond
information is present in the PDB file (Issue #15), how to save the LigNetwork diagram
(Issue #21), and some clarifications on using `fp.generate`
### Changed
### Deprecated
### Removed
### Fixed
- Mixing residue type with interaction type in the interactive legend of the LigNetwork
would incorrectly display/hide some residues on the canvas (PR #23)
- MOL2 files starting with a comment (`#`) would lead to an error

## [0.3.3] - 2021-06-11
### Changed
Expand Down
60 changes: 29 additions & 31 deletions docs/notebooks/how-to.ipynb
Expand Up @@ -117,7 +117,7 @@
"metadata": {},
"outputs": [],
"source": [
"fp = plf.Fingerprint()\n",
"fp = plf.Fingerprint([\"Hydrophobic\"])\n",
"fp.hydrophobic(lmol, pmol[\"TYR109.A\"])"
]
},
Expand All @@ -137,10 +137,17 @@
"class Hydrophobic(plf.interactions.Hydrophobic):\n",
" pass\n",
"\n",
"fp = plf.Fingerprint()\n",
"fp = plf.Fingerprint([\"Hydrophobic\"])\n",
"fp.hydrophobic(lmol, pmol[\"TYR109.A\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can then use `fp.run` and other methods as usual."
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand All @@ -165,7 +172,7 @@
" def __init__(self):\n",
" super().__init__(distance=4.0)\n",
" \n",
"fp = plf.Fingerprint()\n",
"fp = plf.Fingerprint([\"Hydrophobic\", \"CustomHydrophobic\"])\n",
"fp.hydrophobic(lmol, pmol[\"TYR109.A\"])"
]
},
Expand All @@ -184,7 +191,6 @@
"metadata": {},
"outputs": [],
"source": [
"fp = plf.Fingerprint([\"Hydrophobic\", \"CustomHydrophobic\"])\n",
"fp.bitvector(lmol, pmol[\"TYR109.A\"])"
]
},
Expand Down Expand Up @@ -249,7 +255,7 @@
" return True, res1_i[0], res2_i[0]\n",
" return False, None, None\n",
"\n",
"fp = plf.Fingerprint()\n",
"fp = plf.Fingerprint([\"CloseContact\"])\n",
"fp.closecontact(lmol, pmol[\"ASP129.A\"])"
]
},
Expand All @@ -261,11 +267,11 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Call `fp.to_dataframe(return_atoms=True)`"
],
"cell_type": "markdown",
"metadata": {}
]
},
{
"cell_type": "markdown",
Expand Down Expand Up @@ -338,6 +344,8 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with docking poses instead of MD simulations\n",
"\n",
Expand All @@ -346,9 +354,7 @@
"Please note that this part of the tutorial is only suitable for interactions between one protein and several ligands, or in more general terms, between one molecule with multiple residues and one molecule with a single residue. This is not suitable for protein-protein or DNA-protein interactions.\n",
"\n",
"Let's start by loading the protein. Here I'm using a PDB file but you can use any format supported by MDAnalysis as long as it contains explicit hydrogens."
],
"cell_type": "markdown",
"metadata": {}
]
},
{
"cell_type": "code",
Expand All @@ -363,11 +369,11 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using an SDF file"
],
"cell_type": "markdown",
"metadata": {}
]
},
{
"cell_type": "code",
Expand All @@ -386,15 +392,15 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Please note that converting the `lig_suppl` to a list is optional (and may not be suitable for large files) as it loads all the ligands in memory, but it makes it easier to track progress with the progress bar.\n",
"\n",
"If you want to calculate the Tanimoto similarity between your docked poses and a reference ligand, here's how to do it.\n",
"\n",
"We first need to generate the interaction fingerprint for the reference, and concatenate it to the previous one"
],
"cell_type": "markdown",
"metadata": {}
]
},
{
"cell_type": "code",
Expand Down Expand Up @@ -449,13 +455,13 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using a MOL2 file\n",
"\n",
"The input mol2 file can contain multiple ligands in different conformations."
],
"cell_type": "markdown",
"metadata": {}
]
},
{
"cell_type": "code",
Expand Down Expand Up @@ -535,20 +541,12 @@
],
"metadata": {
"kernelspec": {
"name": "python385jvsc74a57bd0ed16e0ce086f53f6a3b96f2d7e8fdc3cba2fa42f4f858ca7715a8f0f47550c6a",
"display_name": "Python 3.8.5 64-bit ('prolif': conda)"
"display_name": "Python 3.8.5 64-bit ('prolif': conda)",
"name": "python385jvsc74a57bd0ed16e0ce086f53f6a3b96f2d7e8fdc3cba2fa42f4f858ca7715a8f0f47550c6a"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
"version": ""
}
},
"nbformat": 4,
Expand Down
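
The how-to notebook above mentions computing the Tanimoto similarity between docked poses and a reference ligand. As a rough, dependency-free sketch of what that metric does on interaction bitvectors (the helper name `tanimoto` and the example vectors are hypothetical, and this is not the code used in the notebook):

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("bitvectors must have the same length")
    # bits set in both fingerprints vs. bits set in at least one of them
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # returning 0.0 for two empty fingerprints is a choice made in this sketch
    return both / either if either else 0.0


# hypothetical interaction bitvectors: one reference and two docked poses
reference = [1, 1, 0, 1, 0]
poses = [[1, 1, 0, 0, 0], [0, 0, 1, 0, 1]]
for pose in poses:
    print(tanimoto(reference, pose))
```

In practice the comparison would more likely go through a cheminformatics toolkit's bitvector type; the plain lists here only serve to make the arithmetic explicit.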
120 changes: 82 additions & 38 deletions docs/notebooks/protein-protein_interactions.ipynb
Expand Up @@ -2,132 +2,176 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Protein-protein interactions\n",
"\n",
"This notebook shows how to compute a fingerprint for protein-protein interactions.\n",
"\n",
"Here we will investigate the interactions in a G-protein coupled receptor (GPCR) between a particular helix (called TM3) and the rest of the protein.\n",
"\n",
"This can obviously be applied to proteins that don't belong to the same chain/segment, as long as you can figure out an appropriate [MDAnalysis selection](https://docs.mdanalysis.org/stable/documentation_pages/selections.html)"
]
"This can obviously be applied to proteins that don't belong to the same chain/segment, as long as you can figure out an appropriate [MDAnalysis selection](https://docs.mdanalysis.org/stable/documentation_pages/selections.html)\n",
"\n",
"There is also an example at the end of this tutorial for generating an IFP of PPI without considering the backbone."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"import MDAnalysis as mda\n",
"import prolif as plf"
],
"outputs": [],
"metadata": {
"ExecuteTime": {
"end_time": "2020-03-12T18:24:55.324128Z",
"start_time": "2020-03-12T18:24:54.314383Z"
},
"tags": []
},
"outputs": [],
"source": [
"import MDAnalysis as mda\n",
"import prolif as plf"
]
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# load traj\n",
"u = mda.Universe(plf.datafiles.TOP, plf.datafiles.TRAJ)\n",
"tm3 = u.select_atoms(\"resid 119:152\")\n",
"prot = u.select_atoms(\"protein and not group tm3\", tm3=tm3)\n",
"tm3, prot"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"# prot-prot interactions\n",
"fp = plf.Fingerprint([\"HBDonor\", \"HBAcceptor\", \"PiStacking\", \"PiCation\", \"CationPi\", \"Anionic\", \"Cationic\"])\n",
"fp.run(u.trajectory[::10], tm3, prot)"
]
],
"outputs": [],
"metadata": {
"scrolled": true,
"tags": []
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = fp.to_dataframe()\n",
"df.head()"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# show interactions for a specific ligand residue\n",
"df.xs(\"ARG147.A\", level=\"ligand\", axis=1).head(5)"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# same for a protein residue\n",
"df.xs(\"GLU309.B\", level=\"protein\", axis=1).head(5)"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# display a specific type of interaction\n",
"df.xs(\"Cationic\", level=\"interaction\", axis=1).head(5)"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# calculate the occurrence of each interaction on the trajectory\n",
"occ = df.mean()\n",
"# restrict to the frequent ones\n",
"occ.loc[occ > 0.3]"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# regroup all interactions together and do the same\n",
"g = (df.groupby(level=[\"ligand\", \"protein\"], axis=1)\n",
" .sum()\n",
" .astype(bool)\n",
" .mean())\n",
"g.loc[g > 0.3]"
]
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Ignoring backbone interactions\n",
"\n",
"In some cases, you might want to dismiss backbone interactions. While it might be tempting to just modify the MDAnalysis selection with `\"protein and not backbone\"`, this won't work as expected and will lead to adding charges where the backbone was bonded to the sidechain.\n",
"However, there is a temporary workaround (which will be directly included in the code in the near future):"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"from rdkit import Chem\n",
"from rdkit.Chem import AllChem\n",
"from tqdm.auto import tqdm\n",
"\n",
"# remove backbone\n",
"backbone = Chem.MolFromSmarts(\"[C^2](=O)-[C;X4](-[H])-[N;+0]\")\n",
"fix_h = Chem.MolFromSmarts(\"[H&D0]\")\n",
"\n",
"def remove_backbone(atomgroup):\n",
" mol = plf.Molecule.from_mda(atomgroup)\n",
" mol = AllChem.DeleteSubstructs(mol, backbone)\n",
" mol = AllChem.DeleteSubstructs(mol, fix_h)\n",
" return plf.Molecule(mol)\n",
"\n",
"# generate IFP\n",
"ifp = []\n",
"for ts in tqdm(u.trajectory[::10]):\n",
" tm3_mol = remove_backbone(tm3)\n",
" prot_mol = remove_backbone(prot)\n",
" data = fp.generate(tm3_mol, prot_mol)\n",
" data[\"Frame\"] = ts.frame\n",
" ifp.append(data)\n",
"df = plf.to_dataframe(ifp, fp.interactions.keys())\n",
"df.head()"
],
"outputs": [],
"metadata": {}
}
],
"metadata": {
"hide_input": false,
"kernelspec": {
"name": "python385jvsc74a57bd0ed16e0ce086f53f6a3b96f2d7e8fdc3cba2fa42f4f858ca7715a8f0f47550c6a",
"display_name": "Python 3.8.5 64-bit ('prolif': conda)"
"display_name": "Python 3.8.5 64-bit ('prolif': conda)",
"name": "python385jvsc74a57bd0ed16e0ce086f53f6a3b96f2d7e8fdc3cba2fa42f4f858ca7715a8f0f47550c6a"
},
"language_info": {
"codemirror_mode": {
Expand Down
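
The protein-protein notebook above computes how often each interaction occurs along the trajectory, first per interaction and then per residue pair. The plain-Python sketch below mirrors that logic without pandas, to make the two steps explicit. The function names and the toy per-frame data are hypothetical; only the residue labels are borrowed from the notebook.

```python
from collections import defaultdict


def occurrence(frames):
    """Fraction of frames in which each (ligand, protein, interaction) is present."""
    if not frames:
        return {}
    counts = defaultdict(int)
    for frame in frames:
        for key, present in frame.items():
            counts[key] += bool(present)
    return {key: n / len(frames) for key, n in counts.items()}


def pair_occurrence(frames):
    """Fraction of frames in which a residue pair shows at least one interaction."""
    if not frames:
        return {}
    counts = defaultdict(int)
    for frame in frames:
        # a pair is counted once per frame, however many interactions it has
        pairs = {(lig, prot) for (lig, prot, _), present in frame.items() if present}
        for pair in pairs:
            counts[pair] += 1
    return {pair: n / len(frames) for pair, n in counts.items()}


# hypothetical toy data: one dict per analysed frame
frames = [
    {("ARG147.A", "GLU309.B", "Cationic"): True},
    {("ARG147.A", "GLU309.B", "Cationic"): True,
     ("ARG147.A", "GLU309.B", "HBDonor"): True},
    {("ARG147.A", "GLU309.B", "HBDonor"): True},
    {},
]
# keep only the frequent contacts, using the same 0.3 threshold as the notebook
frequent = {k: v for k, v in pair_occurrence(frames).items() if v > 0.3}
print(frequent)
```

Grouping by residue pair before averaging avoids counting a frame twice when the same pair forms several interaction types at once, which is what the `.sum().astype(bool).mean()` chain achieves in the notebook.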