Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential problem with **get_specific_ion_radius**: type_energies and atoms in model may be indexed differently when ions are not processed into type_energies #1016

Open
rwxayheee opened this issue Sep 13, 2024 · 2 comments

Comments

@rwxayheee
Copy link

rwxayheee commented Sep 13, 2024

Hi all,

I have a question regarding function mmtbx.model.model.get_specific_ion_radius, which is used by probe's helper function getExtraAtomInfo for ions found from a model (manager), such as sodium cation (NA), chloride anion (CL), etc.

I noticed something unusual about get_specific_ion_radius which might be causing what I saw in #1011 as this was used in reduce's optimization routine. Here's a short example for the potential problem:

Input Block [1]

# Download PDB File
pdb_token = "1iep"
! curl "http://files.rcsb.org/view/{pdb_token}.pdb" -o "{pdb_token}.pdb"

# Get Model's Hierarchy
from iotbx.data_manager import DataManager

dm = DataManager()
filename_pdb = "1iep.pdb"
dm.process_model_file(filename_pdb)
model = dm.get_model(filename=filename_pdb)
hierarchy = model.get_hierarchy()
print(hierarchy.composition())

Output from [1]

group_args
  n_atoms                        : 4710
  n_chains                       : 6
  n_hd                           : 0
  n_nucleotide                   : 0
  n_nucleotide_atoms             : 0
  n_other                        : 8
  n_other_atoms                  : 80
  n_protein                      : 548
  n_protein_atoms                : 4458
  n_water                        : 172
  n_water_atoms                  : 172
  other_cnts                     : Counter({' CL': 6, 'STI': 2})

Input Block [2]

# Set some Environment Variables...
import os

os.environ["MMTBX_CCP4_MONOMER_LIB"] = "/Users/amyhe/Downloads/geostd"
# This came from https://github.com/phenix-project/geostd.git
os.environ["CLIBD_MON"] = "/Users/amyhe/Downloads/chem_data/mon_lib"
# This came from https://gitlab.com/phenix_project/chem_data

# Make Restraints...
import mmtbx
from mmtbx.hydrogens import reduce_hydrogen

# with reduce pdb interpretation
p = reduce_hydrogen.get_reduce_pdb_interpretation_params(use_neutron_distances=True)
model.process(make_restraints=True, pdb_interpretation_params=p)
print(f"Number of atoms in model._type_energies: {len(model._type_energies)}")
print(set(model._type_energies))

# with default pdb interpretation
p = mmtbx.model.manager.get_default_pdb_interpretation_params()
model.process(make_restraints=True, pdb_interpretation_params=p)
print(f"Number of atoms in model._type_energies: {len(model._type_energies)}")
print(set(model._type_energies))

Output from [2]

Number of atoms in model._type_energies: 4704
{'NH2', 'CR56', 'NC2', 'OC', 'CH1', 'CH3', 'NC1', 'OH1', 'CR6', 'N', 'O', 'C', 'NH1', 'CR16', 'CH2', 'CR15', 'NR15', 'NT', 'NT3', 'S', 'CR5', 'OH2'}
Number of atoms in model._type_energies: 4704
{'NH2', 'CR56', 'NC2', 'OC', 'CH1', 'CH3', 'NC1', 'OH1', 'CR6', 'N', 'O', 'C', 'NH1', 'CR16', 'CH2', 'CR15', 'NR15', 'NT', 'NT3', 'S', 'CR5', 'OH2'}

From here I found that the 6 CL atoms never entered model._type_energies with either interpretation (all of them are correctly tagged as Ion), and as a result, the CL ions get incorrect types because type_energies and atoms are indexed differently. But I'm a bit lost with how to repair the type_energies atom types, or to make i_seq in sync with the atom types.

Many thanks for looking into this issue and your kind advice. Please let me know if you need any additional details or if there’s anything else I can do to help resolve it.

@rwxayheee rwxayheee changed the title Potential problem with **get_specific_ion_radius**: type_energies and atoms in model may be indexed differently Potential problem with **get_specific_ion_radius**: type_energies and atoms in model may be indexed differently when ions are not processed into type_energies Sep 13, 2024
@nwmoriarty
Copy link
Contributor

nwmoriarty commented Sep 17, 2024 via email

@rwxayheee
Copy link
Author

rwxayheee commented Sep 23, 2024

Hi @nwmoriarty

Thanks so much for your kind reply, sorry for my late response. I tried different ways of manipulating the monomer library but they didn't work.. I'm a bit unfamiliar with the code base or any version change of the libraries, but I have a walkaround using the ad_hoc_single_atom_residue class in pdb_interpretation.py -

The cause of the problem was -
When monomer mapping returns None - the ith item corresponding to atoms[i_seq] in type_energies and type_h_bonds are skipped and may lead to the different indexing and change in lengths of model._type_energies and model._type_h_bonds. If the indexes are to be reused (for example by mmtbx.model.model.get_specific_ion_radius), the length needs to stay the same (at least with placeholders like None or False).

The walkaround is -
A CL (chloride ion) is interpreted as an ad_hoc_single_atom_residue. The energy_type for its atom can be registered in a similar manner to this:

nonbonded_energy_type_registry.assign_directly(
i_seq=i_seq, symbol=ad_hoc.energy_type)

While nonbonded_energy_type_registry.assign_directly assigns only symbols:

def assign_directly(self, i_seq, symbol):
self.symbols[i_seq] = symbol

I use ad_hoc.energy_type to assign energy_type. I don't know a quick way to get the h_bond_type though, but with the exception of CL and F, all other ad_hoc_single_atom_residues would have the type N.

For future reference -
I'm using exactly this ener_lib.cif from this repository:
https://github.com/phenix-project/geostd.git

There seem to be some version differences when I install cctbx-base by conda/mamba with python=3.10 vs. python=3.12. Here's how I installed cctbx-base (for the issue I reported):
micromamba create -n cctbx-nightly -c cctbx-nightly -c conda-forge cctbx-base python=3.12

I can close this issue if this is not reproducible with the full copy of Phenix. Thanks again for your time and kind advice, as always.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants