Better support for interactions with DNA #8

Closed
cbouy opened this issue Jan 13, 2021 · 1 comment

cbouy commented Jan 13, 2021

The "residue" letter codes for DNA are slightly different from the convention used for proteins: DNA uses 2-letter codes instead of 3-letter ones. While this isn't a problem for creating prolif.Molecule objects and generating a fingerprint, it causes issues when indexing the molecule by string to look for a particular residue, as shown below:

>>> import prolif as plf
>>> import MDAnalysis as mda
>>> u = mda.Universe("polyAT_vac.prmtop")
>>> dmol = plf.Molecule.from_mda(u)
>>> dmol.residues.name
array(['DA5', 'DA', 'DA', 'DA', 'DA', 'DA', 'DA', 'DA', 'DA', 'DA3',
       'DT5', 'DT', 'DT', 'DT', 'DT', 'DT', 'DT', 'DT', 'DT', 'DT3'],
      dtype=object)
>>> dmol.residues.number
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20], dtype=uint16)
>>> dmol["DA2"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-33-840ade3ef100> in <module>
----> 1 dmol["DA2"]

~/projects/ProLIF/prolif/molecule.py in __getitem__(self, key)
    118 
    119     def __getitem__(self, key):
--> 120         return self.residues[key]
    121 
    122     def __repr__(self): # pragma: no cover

~/projects/ProLIF/prolif/residue.py in __getitem__(self, key)
    181         elif isinstance(key, str):
    182             key = ResidueId.from_string(key)
--> 183             return self.data[key]
    184         elif isinstance(key, ResidueId):
    185             return self.data[key]

KeyError: .D

The problem comes from the ResidueId.from_string method, which converts DA2 to ResidueId(name=None, number=None, chain="D") instead of a residue named DA with number 2.

You can still index the molecule correctly by using a ResidueId directly instead of a string: dmol[plf.ResidueId("DA", 2)]. It's less user-friendly but also less error-prone, since some residue names actually contain numbers, like DA5, which corresponds to the first residue.
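To illustrate the parsing problem, here is a minimal standalone sketch of a residue-string parser that accepts names of 1 to 3 letters. It is not ProLIF's actual implementation: `ParsedResidue`, `parse_residue` and the regex are hypothetical names and choices made up for this example, and the "name, number, optional .chain" layout is an assumption.

```python
import re
from typing import NamedTuple, Optional


class ParsedResidue(NamedTuple):
    """Hypothetical container, loosely modelled on the fields shown above."""
    name: Optional[str]
    number: Optional[int]
    chain: Optional[str]


# Assumed layout: up to 3 letters, then digits, then an optional ".chain".
# Every part is optional so that partial identifiers still parse.
_PATTERN = re.compile(r"^([A-Za-z]{1,3})?(\d+)?(?:\.(\w+))?$")


def parse_residue(text: str) -> ParsedResidue:
    """Split a string such as 'DA2' or 'ALA10.A' into name, number and chain."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Cannot parse residue string: {text!r}")
    name, number, chain = match.groups()
    return ParsedResidue(
        name.upper() if name else None,
        int(number) if number else None,
        chain,
    )


print(parse_residue("DA2"))      # 2-letter DNA code followed by a number
print(parse_residue("ALA10.A"))  # 3-letter protein code with a chain
print(parse_residue("DA5"))      # ambiguous: read here as name "DA", number 5
```

The last call shows why string indexing stays ambiguous for names that end in a digit: a pattern like this one cannot tell the residue name DA5 apart from residue DA number 5, so passing an explicit ResidueId remains the safer option.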

PS: prmtop file from the Amber tutorials

@cbouy cbouy changed the title Support for interactions with DNA Better support for interactions with DNA Jan 13, 2021
@cbouy cbouy self-assigned this Jan 13, 2021
@cbouy cbouy added the bug Something isn't working label Jan 13, 2021

cbouy commented Jan 13, 2021

Just a code snippet for DNA-DNA interactions:
[Image: Capture]

cbouy added a commit that referenced this issue Jan 13, 2021
- Fixes the `ResidueId.from_string` regex to support residue codes of 1 to 3 letters (RNA, DNA and proteins)
@cbouy cbouy closed this as completed Jan 13, 2021