Two hydrogen atoms with the same atom name appear in the same residue #12

Closed
padix-key opened this issue Jul 25, 2024 · 6 comments · Fixed by #15
Labels
bug Something isn't working

Comments

@padix-key
Member

In case a residue to be hydrogenated has an additional heavy atom or a missing heavy atom, AtomNameLibrary.generate_hydrogen_names() does not ensure unique hydrogen atom names.

Example:

import biotite.structure.info as info
import hydride

atoms = info.residue("ALA")
# Remove a heavy atom to enforce an unusual hydrogenation at its position instead
atoms = atoms[atoms.atom_name != "OXT"]

atoms = atoms[atoms.element != "H"]
hydrogenated_atoms, _ = hydride.add_hydrogen(atoms)
print(hydrogenated_atoms)

Output:

            0  ALA N      N        -0.970    0.490    1.500
            0  ALA CA     C         0.260    0.420    0.690
            0  ALA C      C        -0.090    0.020   -0.720
            0  ALA O      O        -1.060   -0.680   -0.920
            0  ALA CB     C         1.200   -0.620    1.300
            0  ALA H      H        -0.969    1.384    1.958
            0  ALA H2     H        -1.744    0.507    0.845
            0  ALA HA     H         0.741    1.396    0.720
            0  ALA H      H         0.544    0.376   -1.518    # This atom replaces `OXT` and has the duplicate name
            0  ALA HB1    H         1.482   -0.342    2.316
            0  ALA HB2    H         2.099   -0.657    0.682
            0  ALA HB3    H         0.735   -1.599    1.314

This bug should appear only rarely, as residues with missing or additional heavy atoms usually do not make sense in the first place. Still, the hydrogen atom names should be unique per residue.
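
A quick way to check a structure for this problem is to count how often each atom name occurs per residue. The following is a minimal sketch in plain Python, so it does not rely on any particular library API: `find_duplicate_atom_names` is a hypothetical helper, and the `(chain_id, res_id, atom_name)` tuple layout is an assumption made for illustration. The atom names are taken from the output above.

```python
from collections import Counter


def find_duplicate_atom_names(atoms):
    """Return the (chain_id, res_id, atom_name) keys that occur more than once.

    `atoms` is an iterable of (chain_id, res_id, atom_name) tuples
    (hypothetical layout, chosen for this sketch).
    """
    counts = Counter(atoms)
    return sorted(key for key, count in counts.items() if count > 1)


# Atom names of the hydrogenated ALA residue from the output above
ala_names = ["N", "CA", "C", "O", "CB", "H", "H2", "HA", "H", "HB1", "HB2", "HB3"]
# The residue has no chain ID and residue ID 0
ala_atoms = [("", 0, name) for name in ala_names]

duplicates = find_duplicate_atom_names(ala_atoms)
print(duplicates)  # [('', 0, 'H')]
```

An empty result would mean that every atom name is unique within its residue.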

@padix-key added the bug label Jul 25, 2024
@dargen3

dargen3 commented Aug 30, 2024

Hello,

Thank you for hydride! It's nice to work with! I would like to report that I've encountered the same problem with a heteroresidue. I protonated the structure 4AOC with the command:

hydride --infile 4aoc.cif --outfile 4aoc_protonated.cif -v

and it resulted in:

HETATM C C1 . A1Q B 2 1131 . 1131 A1Q B C1 ? -6.216 22.324 -7.128 1 8972
HETATM O O7 . A1Q B 2 1131 . 1131 A1Q B O7 ? -1.874 23.118 -8.327 1 8973
HETATM O O1 . A1Q B 2 1131 . 1131 A1Q B O1 ? -5.669 21.024 -6.804 1 8974
HETATM C C2 . A1Q B 2 1131 . 1131 A1Q B C2 ? -7.511 22.139 -7.944 1 8975
HETATM O O2 . A1Q B 2 1131 . 1131 A1Q B O2 ? -8.323 23.32 -7.857 1 8976
HETATM C C3 . A1Q B 2 1131 . 1131 A1Q B C3 ? -7.281 21.847 -9.428 1 8977
HETATM O O3 . A1Q B 2 1131 . 1131 A1Q B O3 ? -8.484 21.936 -10.214 1 8978
HETATM C C4 . A1Q B 2 1131 . 1131 A1Q B C4 ? -6.282 22.863 -9.946 1 8979
HETATM O O4 . A1Q B 2 1131 . 1131 A1Q B O4 ? -6.068 22.711 -11.359 1 8980
HETATM C C5 . A1Q B 2 1131 . 1131 A1Q B C5 ? -5.041 22.632 -9.113 1 8981
HETATM O O5 . A1Q B 2 1131 . 1131 A1Q B O5 ? -5.268 23.164 -7.815 1 8982
HETATM C C6 . A1Q B 2 1131 . 1131 A1Q B C6 ? -3.803 23.213 -9.74 1 8983
HETATM O O6 . A1Q B 2 1131 . 1131 A1Q B O6 ? -3.701 24.613 -9.428 1 8984
HETATM C C7 . A1Q B 2 1131 . 1131 A1Q B C7 ? -2.609 22.364 -9.287 1 8985
HETATM C C8 . A1Q B 2 1131 . 1131 A1Q B C8 ? -4.444 21.05 -6.054 1 8986
HETATM H H1 . A1Q B 2 1131 . 1131 A1Q B H1 ? -6.4262753 22.830936 -6.179991 1 8987
HETATM H H7 . A1Q B 2 1131 . 1131 A1Q B H7 ? -1.3125027 22.492907 -7.8362956 1 8988
HETATM H H2 . A1Q B 2 1131 . 1131 A1Q B H2 ? -8.026652 21.265627 -7.5301847 1 8989
HETATM H H2 . A1Q B 2 1131 . 1131 A1Q B H2 ? -8.438941 23.513788 -6.910527 1 8990
HETATM H H3 . A1Q B 2 1131 . 1131 A1Q B H3 ? -6.827121 20.853497 -9.51085 1 8991
HETATM H H3 . A1Q B 2 1131 . 1131 A1Q B H3 ? -9.228026 21.71413 -9.627479 1 8992
HETATM H H4 . A1Q B 2 1131 . 1131 A1Q B H4 ? -6.6487293 23.869776 -9.7183275 1 8993
HETATM H H4 . A1Q B 2 1131 . 1131 A1Q B H4 ? -5.9327846 23.59531 -11.741785 1 8994
HETATM H H5 . A1Q B 2 1131 . 1131 A1Q B H5 ? -4.873869 21.55034 -9.068252 1 8995
HETATM H H6 . A1Q B 2 1131 . 1131 A1Q B H6 ? -3.878452 23.07668 -10.824271 1 8996
HETATM H H6 . A1Q B 2 1131 . 1131 A1Q B H6 ? -4.5722694 25.021593 -9.572047 1 8997
HETATM H H7 . A1Q B 2 1131 . 1131 A1Q B H7 ? -1.9559722 22.1669 -10.145719 1 8998
HETATM H H7A . A1Q B 2 1131 . 1131 A1Q B H7A ? -2.9551985 21.40397 -8.8914585 1 8999
HETATM H H8 . A1Q B 2 1131 . 1131 A1Q B H8 ? -3.6677785 21.56298 -6.619452 1 9000
HETATM H H8A . A1Q B 2 1131 . 1131 A1Q B H8A ? -4.130931 20.024744 -5.859006 1 9001
HETATM H H8B . A1Q B 2 1131 . 1131 A1Q B H8B ? -4.5953755 21.559626 -5.10281 1 9002

It's one of the first proteins with a heteroresidue that I protonated, so it's probably a more common problem.

@padix-key
Member Author

padix-key commented Aug 31, 2024

I can confirm the problem, thanks. The reason is that the residue contains both C2 and O2, and the hydrogen atoms bonded to each of them get the name H2 assigned. So you are probably right that this problem may appear quite commonly. To fix this, AtomNameLibrary.generate_hydrogen_names() needs to be updated to blacklist already used names.
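
The blacklisting idea can be sketched as follows. This is only an illustration in plain Python, not the actual hydride implementation: `make_unique_name` is a hypothetical helper, and the fallback scheme of appending an uppercase letter is an assumption, loosely modeled on names such as H7A and H8A in the output above.

```python
import string


def make_unique_name(candidate, used_names):
    """Return `candidate` if it is unused, otherwise `candidate` plus the
    first free uppercase-letter suffix (assumed fallback scheme).

    The returned name is added to `used_names`, so later calls avoid it.
    """
    if candidate not in used_names:
        used_names.add(candidate)
        return candidate
    for suffix in string.ascii_uppercase:
        name = candidate + suffix
        if name not in used_names:
            used_names.add(name)
            return name
    raise ValueError(f"No unused name available for '{candidate}'")


# Names already taken in the residue (a few heavy atoms of A1Q)
used = {"C2", "O2", "C3", "O3"}
# Proposed hydrogen names in order: C2 and O2 both propose "H2",
# C3 and O3 both propose "H3"
proposed = ["H2", "H2", "H3", "H3"]

unique = [make_unique_name(name, used) for name in proposed]
print(unique)  # ['H2', 'H2A', 'H3', 'H3A']
```

Keeping one set of used names per residue means the first hydrogen keeps its regular name and only later collisions receive a suffix.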

@dargen3

dargen3 commented Sep 9, 2024

Hello,

I would like to respectfully ask whether a fix for this issue and for [#14] is planned and, if so, in what timeframe. I would like to use hydride for a PDB database related project, and I am being asked for a date when it will be ready. If you don't have enough time for that, I can try to help with this issue. Thank you for your reply!

@padix-key
Member Author

I assume I will fix them within the next two weeks. Is this sufficient?

@dargen3

dargen3 commented Sep 9, 2024

That's perfectly sufficient. Thank you!

@padix-key
Member Author

padix-key commented Sep 23, 2024

Unfortunately, I have to admit that I will not be able to finish my work on this issue this week, as last week was busy. The new ETA is the end of next week. I am sorry for the inconvenience.
