Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HydrogenBondAnalysis when no atom pair is with 3A #1325

Closed
wants to merge 12 commits into from
Closed

Fix HydrogenBondAnalysis when no atom pair is with 3A #1325

wants to merge 12 commits into from

Conversation

xiki-tempula
Copy link
Contributor

@xiki-tempula xiki-tempula commented Apr 28, 2017

When there is no atom pair between the two selections of HydrogenBondAnalysis that are within 3A, HydrogenBondAnalysis throws an exception instead of giving a empty list.
Code to reproduce the error

import MDAnalysis as mda
import MDAnalysis.analysis.hbonds

text = '''TITLE     Two atoms far away
CRYST1   52.763   52.763   52.763  90.00  90.00  90.00 P 1           1
MODEL         1
ATOM      1  N   LEU     1      32.310  13.778  14.372  1.00  0.00      SYST N 0
ATOM      2  OW  SOL     2       3.024   4.456   4.147  1.00  0.00      SYST H 0'''
with open('temp.pdb', 'w') as f:
    f.write(text)
u = mda.Universe('temp.pdb')
h = MDAnalysis.analysis.hbonds.HydrogenBondAnalysis(u, 'resname SOL', 'protein')
h.run()

Gives AttributeError: 'list' object has no attribute 'select_atoms'

After analysing the code, it seems that there is no check after filtering to make sure that a selection of atom can be done.

This PR fixed this problem.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@orbeckst
Copy link
Member

@xiki-tempula thanks for report and fix.

You probably need to rebase your branch against develop and force push.

I also don't understand why travis did not run; there should be a proper unit test run. Hopefully that should show up after you push again.

As a minor thing: If you find a bug, raise an issue and describe the bug (as you did in the first part of your report). Then open a PR that references the issue. The reason is that sometimes PRs get thrown away or changed. Then you are also loosing a good reference point. It's generally considered good practice to separate the issue report from the PR (or PRs). No big deal, but for the future.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, especially that it comes with a test case. Revisions:

  • I think we need the same check for _s1: can you comment and if necessary implement, too?
  • suggestions for code clean-up

Btw, instead of bundling a mini-file you can probably use the new "stream reading" capability

import MDAnalysis as mda
import StringIO

pdb = '''TITLE     Two atoms far away
CRYST1   52.763   52.763   52.763  90.00  90.00  90.00 P 1           1
MODEL         1
ATOM      1  N   LEU     1      32.310  13.778  14.372  1.00  0.00      SYST N 0
ATOM      2  OW  SOL     2       3.024   4.456   4.147  1.00  0.00      SYST H 0'''

u = mda.Universe(StringIO.StringIO(pdb), format="pdb")

and then you don't need to store a tiny file.


* 0.16.1

Enhancements
* Universe now works with StringIO objects for topologies and trajectories.
* Residues with the same residue ids are not merged by default now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots of pieces that should not show up in your PR, they come from elsewhere. Something went wrong when you merged. Rebase against develop.

self._s2_donors_h[i] = tmp
self.logger_debug("Selection 2 donors: {0:d}".format(len(self._s2_donors)))
self.logger_debug("Selection 2 donor hydrogens: {0:d}".format(len(self._s2_donors_h)))
if self._s2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need a similar check for _s1?

You can probably re-use your test case by just switching selections.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, it would be cleaner to just return here instead of doing a big if block. If you re-order the code so that all data structures are initialized and then use the if not self._s2 where the logger is already reporting this problem and return from there.

         self._s2_donors = {}
         self._s2_donors_h = {}
         self._s2_acceptors = {}

         if not self._s2:
            # nothing to do here, just keep empty data structures
            logger.warn('Selection 2 "{0}" did not select any atoms.'.format(
                 str(self.selection2)[:80]))
            return

        if self.selection1_type in ('donor', 'both'):
            self._s2_acceptors = self._s2.select_atoms(...)
            ...

This is reasonably clean (and the diff will also be smaller).

@xiki-tempula
Copy link
Contributor Author

xiki-tempula commented Apr 29, 2017

@orbeckst Thank you for your advice.
I have checked and it seems that this bug occurred because there was no check after filtering.
There are checks at the beginning which dealt with an empty selection. Then the script compares the pairwise distance between s1 and s2 and then get rid of the atoms in s2 which are too far away from s1. Thus, s2 was changed while s1 was not. So if the s2 becomes empty after filtering, it escaped the check and caused error.
This can be check using
h = MDAnalysis.analysis.hbonds.HydrogenBondAnalysis(u, 'protein and not protein', 'resname SOL')

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification regarding testing s1 and s2.

Minor requests:

  • clean up (see inline)
  • rebase against develop (do you know how to do this?)

@@ -31,8 +31,12 @@

import itertools
import warnings
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?? I think it can be removed, can't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment. There is no StringIO in py3 so I was trying to make it compatible with py3. I still need some time to figure out how this stream reading works.

Copy link
Member

@orbeckst orbeckst Apr 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't think about this. Then we should probably use six.StringIO: something like

from six import StringIO

which replaces

from StringIO import StringIO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. It seems that six is more powerful than I thought after spending an hour playing with py3.io.StringIO.

@@ -87,6 +91,18 @@ def test_generate_table(self):
assert_array_equal(h.table.donor_resid, self.values['donor_resid'])
assert_array_equal(h.table.acceptor_resnm, self.values['acceptor_resnm'])

def test_atoms_too_far(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing: Make this a static method

@staticmethod
def test_atoms_too_far():
   ...

@@ -0,0 +1,5 @@
TITLE Two atoms far away
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove (you're using the inline test)

@@ -144,6 +144,7 @@
"MMTF", "MMTF_gz",
"ALIGN_BOUND", # two component bound system
"ALIGN_UNBOUND", # two component unbound system
"HBond_atoms_far", # Universe with two atoms far away
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove (you're using the inline test)

@@ -221,6 +222,7 @@
XTC_sub_sol = resource_filename(__name__, 'data/cobrotoxin.xtc')
PDB_sub_sol = resource_filename(__name__, 'data/cobrotoxin.pdb')
PDB_xlserial = resource_filename(__name__, 'data/xl_serial.pdb')
HBond_atoms_far = resource_filename(__name__, 'data/HBond_atoms_far.pdb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove (you're using the inline test)

@@ -88,6 +92,18 @@ def test_generate_table(self):
assert_array_equal(h.table.acceptor_resnm, self.values['acceptor_resnm'])

@staticmethod
def test_atoms_too_far(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove self from the argument list of the static method

@orbeckst
Copy link
Member

I resolved the conflict in CHANGELOG using GitHub's friendly Resolve button (I hadn't done this before). Apparently that merged develop into your branch. Normally that is not what I would do but this should be ok. Remember to git pull before committing to this branch.

When it comes to merging I will probably do any squashing manually from the command line, just to make sure that the merge of develop into your branch did not mess up anything.

@orbeckst orbeckst self-assigned this Apr 29, 2017
@@ -103,7 +104,6 @@ def test_atoms_too_far(self):
h.run(verbose=False)
assert_equal(h.timeseries, [[]])

@staticmethod
def test_true_traj():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This @staticmethod has to stay.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • use six.StringIO
  • fixes as in comments


from MDAnalysisTests.datafiles import PDB_helix, GRO, XTC
from MDAnalysisTests.datafiles import PDB_helix, GRO, XTC, HBond_atoms_far
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove HBond_atoms_far

h = MDAnalysis.analysis.hbonds.HydrogenBondAnalysis(u, 'resname SOL', 'protein')
h.run(verbose=False)
assert_equal(h.timeseries, [[]])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has to be a @staticmethod in front of the test_true_traj() function; Travis fails https://travis-ci.org/MDAnalysis/mdanalysis/jobs/227170225 , too, and QC flags it as a critical issue.

@@ -31,8 +31,12 @@

import itertools
import warnings
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use six.StringIO (as discussed in previous comment)

@xiki-tempula
Copy link
Contributor Author

@orbeckst Thank you for your advice. I guess I have learnt about decorators today.

@orbeckst
Copy link
Member

I am rebasing/squashing/merging this manually.

orbeckst pushed a commit that referenced this pull request Apr 30, 2017
* add check that s2 selection is empty
* add test
* add change log entry
@orbeckst
Copy link
Member

See PR #1327 (I needed to make a new PR for the rebasing/history rewriting).

@orbeckst orbeckst closed this Apr 30, 2017
@xiki-tempula
Copy link
Contributor Author

@orbeckst I have approved it. Thank you for all the help and advice.

@orbeckst
Copy link
Member

Thanks for fixing a bug!

We'll have to wait for another coredev to approve #1327 – I can't do it because I submitted the PR.

richardjgowers pushed a commit that referenced this pull request May 1, 2017
* add check that s2 selection is empty
* add test
* add change log entry
richardjgowers pushed a commit that referenced this pull request May 1, 2017
* add check that s2 selection is empty
* add test
* add change log entry
@xiki-tempula xiki-tempula deleted the h_bond branch May 16, 2017 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants