Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure 'HydrogenBondAnalysis' returns the correct distances when using the 'between' keyword #4092

Merged
merged 7 commits into from
Apr 27, 2023

Conversation

p-j-smith
Copy link
Member

@p-j-smith p-j-smith commented Mar 26, 2023

Fixes #4091

Changes made in this Pull Request:

  • if the between keyword is used in HydrogenBondAnalysis, filter the donor-acceptor distance array to remove hydrogen bonds formed by atoms not in the between selection
  • added tests to check the distances are correct when using between

Previously, the donor, hydrogen, and acceptor atom groups were filtered inside the HydrogenBondAnalysis._filter_atoms method. I've changed the _filter_atoms method to instead return a mask that can be used for filtering the atom groups as well as the distance array. This seemed nicer than having to also pass in the distances to _filter_atoms, but let me know if you'd prefer it done differently.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

📚 Documentation preview 📚: https://readthedocs-preview--4092.org.readthedocs.build/en/4092/

@codecov
Copy link

codecov bot commented Mar 26, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -1.65 ⚠️

Comparison is base (3ebd5ec) 93.59% compared to head (270f950) 91.94%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4092      +/-   ##
===========================================
- Coverage    93.59%   91.94%   -1.65%     
===========================================
  Files          192      192              
  Lines        25134    25138       +4     
  Branches      4056     4051       -5     
===========================================
- Hits         23524    23114     -410     
- Misses        1092     1526     +434     
+ Partials       518      498      -20     
Impacted Files Coverage Δ
...DAnalysis/analysis/hydrogenbonds/hbond_analysis.py 97.20% <100.00%> (+0.06%) ⬆️

... and 23 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@p-j-smith p-j-smith marked this pull request as ready for review March 27, 2023 09:01
@p-j-smith p-j-smith force-pushed the fix/hbonds-between branch from e8a9d93 to 2516c04 Compare March 29, 2023 15:42
Comment on lines 736 to +741
if self.between_ags is not None:
tmp_donors, tmp_hydrogens, tmp_acceptors = \
self._filter_atoms(tmp_donors, tmp_hydrogens, tmp_acceptors)
between_mask = self._filter_atoms(tmp_donors, tmp_acceptors)
tmp_donors = tmp_donors[between_mask]
tmp_hydrogens = tmp_hydrogens[between_mask]
tmp_acceptors = tmp_acceptors[between_mask]
d_a_distances = d_a_distances[between_mask]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little out of scope, but would it make more sense to filter before the call to capped_distances?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point - that would definitely be better, but it's probably not a straightforward change and I imagine there are lots of ways to do it. One way would be to iterate over the pairs of atom selections passed to between and calculate donor-acceptor distances for each pair. So e.g. if between=[["protein", "SOL"], ["protein", "protein"]], we would calculate:

  • protein donor to SOL acceptor distances
  • protein acceptor to SOL donor distances
  • protein donor to protein acceptor distances

and concatenate them. But maybe that should be a separate pr after this fix is in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR is better — I'd prefer fixing things and knowing that it works and then going for optimizing. I'd be more than happy if someone raises an issue for a performance improvement with a link to this discussion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I raised #4130 — please comment there if you have any follow-ups.

@orbeckst
Copy link
Member

orbeckst commented Apr 1, 2023

@richardjgowers can you assign the PR to yourself to shepherd it to completion or otherwise find someone to do it? This seems straightforward enough (at least the original fix) so I wouldn't want the patch to sit here forever, just because it slipped through. Thanks!

@p-j-smith p-j-smith force-pushed the fix/hbonds-between branch from 2516c04 to c3607fd Compare April 2, 2023 08:26
@github-actions
Copy link

github-actions bot commented Apr 2, 2023

Linter Bot Results:

Hi @p-j-smith! Thanks for making this PR. We linted your code and found the following:

There are currently no issues detected! 🎉

@orbeckst orbeckst self-requested a review April 17, 2023 18:20
@orbeckst orbeckst self-assigned this Apr 27, 2023
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @p-j-smith , sorry it took such a long time to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HydrogenBondAnalysis 'between' keyword returns incorrect donor-acceptor distances
4 participants