Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes chi1_selections (#4108) #4109

Merged
merged 5 commits into from
Apr 6, 2023

Conversation

DanielJamesEvans
Copy link
Contributor

@DanielJamesEvans DanielJamesEvans commented Apr 1, 2023

Fixes #4108

Changes made in this Pull Request:

  • In chi1_selections, I made the following changes:
    • Changed the default value of the function argument cg_name
    • Redefined the internal variable keep
    • Redefined the internal variable ags

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

📚 Documentation preview 📚: https://readthedocs-preview--4109.org.readthedocs.build/en/4109/

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

@codecov
Copy link

codecov bot commented Apr 1, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (bdb1352) 93.59% compared to head (b6ef5b3) 93.59%.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4109   +/-   ##
========================================
  Coverage    93.59%   93.59%           
========================================
  Files          192      192           
  Lines        25133    25134    +1     
  Branches      4056     4056           
========================================
+ Hits         23523    23524    +1     
  Misses        1092     1092           
  Partials       518      518           
Impacted Files Coverage Δ
package/MDAnalysis/core/topologyattrs.py 96.13% <100.00%> (+<0.01%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix --- looks solid.

  • please introduce yourself on the developer list (mention your PR) if you haven't done so already
  • add a test that check for the new functionality (eg use what you mentioned in your issue report) --- find the current tests for chi1_selections() and add it there

@orbeckst orbeckst self-assigned this Apr 1, 2023
@github-actions
Copy link

github-actions bot commented Apr 3, 2023

Linter Bot Results:

Hi @DanielJamesEvans! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ✅ Passed

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/4624671533/jobs/8179729699


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@DanielJamesEvans
Copy link
Contributor Author

Thanks for the reply! I added a test for the new functionality. When I pushed this, the CI checks for ubuntu-latest, 3.8 returned an error. I checked the logs; I'm not sure what happened but it seems to be an error related to CodeCov itself rather than the MDAnalysis update. The error is this: Codecov: Failed to properly upload: The process '/home/runner/work/_actions/codecov/codecov-action/v3/dist/codecov' failed with exit code 255. Googling it led me to this issue report on the codecov repo which doesn't have a conclusive answer. Do you have any advice?

@IAlibay
Copy link
Member

IAlibay commented Apr 3, 2023

Thanks for the reply! I added a test for the new functionality. When I pushed this, the CI checks for ubuntu-latest, 3.8 returned an error. I checked the logs; I'm not sure what happened but it seems to be an error related to CodeCov itself rather than the MDAnalysis update. The error is this: Codecov: Failed to properly upload: The process '/home/runner/work/_actions/codecov/codecov-action/v3/dist/codecov' failed with exit code 255. Googling it led me to this issue report on the codecov repo which doesn't have a conclusive answer. Do you have any advice?

Temporary server errors with codecov, I've restarted the job.

@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2023

Codecov is flaky and sometimes fails. That's not your fault so just point out that the failure is due to codecov and don't worry about it.

I'll kick the runner to rerun, maybe it uploads a bit later. @IAlibay beat me to it.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test looks good but I have a question about a potential performance regression, please see comments.

@@ -1266,13 +1266,13 @@ def chi1_selections(residues, n_name='N', ca_name='CA', cb_name='CB',
"""
results = np.array([None]*len(residues))
names = [n_name, ca_name, cb_name, cg_name]
keep = [all(sum(r.atoms.names == n) == 1 for n in names)
keep = [all(len(r.atoms.select_atoms(f"name {n}")) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change to using a selection instead of keeping the faster array construct?

Unless there's a good reason, change back. You can time the two approaches and I am sure you'll find that select_atoms is substantially slower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code fails when cg_name includes multiple names. E.g. if cg_name='CG CG1 OG OG1 SG' then the old code checks whether each atom's name is equal to the entire string 'CG CG1 OG OG1 SG'. This always returns False. So changing this line is a key part of fixing the bug.

I based my fix on chi1_selection, which also uses atom selection. But I agree that this is slow. I wrote an alternate version that avoids atom selection. In my testing, the newest code takes 18 us while the atom selection code takes 147 us.

Should I submit a PR to change chi1_selection to use the faster procedure? For now, I'm just going to push a version with the newest chi1_selections.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

Please push an update for your chi1_selections with faster code.

Let us have a look at it. We can then decide if we want to update chi1_selection right away or in a separate PR.

package/MDAnalysis/core/topologyattrs.py Outdated Show resolved Hide resolved
@@ -1266,13 +1266,13 @@ def chi1_selections(residues, n_name='N', ca_name='CA', cb_name='CB',
"""
results = np.array([None]*len(residues))
names = [n_name, ca_name, cb_name, cg_name]
keep = [all(sum(r.atoms.names == n) == 1 for n in names)
for r in residues]
keep = [all(sum(np.in1d(r.atoms.names, n.split())) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your updated code.

@IAlibay @richardjgowers are we ok with also updating chi1_selection() with the same solution, which avoids select_atoms()? We could do a separate PR but this seems unnecessary bureaucracy to me, unless we want a cleaner separation of work in CHANGELOG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context #4109 (comment)

I based my fix on chi1_selection, which also uses atom selection. But I agree that this is slow. I wrote an alternate version that avoids atom selection. In my testing, the newest code takes 18 us while the atom selection code takes 147 us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the CHANGELOG; I had a couple ideas:

  1. List the changes to both chi1_selections and chi1_selection in a single entry in the Fixes section. Maybe something like this: "Fix chi1_selections incorrectly returning None (Issue 4108) and chi_selection running unnecessarily slowly".
  2. List the change to chi1_selections as a Fix, and the change to chi1_selection as an enhancement. Maybe "Fix chi1_selections incorrectly returning None (Issue 4108)" in Fixes, and "Improved speed of chi1_selection" in Enhancements.

I think I favor option (2). But I don't know the history or culture behind this project's CHANGELOG, so I don't have a strong opinion. And it's definitely possible there's a better option that I haven't thought of!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2) seems better to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DanielJamesEvans I haven't heard any opposition to you updating chi1_selection() in this PR. I encourage you to add it to your PR. Once you've done so, please ping me so that I can review again.

Obviously, anyone else is also welcome to review and if dissenting opinions are voiced in other reviews we can address them then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! I added the chi1_selection() update to my PR.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor changes, then it's good to go from my end.

package/CHANGELOG Outdated Show resolved Hide resolved
package/CHANGELOG Outdated Show resolved Hide resolved
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, looks all good to me now.

I’ll wait for a day until merging so that others have a chance to comment.

@orbeckst orbeckst merged commit fd978d2 into MDAnalysis:develop Apr 6, 2023
@orbeckst
Copy link
Member

orbeckst commented Apr 6, 2023

Congratulations on your 1st merged PR 🎉 , @DanielJamesEvans !

@DanielJamesEvans
Copy link
Contributor Author

Thanks a lot! And thanks for the helpful code review! I look forward to continuing to use (and hopefully occasionally contributing to) this project.

@DanielJamesEvans DanielJamesEvans deleted the chi1_bug_fix branch April 6, 2023 22:24
orbeckst added a commit that referenced this pull request Dec 21, 2023
Update of AUTHORS and CHANGELOG with inferred author contributions.

*  Removed duplicate mattwthompson in 0.20.0 changelog entry.: mattwthompson was placed twice by accident, this removes this duplication.

* Addition of missing authors.

   An retrospective analysis of the authors found via `git shortlog -s -n --email --branches="develop"` found several commits by authors which were not present in the `AUTHORS.md` file.

    - Zhenbo Li: commited via PR: Started tests for gnm. #803 and Make Travis run tests on OSX. #771,
    - Jenna M. Swarthout via PR Update CoC according to suggestions from current CoC committee #4289 and Point to new reporting form link (owned by [email protected]) #4298,
    - Bradley Dice via PR   Fix GSD Windows compatibility #2384 ,
    - David Minh via PR #2668

   There seemed to be no indications in the above mentioned PRs that the author did not want to be in the authors file, it looks like it was just forgotten.

* Addition of missing entries from the changelog

   Continued cross referencing of the git shortlog output but also accounting for the changelog identified several individuals that had not been included in the changelog entries for the release they contributed under. They were added to the relevant entry of the changelog based on the merge commit date.

   Please note that for Tone Bengsten, we a) had no github handle (so they were assigned @tbengtsen), and b) no specific commit. Given we know that this individual was including alongside the encore merge, they were assigned to the 0.16.0 release.

* Update CHANGELOG
* PR #1218
* PR #1284 and #1408
* PR #4109
* based on git history
* PRs #803 and #771 (author addition, changelog addition)
* PR #2255 and #2221
* PR #1225
* PR #4289 and #4298
* PR #4031
* PR #4085
* PR #3635
* PR #2356
* PR #2559
* No GH handle - Encore author addition @tbengtsen
* PR #4184
* PR #2614
* PR #2219
* PR #2384
* PR #2668
* Add missing entry for Jenna

Thanks to @fiona-naughton for helping out with digging into this data :)

Co-authored-by: Fiona Naughton <[email protected]>
Co-authored-by: Oliver Beckstein <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

chi1_selections fails for residues with CG1
3 participants