-
Notifications
You must be signed in to change notification settings - Fork 658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes chi1_selections (#4108) #4109
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS
as part of this PR.
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #4109 +/- ##
========================================
Coverage 93.59% 93.59%
========================================
Files 192 192
Lines 25133 25134 +1
Branches 4056 4056
========================================
+ Hits 23523 23524 +1
Misses 1092 1092
Partials 518 518
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix --- looks solid.
- please introduce yourself on the developer list (mention your PR) if you haven't done so already
- add a test that check for the new functionality (eg use what you mentioned in your issue report) --- find the current tests for chi1_selections() and add it there
Linter Bot Results:Hi @DanielJamesEvans! Thanks for making this PR. We linted your code and found the following: Some issues were found with the formatting of your code.
Please have a look at the Please note: The |
Thanks for the reply! I added a test for the new functionality. When I pushed this, the CI checks for ubuntu-latest, 3.8 returned an error. I checked the logs; I'm not sure what happened but it seems to be an error related to CodeCov itself rather than the MDAnalysis update. The error is this: |
Temporary server errors with codecov, I've restarted the job. |
Codecov is flaky and sometimes fails. That's not your fault so just point out that the failure is due to codecov and don't worry about it.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test looks good but I have a question about a potential performance regression, please see comments.
@@ -1266,13 +1266,13 @@ def chi1_selections(residues, n_name='N', ca_name='CA', cb_name='CB', | |||
""" | |||
results = np.array([None]*len(residues)) | |||
names = [n_name, ca_name, cb_name, cg_name] | |||
keep = [all(sum(r.atoms.names == n) == 1 for n in names) | |||
keep = [all(len(r.atoms.select_atoms(f"name {n}")) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change to using a selection instead of keeping the faster array construct?
Unless there's a good reason, change back. You can time the two approaches and I am sure you'll find that select_atoms is substantially slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old code fails when cg_name
includes multiple names. E.g. if cg_name='CG CG1 OG OG1 SG'
then the old code checks whether each atom's name is equal to the entire string 'CG CG1 OG OG1 SG'
. This always returns False. So changing this line is a key part of fixing the bug.
I based my fix on chi1_selection
, which also uses atom selection. But I agree that this is slow. I wrote an alternate version that avoids atom selection. In my testing, the newest code takes 18 us while the atom selection code takes 147 us.
Should I submit a PR to change chi1_selection
to use the faster procedure? For now, I'm just going to push a version with the newest chi1_selections
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
Please push an update for your chi1_selections
with faster code.
Let us have a look at it. We can then decide if we want to update chi1_selection right away or in a separate PR.
@@ -1266,13 +1266,13 @@ def chi1_selections(residues, n_name='N', ca_name='CA', cb_name='CB', | |||
""" | |||
results = np.array([None]*len(residues)) | |||
names = [n_name, ca_name, cb_name, cg_name] | |||
keep = [all(sum(r.atoms.names == n) == 1 for n in names) | |||
for r in residues] | |||
keep = [all(sum(np.in1d(r.atoms.names, n.split())) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your updated code.
@IAlibay @richardjgowers are we ok with also updating chi1_selection()
with the same solution, which avoids select_atoms()
? We could do a separate PR but this seems unnecessary bureaucracy to me, unless we want a cleaner separation of work in CHANGELOG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context #4109 (comment)
I based my fix on chi1_selection, which also uses atom selection. But I agree that this is slow. I wrote an alternate version that avoids atom selection. In my testing, the newest code takes 18 us while the atom selection code takes 147 us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about the CHANGELOG; I had a couple ideas:
- List the changes to both
chi1_selections
andchi1_selection
in a single entry in the Fixes section. Maybe something like this: "Fix chi1_selections incorrectly returning None (Issue 4108) and chi_selection running unnecessarily slowly". - List the change to
chi1_selections
as a Fix, and the change tochi1_selection
as an enhancement. Maybe "Fix chi1_selections incorrectly returning None (Issue 4108)" in Fixes, and "Improved speed of chi1_selection" in Enhancements.
I think I favor option (2). But I don't know the history or culture behind this project's CHANGELOG, so I don't have a strong opinion. And it's definitely possible there's a better option that I haven't thought of!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(2) seems better to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DanielJamesEvans I haven't heard any opposition to you updating chi1_selection()
in this PR. I encourage you to add it to your PR. Once you've done so, please ping me so that I can review again.
Obviously, anyone else is also welcome to review and if dissenting opinions are voiced in other reviews we can address them then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! I added the chi1_selection()
update to my PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor changes, then it's good to go from my end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, looks all good to me now.
I’ll wait for a day until merging so that others have a chance to comment.
Congratulations on your 1st merged PR 🎉 , @DanielJamesEvans ! |
Thanks a lot! And thanks for the helpful code review! I look forward to continuing to use (and hopefully occasionally contributing to) this project. |
Update of AUTHORS and CHANGELOG with inferred author contributions. * Removed duplicate mattwthompson in 0.20.0 changelog entry.: mattwthompson was placed twice by accident, this removes this duplication. * Addition of missing authors. An retrospective analysis of the authors found via `git shortlog -s -n --email --branches="develop"` found several commits by authors which were not present in the `AUTHORS.md` file. - Zhenbo Li: commited via PR: Started tests for gnm. #803 and Make Travis run tests on OSX. #771, - Jenna M. Swarthout via PR Update CoC according to suggestions from current CoC committee #4289 and Point to new reporting form link (owned by [email protected]) #4298, - Bradley Dice via PR Fix GSD Windows compatibility #2384 , - David Minh via PR #2668 There seemed to be no indications in the above mentioned PRs that the author did not want to be in the authors file, it looks like it was just forgotten. * Addition of missing entries from the changelog Continued cross referencing of the git shortlog output but also accounting for the changelog identified several individuals that had not been included in the changelog entries for the release they contributed under. They were added to the relevant entry of the changelog based on the merge commit date. Please note that for Tone Bengsten, we a) had no github handle (so they were assigned @tbengtsen), and b) no specific commit. Given we know that this individual was including alongside the encore merge, they were assigned to the 0.16.0 release. * Update CHANGELOG * PR #1218 * PR #1284 and #1408 * PR #4109 * based on git history * PRs #803 and #771 (author addition, changelog addition) * PR #2255 and #2221 * PR #1225 * PR #4289 and #4298 * PR #4031 * PR #4085 * PR #3635 * PR #2356 * PR #2559 * No GH handle - Encore author addition @tbengtsen * PR #4184 * PR #2614 * PR #2219 * PR #2384 * PR #2668 * Add missing entry for Jenna Thanks to @fiona-naughton for helping out with digging into this data :) Co-authored-by: Fiona Naughton <[email protected]> Co-authored-by: Oliver Beckstein <[email protected]>
Fixes #4108
Changes made in this Pull Request:
chi1_selections
, I made the following changes:cg_name
keep
ags
PR Checklist
📚 Documentation preview 📚: https://readthedocs-preview--4109.org.readthedocs.build/en/4109/