-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit warnings for empty Ag and user grid in DensityAnalysis #3091
Conversation
Returns empty residue group for _get_prev_residues_by_resid and _get_next_residues_by_resid
Changed authors, changelog to add my name
Tests topologyattrs included
Changed the test to make it a function instead of a class
File- test_topologyattrs.py
Displays an error message and returns
Hello @aditya-kamath! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-03-03 16:43:18 UTC |
Codecov Report
@@ Coverage Diff @@
## develop #3091 +/- ##
===========================================
- Coverage 93.14% 92.67% -0.47%
===========================================
Files 171 171
Lines 22750 22717 -33
Branches 3218 3209 -9
===========================================
- Hits 21191 21054 -137
- Misses 1502 1615 +113
+ Partials 57 48 -9
Continue to review full report at Codecov.
|
Hey @aditya-kamath, would you be able to provide a test that triggers the warnings you have inserted? Also using autopep8 to fix style issues will save you a lot of pain. |
Yeah, I can write a test for it. I just hope that this fix does not interfere with other functionalities of DensityAnalysis as I don't fully understand it yet. Thanks for the heads up on autopep8! :) |
msg = ("No atoms in selection over whole trajectory") | ||
warnings.warn(msg) | ||
logger.warning(msg) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if three duplications of the same code block is enough to justify abstraction to a single i.e., function. Maybe borderline if the return
has to remain outside of the abstraction I suppose.
On a more basic level, just asking--the code will still produce the original failure/exception (is that what it was doing?) and this will simply add a warning to clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might sound very primitive but my logic was to return out of the function and warn the user of no atoms until atomgroup gets updated with atleast one atom. In the Issue, the user wanted to avoid the program crashing and would provide a dummy atom to keep the density frames generating. I tried to fix this by preventing a crash at the initializing, frame generation and conclusion. Perhaps a more efficient way is to make a function to check for no atoms and return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, my understanding of the DensityAnalysis function process may not be very accurate so I apologize for any incorrect arguments I make :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the other comments, please see below for some comments / areas which could be improved.
@@ -424,6 +436,8 @@ def _prepare(self): | |||
self.density = None | |||
|
|||
def _single_frame(self): | |||
if _checkinput(self) == 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading your replies to some of the other comments, I think it might be worth you investing a bit more time in understanding how the AnalysisBase
class works, particularly the order in which each of these class methods get called.
Have a look at MDAnalysis.analysis.base
with the aim to conceptualise what _single_frame
and _conclude
mean in terms of analyzing a Molecular Dynamics trajectory. Once you've done this, it's then worth going back to the original issue and trying to understand where the edge case is causing the code to fail and why.
Unless I'm missing something (which is always very much a possibility), doing these checks here may not be appropriate?
@@ -657,7 +673,7 @@ def __init__(self, *args, **kwargs): | |||
|
|||
parameters = kwargs.pop('parameters', {}) | |||
if (len(args) > 0 and isinstance(args[0], str) or | |||
isinstance(kwargs.get('grid', None), str)): | |||
isinstance(kwargs.get('grid', None), str)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be honest, this flake8 issue makes no sense to me, visual indentation should start at len
not at its argument.
Either way it's a minor thing so I'll let the other reviewers make a decision if they want here, but I'm willing to let it go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran this file through autopep8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks bizzare to me. I would revert this.
return '<Density ' + grid_type + ' with ' + \ | ||
str(self.grid.shape) + ' bins>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Breaking lines with \
ends up introducing a PEP8 issue, it also ends up making the code harder to read. It's always worth noting that PEP8 is a guideline rather than a specific set of rules, code readability is always the end goal.
That being said, there is an easy way to make this readable and also less than 79 characters in one line by using f-strings
:
return '<Density ' + grid_type + ' with ' + \ | |
str(self.grid.shape) + ' bins>' | |
return f"<Density {grid_type} with {self.grid.shape} bins>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Noted
if _checkinput(self) == 1: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of points to consider here;
-
Is this the right place to be putting this? Given the context of the original issue, and the need to make it so that the user defined grid still works even if there are no atoms in the frame, would placing this elsewhere be more appropriate?
-
Doing a
return
post-warning is a non-safe way to handle things. It's worth considering what the difference between a "warning" and an "error" is. If you're issuing a warning, you are telling users that non-expected behaviour is likely to occur (and why) but that the code will carry on. If you raise an error, you are telling users that something wrong/unexpected has occurred and you are aborting further code execution. In this case, if you are attempting to abort execution (as you are viareturn
), then an error (of appropriate type) should be raised instead. -
I think the way this is written probably needs to be changed, but it's worth mentioning anyways. Doing a return equality check on method return that essentially returns a
Boolean
is a non-pythonic way to handle things. If you're checking forTrue
/False
, the best way to do things is to have the method you're calling returnTrue
/False
and then just do:
if method(input):
true_condition
else:
false_condition
This reverts commit ec9c1f3.
Thanks to all the reviewers for their suggestions! I will now try to work on cleaning up the code in general and I see that I have a messy commit history. I aim to clean it up too. |
Messy commit histories don't usually matter for small fixes like these, eventually we will squash merge anyways, so it'll only show up as a single commit on MDAnalysis. Before I forget, probably something to take into account is the branch you make your changes on. I noticed that you are using the What I would suggest in the future is to instead create an appropriately named feature branch on your own fork, which you then make your changes on. See the following for a bit more information: https://userguide.mdanalysis.org/stable/contributing_code.html#creating-a-branch |
@IAlibay , the merging was exactly what I was facing issues with. Thanks for the info! |
docstring in Density.py
Hi @IAlibay, sorry, I have'nt gotten to this PR in a while! I've been busy with school work.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your hard work on the PR. Documentation still needs to be clearer.
Maybe the original author of issue #3055 (@TBGAnsell ) would also like to chime in and suggest improvements?
@@ -763,7 +783,8 @@ def convert_length(self, unit='Angstrom'): | |||
""" | |||
if unit == self.units['length']: | |||
return | |||
cvnfact = units.get_conversion_factor('length', self.units['length'], unit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no tests for these altered lines of code then we should have them.
I am tending towards: If you touch code it's your responsibility to test it.
Even if it looks as if it's just cosmetic, it might not be. Bugs can be subtle. Either leave it or test it.
If the 'AtomGroup' instance does not contain any selection of atoms and | ||
'updating' is set to true, an empty :class: UpdatingAtomGroup is returned. | ||
If an instance of this class is passed onto DensityAnalysis without any | ||
user-defined grid, DensityAnalysis will fail because it cannot predict | ||
selection bounds for future frames in the trajectory. In such a situation, | ||
user defined box limits should be provided ensuring that the provided | ||
limits encompass all atoms in selection on future frames. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This text does not reflect what the code does, or at least that's not clear. The only change that you made is to emit warnings, right? EDIT: You made two changes: emit a warning and raise a ValueError
. Your docs need to make clear under which circumstances these warnings/errors happen and what the user needs to do to avoid them (and why).
State what the problem can be (empty atomgroup
– note write this in reST as
When the `atomgroup` is empty when :class:`DensityAnalysis` is initialized ...
so that you're referring to variables in the doc string). Then state what happens (warning emitted). Finally explain what the user should be doing (i.e., repeat what the warning says but elaborate). Imagine that you're the user of the function: What do you need to know when you
- want to run DensityAnalysis for the first time
- when you run into an error and think "maybe the docs tell me what to do because this error message was very confusing"
Rewrite with the above in mind.
The sentence
"If the 'AtomGroup' instance does not contain any selection of atoms and 'updating' is set to true, an empty :class: UpdatingAtomGroup is returned."
is confusing because the problem will occur as soon as atomgroup
is empty, regardless of being updating or not. It's just that the situation is more likely to happen with an UpdatingAtomGroup
.
Maybe @TBGAnsell has some suggestions what they would have liked to see (see #3055)? Comments welcome!
For exceptions that are raised and that need explanation, add a section (after Returns)
Raises
------
ValueError: raised when ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given it some thought and updated the Docstring. I tried to keep it as simple as possible so that a user will know what to do! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, the message here seems clear to me. One further suggestion could be to output the frames in which the density analysis fails, similar to the frames reported when H-bonds are not detected in PropkaTraj when skip_failure=True. This could be useful for an initial quick scan to see eg: how much equilibration time is needed for the ligand to reach the site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, maybe I'll include the frame number in the warning message!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning will only get called once at construction, so it won't do what @TBGAnsell is looking for (as far as I understand).
That being said, whilst I'm not against the idea of including the frame number in the warning / error, I am somewhat opposed to checking every frame for the presence of atoms in the selection. My view is twofold:
- An empty UpdatingAtomGroup at a given frame isn't necessarily a failure (unlike for PropkaTraj where we know that it's a failure [note: skip_failure captures a non-specific propka 3.x failure, not just solely the failure to detect any specific thing]).
- Since we aren't currently checking for it, it will add extra checking code that probablyl won't be cheap. In my opinion, if folks want to do a quick scan it's much cheaper/cleaner to just do a quick loop over your updatingatomgroup.
That being said, I'm willing to default to @orbeckst's opinion on the matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @TBGAnsell for your comments — much appreciated!
I agree with @IAlibay reservations about checking every frame because of likely performance reasons. If a "scanning" feature is desired then this can be a new issue (please raise it and we discuss there). Let's keep this one focused on the original problem.
smin = self._gridcenter | ||
smax = self._gridcenter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are smin
and smax
assigned the grid center coordinates?
Please explain the logic here. What consequences does it have downstream? Does the code fail further downstream if the user has not supplied a proper grid selection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that was done because _set_user_grid
wanted a preassigned range from the coordinates of AtomGroup
in selection. But there is no AtomGroup
in this case, so I initialize a zero range grid at the grid center and take the range from user-defined grid values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do
# need to be set for _set_user_grid() but will be
# overriden by user-defined values
smin = smax = None
to make clearer what's happening? If None
is not possible (even though that would be the best value to indicate that we are not using these values), just setting to 0 or [0, 0, 0]
would still be cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will trigger this warning in set_user_grid if I do that.
if any(smin < umin) or any(smax > umax):
msg = ("Atom selection does not fit grid --- "
"you may want to define a larger box")
warnings.warn(msg)
logger.warning(msg)
The check is meant for regular situations to compare the atom selection with user-defined grid.
In the case of no atoms in selection, I just want it to pass through. So setting smin and smax at the gridcenter will always pass the comparison with a user defined box.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document in a comment, please.
package/CHANGELOG
Outdated
@@ -21,6 +21,8 @@ The rules for this file: | |||
* 2.0.0 | |||
|
|||
Fixes | |||
* UserWarning displayed when empty atomgroup is given |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you also raise a ValueError
– please document
Reverts changes made by autoPEP8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looking better, but please address my comments.
"""Function to prepare Density | ||
|
||
Raises | ||
------ | ||
ValueError | ||
if AtomGroup is empty and no user defined grid is provided | ||
UserWarning | ||
if AtomGroup is empty and a user defined grid is provided | ||
|
||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These docs are invisible in the documentation. Add them to the class doc string.
(You can check the docs for your PR https://mdanalysis--3091.org.readthedocs.build/en/3091/ : check that they show up, namely in https://mdanalysis--3091.org.readthedocs.build/en/3091/documentation_pages/analysis/density.html#MDAnalysis.analysis.density.DensityAnalysis )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I forgot to add a 'r' symbol which might be what triggers building the docs. My next commit will include this! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @orbeckst, as I am just starting out with this kind of development. I am not sure how the webpage docs are built with the repository code? Perhaps there is a package I should refer to see how it's built so that I can ensure this comment is included in the webpage as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aditya-kamath you can check updates to the PR in the ReadTheDocs build. Otherwise, you could cd package/doc/sphinx
, type make html
, and your output will be in ../html
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think this issue is related to sphinx docs. Hopefully my next commit will compile this to the docs build.
a `Density`. Although, it should be ensured that the provided | ||
grid limits encompass atoms to be selected on all trajectory frames. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a `Density`. Although, it should be ensured that the provided | |
grid limits encompass atoms to be selected on all trajectory frames. | |
a `Density` although it remains the user's responsibility to ensure that the | |
provided grid limits encompass atoms to be selected in all trajectory frames. |
@@ -229,6 +229,11 @@ class DensityAnalysis(AnalysisBase): | |||
are within 4 Å of the protein heavy atoms) then create an | |||
:class:`~MDAnalysis.core.groups.UpdatingAtomGroup` (see Examples). | |||
|
|||
DensityAnalysis will fail when the `AtomGroup` instance does not contain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DensityAnalysis will fail when the `AtomGroup` instance does not contain | |
:class:`DensityAnalysis` will fail when the :class:`AtomGroup` instance does not contain |
package/CHANGELOG
Outdated
* UserWarning/ValueError displayed when empty atomgroup is given | ||
to DensityAnalysis (Issue #3055) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A ValueError
is not displayed but raised. Make clearer when the warning is shown and when ValueError is raised.
@aditya-kamath the doc build is failing, please have a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comments and fix doc build. Thanks!
@@ -229,6 +229,12 @@ class DensityAnalysis(AnalysisBase): | |||
are within 4 Å of the protein heavy atoms) then create an | |||
:class:`~MDAnalysis.core.groups.UpdatingAtomGroup` (see Examples). | |||
|
|||
:class:`DensityAnalysis` will fail when the :class:`AtomGroup` instance | |||
does not contain any selection of atoms, even when `updating` is set to | |||
True. In such a situation, user defined box limits can be provided to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark up Python
True. In such a situation, user defined box limits can be provided to | |
``True``. In such a situation, user defined box limits can be provided to |
@@ -107,7 +107,7 @@ | |||
After the analysis (see the :meth:`~DensityAnalysis.run` method), the resulting density is | |||
stored in the :attr:`density` attribute as a :class:`Density` instance. | |||
|
|||
.. automethod:: _set_user_grid | |||
.. automethod:: _prepare, _set_user_grid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove _prepare
: this should not be documented as it's an internal method. Move the docs into the class doc string. This is the standard way how we document Analysis classes.
_set_user_grid()
is a special case because it does all kinds of things and one might need to know about it.
Raises | ||
------ | ||
ValueError | ||
if AtomGroup is empty and no user defined grid is provided | ||
UserWarning | ||
if AtomGroup is empty and a user defined grid is provided | ||
|
||
.. versionadded:: 1.0.0 | ||
.. versionchanged:: 2.0.0 | ||
Now a method of :class:`DensityAnalysis`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this doc string into the class doc string. See comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, remove the Now a method of :class:DensityAnalysis.
comment as it does not make sense. _prepare
is always in a AnalysisClass.
Your change is for 2.0.0 so do not add any versionadded for 1.0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @orbeckst for your patience and advice! My next commit will make these changes..
smin = self._gridcenter | ||
smax = self._gridcenter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment explaining the hacky assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! :)
smin = self._gridcenter | ||
smax = self._gridcenter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document in a comment, please.
@@ -827,4 +865,4 @@ def __repr__(self): | |||
grid_type = 'density' | |||
else: | |||
grid_type = 'histogram' | |||
return '<Density ' + grid_type + ' with ' + str(self.grid.shape) + ' bins>' | |||
return f"<Density {grid_type} with {self.grid.shape} bins>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, perhaps I'll have to think about a test for this. I'd leave it alone, but it looks like autoPEP8 felt that the line was very long!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @orbeckst, I've removed this change. However, a check seems to be failing and I'm not sure if it is a concern.(?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failure was down to a non-related RDKIT test (it's a flaky test that fails sometimes on CI), I've restarted the jobs, hopefully they should pass this time.
Hi @orbeckst , I'm just wondering if we can extend this PR to project 5 in the GSoC proposal list? I will have a basic idea on the DensityAnalysis module after this PR and I'm thinking about getting started with understanding other aspects of densities to include in my proposal.. |
You could make a general density analysis (for RMS-fitted PBC systems) an application of the work in project 5 – that would be neat because it cannot really be done with any tools that I know. However, let's not extend this PR. It's much better to make PRs as small and self-contained as possible: easier to debug, much easier to review (and the main bottleneck is typically reviewer time/attention). You'd open a new PR for a new feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments.
I assume that we'll sort out the failing tests somehow, but as they are not related to your code I'll approve for now.
So we've been treating it as spurious (see: #2958), and generally re-starting CI fixes things. However, we should really get to the bottom of it before we release 2.0. The mismatch is really large :/ I'm hoping one of the RKDIT PRs that are still to be merged might fix this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm as long as CI returns green!
@hmacdope do you want to re-review/approve? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Congrats @aditya-kamath.
Sorry for review spam, unintended. |
…is#3091) - Adds ValueError if AtomGroup is empty and no user defined grid is provided - Adds UserWarning if AtomGroup is empty and a user defined grid is provided
Fixes #3055
Changes made in this Pull Request:
PR Checklist