ts.dimensions object with different trajectory formats #1575
Comments
Look very carefully at what happens on those append operations with XTC. All numpy arrays stored in the list are reassigned to the current numpy array of dimensions for XTC but not DCD.
You have run into an issue with references in Python. Unfortunately we seem to handle references for the dimensions differently between formats. For the xtc reader The question here is if The dcd has another curious thing. The file you are reading doesn't have any box information. Therefore the unitcell should always be
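The reference issue described above can be reproduced without any trajectory file. The following is a minimal sketch, not MDAnalysis code: `SharedBufferReader` and `FreshArrayReader` are hypothetical toy classes, and the assumption is only that one reader reuses a single unit-cell buffer while the other builds a new array per frame.

```python
import numpy as np


class SharedBufferReader:
    """Toy reader (hypothetical) that reuses one unit-cell array for every frame."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self._unitcell = np.zeros(6, dtype=np.float32)

    def __iter__(self):
        for frame in range(self.n_frames):
            # Overwrite the buffer in place, as a reader reusing memory would.
            self._unitcell[:] = [frame, 10, 10, 90, 90, 90]
            yield self._unitcell  # hands out a reference to the shared buffer


class FreshArrayReader(SharedBufferReader):
    """Toy reader (hypothetical) that builds a new array for every frame."""

    def __iter__(self):
        for frame in range(self.n_frames):
            yield np.array([frame, 10, 10, 90, 90, 90], dtype=np.float32)


shared = [dims for dims in SharedBufferReader(3)]
fresh = [dims for dims in FreshArrayReader(3)]

# First element of each stored unit cell (the frame number written above).
shared_first = [float(d[0]) for d in shared]
fresh_first = [float(d[0]) for d in fresh]
print(shared_first)  # every entry shows the last frame
print(fresh_first)   # each entry keeps its own frame
```

With the shared buffer, every list entry is the same object, so all entries change together whenever the reader moves to another frame.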
So
Also related (I think @kain88-de mentions this above), some Readers create a new
@kain88-de There's nothing curious about the second DCD frame. I intentionally reassigned the first element of the unit cell to the frame number so that I could assess whether the values were references or not. It was more confusing to deal with reference vs. copy when the unit cell is identical in each frame, because the reassignments in the list are confounded by the fact that the unit cell doesn't change in the test file during iteration. You'll also notice that I intentionally appended the value of

My take is that we should never return references like this for values a user would typically want to access. I've always used MDAnalysis in a manner that allows me to use pythonic paradigms like appending trajectory-retrieved values to a list. Regardless of what decision is made, I think we can agree that it should be consistent across all formats and probably all reasonably-accessed property types. A real power of MDAnalysis is the ability to perform reproducible analyses in a format-agnostic manner, and breaking this is why I've labelled this issue high priority.

I would suggest that we even try to enforce the policy with a (parametrized?) unit test that crawls through the relevant parameters for the derived readers and their properties and ensures that whatever type of object (reference / value / view) is supposed to be returned is actually returned.
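One possible shape for the consistency check suggested above is sketched here. It is an illustration under assumptions, not the project's test suite: `ViewTimestep`, `CopyTimestep`, and `returns_independent_array` are hypothetical names, and the check relies on `np.shares_memory` to tell whether two successive accesses of a property hand back the same underlying memory.

```python
import numpy as np


class ViewTimestep:
    """Hypothetical Timestep whose dimensions property exposes internal storage."""

    def __init__(self):
        self._unitcell = np.array([10.0, 10.0, 10.0, 90.0, 90.0, 90.0])

    @property
    def dimensions(self):
        return self._unitcell


class CopyTimestep(ViewTimestep):
    """Hypothetical Timestep whose dimensions property returns a copy."""

    @property
    def dimensions(self):
        return self._unitcell.copy()


def returns_independent_array(ts, attr="dimensions"):
    """Return True if two successive accesses of ts.<attr> share no memory."""
    first = getattr(ts, attr)
    second = getattr(ts, attr)
    return not np.shares_memory(first, second)


# In a real test suite this loop could become a parametrized test over all readers.
results = {cls.__name__: returns_independent_array(cls())
           for cls in (ViewTimestep, CopyTimestep)}
print(results)
```

Whichever policy is chosen (reference, view, or copy), a check of this kind only needs the expected boolean per property to be declared once and then asserted for every format.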
Following up from #276 (comment) We should decide clearly if
    class Timestep:
        @property
        def dimensions(self):
            return self._unitcell.copy()

EDIT: I would then leave
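To make the effect of the copy-returning property concrete, here is a runnable sketch built around the snippet above. The `__init__`, the six-element layout, and the in-place update loop are assumptions added for illustration; they stand in for whatever a real reader does when it advances a frame.

```python
import numpy as np


class Timestep:
    """Minimal stand-in for a Timestep; only the unit cell is modelled here."""

    def __init__(self):
        # Assumed layout: [lx, ly, lz, alpha, beta, gamma]
        self._unitcell = np.zeros(6, dtype=np.float32)

    @property
    def dimensions(self):
        # Return a copy so callers never hold a reference to internal state.
        return self._unitcell.copy()


ts = Timestep()
stored = []
for frame in range(3):
    # Simulate a reader overwriting the unit cell in place on each frame.
    ts._unitcell[:] = [10 + frame, 10, 10, 90, 90, 90]
    stored.append(ts.dimensions)

first_lengths = [float(d[0]) for d in stored]
print(first_lengths)  # each stored array keeps the value from its own frame
```

The trade-off is one small allocation per access, and in-place edits such as `ts.dimensions[0] = 5` would modify only the temporary copy, so a setter would be needed for writes.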
@tylerjereddy in general I very much agree with your #1575 (comment):
I consider timeseries = [ts.positions for ts in u.trajectory] # does NOT produce a timeseries of different positions whereas However, I would consider working with If people had very strong feelings about
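For the positions case, a user-side workaround is to copy explicitly when collecting values. The sketch below uses a hypothetical `ToyTrajectory` that reuses one positions buffer (an assumption made for illustration, not a description of any particular MDAnalysis reader) to contrast collecting references with collecting copies.

```python
import numpy as np


class ToyTrajectory:
    """Hypothetical trajectory that reuses a single positions buffer."""

    def __init__(self, n_frames, n_atoms):
        self.n_frames = n_frames
        self.positions = np.zeros((n_atoms, 3), dtype=np.float32)

    def __iter__(self):
        for frame in range(self.n_frames):
            self.positions[:] = frame  # overwrite in place for the new frame
            yield self  # the toy trajectory doubles as its own timestep


traj = ToyTrajectory(n_frames=4, n_atoms=2)

no_copy = [ts.positions for ts in traj]           # four references to one buffer
with_copy = [ts.positions.copy() for ts in traj]  # four independent snapshots

timeseries = np.array(with_copy)  # shape (n_frames, n_atoms, 3)
print(timeseries[:, 0, 0])
print(np.array(no_copy)[:, 0, 0])
```

Calling `.copy()` at collection time keeps the cheap reference semantics for code that only reads the current frame, while making the snapshot cost explicit where a time series is actually wanted.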
Should this still be considered a bug? What is the action that should be taken on this issue? Is it still high priority?
This is a bug. The problem is a fundamental design deficiency in Timestep: to expose the raw dimensions under the current model, we have to create a subclass for every format. I haven't had time to look properly into Timestep and find a solution that doesn't require subclassing. It will likely be an easy fix, but we have to check and change all readers for this. I think we just need to redefine what counts as raw cell data. I hope we can do this without breaking old user code.
So as @kain88-de has said, it's impossible to make I'd like to make Thoughts?
This is actually already done for the XDR formats of GROMACS for simplicity. Personally, I would also like to get rid of subclassing Timestep. It means fewer moving parts to get right and potentially works towards simplifying the readers, so that each implements a read/write function that does all necessary conversions.
Yeah exactly, fewer classes is always better. We'd just need to make sure we never lose precision when converting etc. (i.e. avoid 89.99999 angles).
On Tue, 30 Jan 2018, 13:46 Max Linke wrote:
Ie shift the burden of weird box formats from Timestep to Reader/Writer.
This might actually simplify things overall.
This is actually already done for the XDR formats of GROMACS for
simplicity. Personally, I would also like to get rid of sub classing
timestep. It means less moving parts to get right and potentially works
towards simplifying the readers to implement a read/write function that
does all necessary conversions.
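The precision concern raised above (avoiding angles such as 89.99999 after a conversion) can be illustrated with a round trip between the `[lx, ly, lz, alpha, beta, gamma]` representation and box vectors. This is a sketch under assumptions: `box_to_vectors` and `vectors_to_box` are hypothetical helpers written for this example, they are not MDAnalysis's own conversion routines, and special-casing right angles is just one way to keep orthorhombic boxes exact.

```python
import numpy as np


def box_to_vectors(box):
    """Convert [lx, ly, lz, alpha, beta, gamma] (degrees) to 3x3 box vectors."""
    lx, ly, lz, alpha, beta, gamma = np.asarray(box, dtype=np.float64)

    def cos_sin(angle):
        # Special-case right angles so cos is exactly 0 and sin exactly 1.
        if angle == 90.0:
            return 0.0, 1.0
        rad = np.deg2rad(angle)
        return np.cos(rad), np.sin(rad)

    cos_a, _ = cos_sin(alpha)
    cos_b, _ = cos_sin(beta)
    cos_g, sin_g = cos_sin(gamma)
    a = np.array([lx, 0.0, 0.0])
    b = np.array([ly * cos_g, ly * sin_g, 0.0])
    cx = lz * cos_b
    cy = lz * (cos_a - cos_b * cos_g) / sin_g
    cz = np.sqrt(lz * lz - cx * cx - cy * cy)
    return np.array([a, b, [cx, cy, cz]])


def vectors_to_box(vectors):
    """Convert 3x3 box vectors back to [lx, ly, lz, alpha, beta, gamma]."""
    a, b, c = np.asarray(vectors, dtype=np.float64)
    lx, ly, lz = (np.linalg.norm(v) for v in (a, b, c))
    alpha = np.rad2deg(np.arccos(np.dot(b, c) / (ly * lz)))
    beta = np.rad2deg(np.arccos(np.dot(a, c) / (lx * lz)))
    gamma = np.rad2deg(np.arccos(np.dot(a, b) / (lx * ly)))
    return np.array([lx, ly, lz, alpha, beta, gamma])


ortho_vectors = box_to_vectors([10.0, 20.0, 30.0, 90.0, 90.0, 90.0])
triclinic = np.array([10.0, 12.0, 15.0, 80.0, 95.0, 110.0])
roundtrip = vectors_to_box(box_to_vectors(triclinic))
print(ortho_vectors)
print(roundtrip)
```

Doing the arithmetic in float64 and down-casting only when writing is one way to limit drift; whether that is sufficient for every format would need to be checked per reader and writer.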
@richardjgowers is this done now?
Expected behaviour
The object returned by ts.dimensions (the numpy array of values) is NOT linked back to the state of the trajectory. The property should return a numpy array of dimensions that has no connection to the current state of trajectory iteration once the value in value = ts.dimensions has been stored, e.g., in a list. Furthermore, this behavior must be consistent between different trajectory formats.
Actual behaviour
I'm confused by the following behavior when tracking dimensions over time with two different formats (and for even more confusion, look at what happens to the stored values for the XTC format after iteration ends!)
Output:
This is incredibly confusing to me. Why on earth is the behavior of the dimension value returned for XTC so different from the one returned for DCD? The latter looks sane, but XTC looks wild and has cost me hours of confusion and debugging in a much larger framework context today.
I'm using Python 3.6 and the dev branch.