Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some frames (the ones at the trajectory boundaries) are not transformed when using transformations on universes from chainreaders #4008

Closed
hejung opened this issue Jan 23, 2023 · 11 comments · Fixed by #3906
Labels
Component-Transformations On-the-fly transformations defect Format-ChainReader ChainReader virtual trajectory reader

Comments

@hejung
Copy link

hejung commented Jan 23, 2023

I am not sure if this is related (i.e. the same underlying cause) as #3343 (there it seems to happen more often than just at the trajectory boundaries, but for the DCD traj below I also have more issues than just at the traj boundaries).
Please let me know if you need any additional info.

Expected behavior

All frames in a trajectory are transformed according to the defined on-the-fly transformations.

Actual behavior

The first frame (frame 0) of all trajectories except the first trajectory one are not transformed (if my transformation depends on an atom group).
Although this seems to be True only for XTC and TRR. The behavior I get If I do the same with the DCD from the testfiles more closely resembles #3343 i.e. every second value is wrong (independent of the trajectory boundaries). See the code example below.

Code to reproduce the behavior

The below code will produce three plots, all of them should be a flat line (we substract the Center of Mass of all atoms in the trafo and plot the center of mass of all atoms). However for XTC and TRR we have one jump at frame 10 (which is the first frame of the second trajectory) and for DCD we have strange oscillations. Adding a third TRR/XTC results in another jump at frame 20. I dont see any correlation for the DCD with the trajectory boundary (at frame 98) except maybe that the oscillattions of the y-coordinate seem even stronger.

import numpy as np
import matplotlib.pyplot as plt
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD,  TPR, XTC, TRR

# define our two trafos, the first one works, the second one does not work on all frames
def shift_up(shift=10):
    def wrapped(ts):
        ts.positions[:, 2] += shift
        return ts
    return wrapped

def shift_by_group(ag):
    # this just shifts all atoms in the box by their center of mass,
    # i.e. afterwards the center of mass of all atoms should be [0,0,0]
    def wrapped(ts):
        shift = ag.center_of_mass()
        ts.positions[:] -= shift
        return ts
    return wrapped

# the lists make sure we get the chainreader
universes = [mda.Universe(PSF, [DCD, DCD]), mda.Universe(TPR, [XTC, XTC]), mda.Universe(TPR, [TRR, TRR])]
traj_formats = ["DCD", "XTC", "TRR"]
fig, axs = plt.subplots(3)

for ax, u, name in zip(axs, universes, traj_formats):
    com = np.empty((len(u.trajectory), 3))
    ag = u.atoms
    trafos = (shift_by_group(ag),)
    u.trajectory.add_transformations(*trafos)
    for i, ts in enumerate(u.trajectory):
         com[i] = ag.center_of_mass()
    ax.plot(com, 'o-')
    ax.set_title(name)
ax.set_xlabel("frames")
fig.tight_layout()

Current version of MDAnalysis

  • Which version are you using? 2.4.2
  • Which version of Python (python -V)? Python 3.9.7
  • Which operating system? linux

EDIT: nicer plot produced.

@hmacdope
Copy link
Member

hmacdope commented Feb 9, 2023

For context there are two in progress fixes / semi-fixes for this #3906 and #3849. We should probably push on these at some point as this is a genuine bug IMO.

@hejung
Copy link
Author

hejung commented Feb 9, 2023

I agree that this is a (serious) bug. A bit more context from my side to explain why I think so:
I think a common workflow when using gromacs is to use mdrun -noappend and have a trajectory span over multiple parts (HPC jobs).
The fact that the first frame of the trajectories is not unwrapped means that I can not/should not use MDAnalysis to calculate anything on my trajectory without using at least trjcat first (which means I have to duplicate the data on disk)...and if I have to call external programs in a script I can just also run trjconv to get the trajectory properly wrapped (making the nice features of on-the-fly transformations essentially useless for me).

@orbeckst orbeckst added Component-Transformations On-the-fly transformations Format-ChainReader ChainReader virtual trajectory reader defect labels Feb 13, 2023
@orbeckst
Copy link
Member

I can reproduce the error with your code (see below).

Regarding the problem with DCD: the warning

mdanalysis/package/MDAnalysis/coordinates/DCD.py:165: DeprecationWarning: DCDReader currently makes independent timesteps by copying self.ts while other readers update self.ts inplace. This behavior will be changed in 3.0 to be the same as other readers. Read more at #3889 to learn if this change in behavior might affect you.
warnings.warn("DCDReader currently makes independent timesteps"

may indicate that due to the above mentioned quirk in the DCDReader there may be issues with transformations and DCDs — the DCD trajectory is fitted/centered on part of the protein so the relatively small movements might just indicate that the transformation was effectively ignored (!). This needs to be looked at!

chainfail

@hmacdope
Copy link
Member

@orbeckst My understanding is that ChainReader is currently broken for all formats.

@orbeckst
Copy link
Member

@hmacdope can you be more precise with "broken"? Do you mean "broken with transformations" or "does not concatenate trajectories correctly and cannot be used"?

@hmacdope
Copy link
Member

@hmacdope can you be more precise with "broken"? Do you mean "broken with transformations" or "does not concatenate trajectories correctly and cannot be used"?

Sorry should have been clearer, broken with transformations in that transformations are not applied on the boundary frames.

@orbeckst
Copy link
Member

This issue looks like a duplicate of #3657 (which seems to be closely related to #3343 as you mentioned @hejung ).

@jaclark5
Copy link
Contributor

jaclark5 commented Feb 16, 2023

I think this is the PR you are thinking of: #3834

The issue is for transformations that require the ts attribute where for multiple files, the new ts object isn't updated until after the transformation. There's a chicken and the egg issue that is easily fixed by checking that the ChainReader.ts id matched the input id(ts). As done in #3834

#3343 is an independent issue/fix with no relevance here. If multiple ts objects are used in the ChainReader, one may end with a frame that is identical to the first frame of the next trajectory. As of now the ChainReader only checks for this and "stitches" them together for two file types (e.g., XTC). I wanted this feature for lammps but it makes the if statement in the ChainReader messy. So I proposed an additional method be added to the Base class that calls the ts.frame to keep the if statement in the ChainReader relatively unchanged. This way I was able to overwrite that method for the LAMMPS file type to use ts.data["step"] instead and make everything cleaner. In theory with this change all file types should now be able to be stitched together, but since I'm not interested in testing and making tests for all of those, I left the file type error check and added a note that a user can easily test and add a new file type.

@richardjgowers
Copy link
Member

the code in PR #3906 gives this graph, which I think looks better

index

@orbeckst
Copy link
Member

This does look much better, both for XTC/TRR and DCD.

@hmacdope
Copy link
Member

I say push forward with #3906 :)

richardjgowers added a commit that referenced this issue May 25, 2023
richardjgowers added a commit that referenced this issue May 26, 2023
* make ChainReader subclass ReaderBase

* deal with kwargs in ChainReader properly

* add test for issue #4008

* stupid fucking linters

* changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component-Transformations On-the-fly transformations defect Format-ChainReader ChainReader virtual trajectory reader
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants