Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when loading multiple large DCD trajectories #4039

Closed
vdeshchenya opened this issue Feb 24, 2023 · 25 comments · Fixed by #4048
Closed

Error when loading multiple large DCD trajectories #4039

vdeshchenya opened this issue Feb 24, 2023 · 25 comments · Fixed by #4048

Comments

@vdeshchenya
Copy link

Expected behavior

I create a Universe with several large trajectory files (size > 2 GB) and use the transfer_to_memory function to load them.

Actual behavior

The transfer_to_memory function fails after loading 53458/117600 frames with the error "OSError: DCD seek failed with DCD error=Normal EOF".

Loading frames:  45%|████████████████████████████████████████████████████████▎                                                                   | 53458/117600 [00:07<00:09, 7050.72it/s]
Traceback (most recent call last):
  File "/Users/vladimir/test.py", line 15, in <module>
    u.transfer_to_memory(verbose=True)
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 626, in transfer_to_memory
    for i, ts in enumerate(ProgressBar(self.trajectory[start:stop:step],
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/MDAnalysis/coordinates/chain.py", line 672, in __next__
    self.ts = self.active_reader[f]
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/MDAnalysis/coordinates/base.py", line 828, in __getitem__
    return self._read_frame_with_aux(frame)
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/MDAnalysis/coordinates/base.py", line 860, in _read_frame_with_aux
    ts = self._read_frame(frame)  # pylint: disable=assignment-from-no-return
  File "/opt/anaconda3/envs/work/lib/python3.9/site-packages/MDAnalysis/coordinates/DCD.py", line 198, in _read_frame
    self._file.seek(i)
  File "MDAnalysis/lib/formats/libdcd.pyx", line 398, in MDAnalysis.lib.formats.libdcd.DCDFile.seek
OSError: DCD seek failed with DCD error=Normal EOF

(53458 frames take up about 2 GB of memory)

Code to reproduce the behavior

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD

u = mda.Universe(PSF, DCD)

# creating a big trajectory (2.2 GB)
ag = u.select_atoms("all")
with mda.Writer('test.dcd', ag.n_atoms) as w:
    for _ in range(600):
        for ts in u.trajectory:
            w.write(ag)


u = mda.Universe(PSF, 'test.dcd', 'test.dcd')
u.transfer_to_memory(verbose=True)

Current version of MDAnalysis

  • Which version are you using? (run python -c "import MDAnalysis as mda; print(mda.__version__)"): 2.4.2
  • Which version of Python (python -V)?: Python 3.9.15
  • Which operating system?: MacOS Ventura 13.1 (22C65)
@hmacdope
Copy link
Member

I will confirm but this may be related to a change in DCD reading in #3882.

@IAlibay
Copy link
Member

IAlibay commented Feb 27, 2023

I don't think this is a DCD corruption issue, but instead a problem with an int overflow in the DCD code on Windows machines, or more specifically the issue outlined here: #3075

@vdeshchenya could you comment on how many frames you were expected to read from this DCD?

@hmacdope
Copy link
Member

@IAlibay IIRC did recently get this on my Linux box but will confirm.

@IAlibay
Copy link
Member

IAlibay commented Feb 27, 2023

ah I missed that OP was on macos, that's a lot more worrisome.

@orbeckst
Copy link
Member

I can reproduce on macOS 10.15.7, Python 3.10, current mda develop

In [3]: u.transfer_to_memory(verbose=True)
Loading frames:  45%|████████▏         | 53458/117600 [00:06<00:08, 7846.35it/s]
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In [3], line 1
----> 1 u.transfer_to_memory(verbose=True)
...
File ~/MDAnalysis/mdanalysis/package/MDAnalysis/coordinates/base.py:860, in ProtoReader._read_frame_with_aux(self, frame)
    858 def _read_frame_with_aux(self, frame):
    859     """Move to *frame*, updating ts with trajectory, transformations and auxiliary data."""
--> 860     ts = self._read_frame(frame)  # pylint: disable=assignment-from-no-return
    861     for aux in self.aux_list:
    862         ts = self._auxs[aux].update_ts(ts)

File ~/MDAnalysis/mdanalysis/package/MDAnalysis/coordinates/DCD.py:198, in DCDReader._read_frame(self, i)
    196 """read frame i"""
    197 self._frame = i - 1
--> 198 self._file.seek(i)
    199 return self._read_next_timestep()

File MDAnalysis/lib/formats/libdcd.pyx:398, in MDAnalysis.lib.formats.libdcd.DCDFile.seek()

OSError: DCD seek failed with DCD error=Normal EOF

In [4]: u.trajectory
Out[4]: <ChainReader containing test.dcd, test.dcd with 117600 frames of 3341 atoms>

@orbeckst
Copy link
Member

And it has nothing to do with MemoryReader

In [6]: for ts in mda.log.ProgressBar(u.trajectory):
   ...:     pass
   ...:
 45%|███████████████                  | 53458/117600 [00:04<00:05, 12304.21it/s]
 ---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In [6], line 1
----> 1 for ts in mda.log.ProgressBar(u.trajectory):
      2     pass
...
File ~/MDAnalysis/mdanalysis/package/MDAnalysis/coordinates/DCD.py:198, in DCDReader._read_frame(self, i)
    196 """read frame i"""
    197 self._frame = i - 1
--> 198 self._file.seek(i)
    199 return self._read_next_timestep()

File MDAnalysis/lib/formats/libdcd.pyx:398, in MDAnalysis.lib.formats.libdcd.DCDFile.seek()

OSError: DCD seek failed with DCD error=Normal EOF

@hmacdope
Copy link
Member

@orbeckst what happens if you use pre- #3882 ? I can check also at some point, but good to verify on same box?

@orbeckst
Copy link
Member

orbeckst commented Feb 28, 2023

I installed at 2d98ad5 and I can read the trajectory (see below). This is a regression introduced after 2d98ad5 (possibly #3882) but I haven't run git bisect — not enough time.

In [6]: u_long = mda.Universe(PSF, 'test.dcd', 'test.dcd')
In [7]: for ts in mda.log.ProgressBar(u_long.trajectory):
   ...:     pass
   ...:
100%|█████████████████████████████████| 117600/117600 [00:14<00:00, 8292.91it/s]

create_test.py

Run once to create the test trajectory:

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD

u = mda.Universe(PSF, DCD)

# creating a big trajectory (2.2 GB)
ag = u.select_atoms("all")
with mda.Writer('test.dcd', ag.n_atoms) as w:
    for _ in mda.log.ProgressBar(range(600)):
        for ts in u.trajectory:
            w.write(ag)

(I haven't tested if the trajectory size (2.2G) or the number of frames is the issue.)

read trajectory

Should go through without raising an exception

import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF

u_long = mda.Universe(PSF, 'test.dcd', 'test.dcd')
for ts in mda.log.ProgressBar(u_long.trajectory):
    pass

@orbeckst
Copy link
Member

Instead of git bisect I quickly built at d63cd6d (i.e. #3882) ... and it fails so @hmacdope your intuition was correct.

$ python test_long_traj.py
/Users/oliver/anaconda3/envs/mda310pypi/lib/python3.10/site-packages/MDAnalysis/coordinates/DCD.py:165: DeprecationWarning: DCDReader currently makes independent timesteps by copying self.ts while other readers update self.ts inplace. This behaviour will be changed in 3.0 to be the same as other readers
  warnings.warn("DCDReader currently makes independent timesteps"
 45%|███████████████                  | 53458/117600 [00:04<00:05, 11626.72it/s]
...
  File "~/anaconda3/envs/mda310pypi/lib/python3.10/site-packages/MDAnalysis/coordinates/DCD.py", line 196, in _read_frame
    self._file.seek(i)
  File "MDAnalysis/lib/formats/libdcd.pyx", line 398, in MDAnalysis.lib.formats.libdcd.DCDFile.seek
OSError: DCD seek failed with DCD error=Normal EOF

@orbeckst
Copy link
Member

@hmacdope do you know how to fix this? This is a pretty serious regression, given the prevalence of DCD files.

@hmacdope
Copy link
Member

Ill have a look, I think I know what the issue is.

@hmacdope
Copy link
Member

hmacdope commented Feb 28, 2023

I agree its high priority. I wonder why no tests picked it up at the time, suggests it may be size related which leads me to suspect some kind of overflow as #3882 #3888 added C typing.

@hmacdope
Copy link
Member

I may not be able to get to this in the next few days. @orbeckst do we want to revert #3888 in the meantime?

@IAlibay
Copy link
Member

IAlibay commented Feb 28, 2023

@hmacdope is this just affecting develop or a previous released version too?

@hmacdope
Copy link
Member

hmacdope commented Feb 28, 2023

Yep @IAlibay >= 2.4.0 is affected. 😞

@IAlibay
Copy link
Member

IAlibay commented Feb 28, 2023

Yep @IAlibay >= 2.4.0 is affected. 😞

happens, do we want to do a bugfix release?

@richardjgowers
Copy link
Member

richardjgowers commented Feb 28, 2023 via email

@orbeckst
Copy link
Member

The test is a bit annoying because we need to generate the trajectory but… eventually having a “big traj” test for all readers might be useful.

@hmacdope
Copy link
Member

Can probably be done programmatically for all readers in the Reader test base class thingo.

@sef43
Copy link

sef43 commented Mar 1, 2023

this line will overflow with int math when frame number and framesize are both greater than 50,000 ish:

offset = self._header_size + self._firstframesize + self._framesize * (frame - 1)

richardjgowers added a commit that referenced this issue Mar 1, 2023
use fio_size_t for all variables related to filesizes

fixes for #4039
richardjgowers added a commit that referenced this issue Mar 1, 2023
@hmacdope
Copy link
Member

hmacdope commented Mar 1, 2023

@sef43 thanks for the tip! much appreciated. I think @richardjgowers has sorted this in his fix (#4048). Many thanks for cleaning up after me.

IAlibay pushed a commit that referenced this issue Mar 29, 2023
Fixes #4039
* Fixes DCD seeking for large (2Gb+) files.
IAlibay pushed a commit that referenced this issue Mar 29, 2023
Fixes #4039
* Fixes DCD seeking for large (2Gb+) files.
@github-staff github-staff deleted a comment from AHMED-salah00 Oct 23, 2024
@qlearn-code
Copy link

I am having the same problem. Mdanalysis version is 2.8.

@orbeckst
Copy link
Member

orbeckst commented Jan 2, 2025

@qlearn-code can you please open a new issue?

Describe your problem and you can mention this issue.

Until we determine that your problem has the same cause as this one (which was supposedly fixed) we want to discuss it independently as this helps us to better keep track of problems. Thank you.

@qlearn-code
Copy link

qlearn-code commented Jan 2, 2025 via email

@qlearn-code
Copy link

qlearn-code commented Jan 2, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants