Bus error on version 0.2.3 #246
Comments
Hi @garrison, can you tell me more about the system and what you are trying to do (e.g. file size, number of workers)? According to Wikipedia, bus errors tend to be quite rare, and on Linux systems they mostly show up when there are problems with memory mapping. (so if you use Taking a wild guess, can you make sure that your disk quota is not full? Are you certain that different workers don't accidentally try to write/remove the same files? Edit: The stack trace in #127 looks quite similar to yours. |
Hi @JonasIsensee. I am experiencing this error with exclusive access to a Regular Shared Memory node on Bridges, a machine at the Pittsburgh Supercomputing Center. The code uses only a small portion of the available RAM. I am using one worker per available core, so 28 workers. Different workers are not accessing the same files. I do not expect it is a quota issue, as I am well below quota, and the error is only intermittent. The successfully saved files are ~16 megabytes. I can try |
Oh, I could hardly have been more wrong then. I had a look at the diffs between the latest patch releases of JLD2, and the changes that might have caused this (at least my best guess) were part of v0.2.1 (#231). So your errors should persist on v0.2.1. |
I've hit this twice today on
I have yet to experience the error since downgrading to 0.2.1. I was hitting it fairly regularly on 0.2.3. I haven't tried Another, less likely possibility is that this change to JLD2 is correct but somehow caused some other underlying issue with Julia to surface. |
p.s. I am using julia v1.5.2. |
Thanks to both of you for trying to help figure this out. You're right. #230 did change some lines in the Apparently in some cases misused / poorly implemented There is this JuliaLang/julia#28245 |
By the way, is this reproducible for the exact same file? If so, the cause could be an error in the byte-count calculation, so that we end up trying to write to a region that is not mapped. |
These are Monte Carlo simulations, so the data is different in each (different random seed) |
Nevermind; there is one |
I'm not sure how to proceed from here.
Can you tell which seeds produced the error, and e.g. rerun for the exact same seed? If that can be used to reproduce the error, would you consider privately sharing that data with me? |
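For reference, rerunning a single seed could look roughly like the sketch below. `run_simulation` is a hypothetical stand-in for the actual Monte Carlo code, which is not shown in this thread; the only real API used is `MersenneTwister` from Julia's `Random` standard library.

```julia
using Random

# Hypothetical stand-in for the real simulation: all randomness is drawn
# from an RNG constructed from the given seed.
function run_simulation(seed::Integer; nsamples::Integer = 1_000)
    rng = MersenneTwister(seed)
    return rand(rng, nsamples)
end

# Rerunning with the same seed regenerates the same data, so a seed that
# triggered the error can be replayed on its own.
first_run = run_simulation(1234)
second_run = run_simulation(1234)
println(first_run == second_run)
```

If the failing seed is logged next to each output file, the same data can be regenerated and written again in isolation, which would show whether the error depends on the data at all.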
The easiest way to get a bus error (JLD2, same stack trace) is to have 2 different processes writing to the same .jld file, but then that is a dangerous thing to do anyway |
Fair enough. But as you say, that is an evil thing to do and there is absolutely no reason to expect that to work. |
I agree, but maybe that is what OP was doing. |
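One way to rule out two processes writing to the same file is to derive each output path from values that differ per task. The following is only a sketch using Julia's Base functions; the helper names `output_path` and `assert_unique`, the `results` directory, and the file naming scheme are all assumptions for illustration and are not taken from the original code.

```julia
# Hypothetical helper: build an output path that is unique per seed and per
# process, so that no two workers can end up targeting the same file.
function output_path(dir::AbstractString, seed::Integer; pid::Integer = getpid())
    return joinpath(dir, "result_seed$(seed)_pid$(pid).jld2")
end

# Check a batch of planned paths for collisions before launching any work.
function assert_unique(paths)
    length(unique(paths)) == length(paths) || error("duplicate output paths")
    return paths
end

paths = assert_unique([output_path("results", seed) for seed in 1:28])
println(first(paths))
```

Checking the planned paths up front turns an accidental collision into an immediate, readable error rather than an intermittent crash during the write.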
I've been able to produce the error myself now. I was writing on 32 different worker processes (on the same machine) to independent files. No multithreading involved. |
Hi @garrison, do you still get this error? |
Since this hasn't occurred to me in ages and apparently no one else either, I'll close this for now. |
I have been experiencing intermittent bus errors, I believe since upgrading to v0.2.3.
Here's a typical traceback:
Unfortunately I don't have a minimal test case, but I have now noticed this on two different (unrelated) code paths, so I believe it is a problem with JLD2 (perhaps combined with some aspects of my configuration). Both code paths only began experiencing errors recently. I was using v0.2.1 previously.