Get rid of persistent build errors due to non-atomic file writes in GHC and stack #4559
Comments
In case we need to add atomic writes for stack, in commercialhaskell/rio#138 @roman introduced atomic+durable file writes that we can use. However, as per this:
We probably want to add that since for many cases, atomic-but-not-durable is enough, and durability ( |
New task done:
Here comes I made it for my work on https://phabricator.haskell.org/D42, but it's very useful for this problem, too. |
My GHC patch for writing https://gitlab.haskell.org/ghc/ghc/merge_requests/391 This should fix the biggest source of these errors, but the other files that GHC and its subprograms write still need the same treatment. |
@nh2 I don't think this issue belongs on the Stack issue tracker, as it's entirely upstream. Any objection to closing it? |
The main GHC bug has been fixed, but I still wanted to do an investigation with hatrace on
to ensure that there's no case where stack may write non-atomic files. I'd like to write a hatrace filter like
|
In nh2/hatrace#9 @qrilka implemented a In nh2/hatrace#9 (comment) we recorded its output for a Especially interesting are the entries
which disappear after |
I think the fastest way to investigate it would be to add a command like |
This is referring to writes in GHC itself, not Stack, correct? |
@snoyberg most of the non-atomically written files are from GHC; I think we see some from Cabal, e.g. from the
And during our call Niklas was able to reproduce Stack failure by cutting one of those |
We were testing 1.9.3 so it makes total sense to do proper testing with |
Yes, those look relevant. The last one suffers from non-atomic writes to |
GHC as of writing does not ensure that files are written atomically.
This means that a Ctrl+C, kill or reboot at the right time can result in truncated files.
GHC does not detect this, so resuming/rerunning the build with ghc --make continues to show error messages.
In such a situation, the only workaround is to wipe all files (e.g. stack/cabal/make clean).
Specifically, we've observed the following to happen:
There may be situations where stack has this problem too, but so far we believe all occurrences of this that we see are in GHC.
GHC issue about this: #14533 - Make GHC more robust against PC crashes by using atomic writes
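For illustration, the usual way to avoid truncated files is to write to a temporary file in the same directory and then rename it over the target. The following is a minimal Haskell sketch of that idea, not the code from the GHC patch or from rio. The name writeFileAtomic and the ".tmp" template suffix are hypothetical, and the sketch assumes a POSIX file system where renameFile from the directory package maps to rename(). It is atomic but not durable, since it does not fsync the file or its directory.

```haskell
import Control.Exception (bracketOnError)
import System.Directory (removeFile, renameFile)
import System.FilePath (takeDirectory, takeFileName)
import System.IO (hClose, hPutStr, openTempFile)

-- | Hypothetical helper (not part of GHC, stack, or rio):
-- write the contents to a temporary file next to the target and then
-- rename it into place. Readers see either the old file or the complete
-- new file, never a partially written one.
writeFileAtomic :: FilePath -> String -> IO ()
writeFileAtomic target contents =
  bracketOnError
    -- The temp file lives in the target's directory so that the rename
    -- stays on the same file system.
    (openTempFile (takeDirectory target) (takeFileName target ++ ".tmp"))
    -- Runs only if an exception occurs: close the handle and remove the
    -- temp file; the target is left untouched.
    (\(tmpPath, h) -> hClose h >> removeFile tmpPath)
    (\(tmpPath, h) -> do
        hPutStr h contents
        hClose h
        -- Assumption: on POSIX this is a rename() call, which replaces an
        -- existing target atomically.
        renameFile tmpPath target)
```

If the process is killed before the rename, the target keeps its previous contents and only a stray temporary file is left behind, which a later build can ignore or delete.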
Repro
@lehins made a repro script that shows the problem in GHC at https://github.com/lehins/exec-kill-loop
Planned solution
rename() syscall).
hatrace test)
Related issues
Issues for which this may be the reason