Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmake: atomic rename to fix toolchain cache creation race #15128

Merged
merged 1 commit into from
Apr 17, 2019

Conversation

marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented Apr 2, 2019

While this race seems unlikely and harmless let's play it safe and
implement the usual solution: temporary file + atomic rename.

The race is unlikely and maybe even harmless because:

  • Only sanitycheck seems to invoke cmake concurrently.
  • Users rarely delete their ~/.cache/zephyr/ToolchainCapabilityDatabase/
  • All concurrent cmake processes write the same, single byte to the same
    files.
  • Creating a single byte is at least very fast, so extremely short
    window for others to read an empty file.

For additional background see links in issue #9992

Signed-off-by: Marc Herbert [email protected]

While this race seems unlikely and harmless let's play it safe and
implement the usual solution: temporary file + atomic rename.

The race is unlikely and maybe even harmless because:

- Only sanitycheck seems to invoke cmake concurrently.
- Users rarely delete their ~/.cache/zephyr/ToolchainCapabilityDatabase/
- All concurrent cmake processes write the same, single byte to the same
  files.
- Creating a single byte is at least very fast, so extremely short
  window for others to read an empty file.

For additional background see links in issue zephyrproject-rtos#9992

Signed-off-by: Marc Herbert <[email protected]>
@marc-hb marc-hb added area: Build System area: Sanitycheck Sanitycheck has been renamed to Twister bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug labels Apr 2, 2019
@marc-hb marc-hb requested review from carlescufi and mbolivar April 2, 2019 22:03
@mbolivar
Copy link
Contributor

mbolivar commented Apr 2, 2019

All concurrent cmake processes write the same, single byte to the same files.

It's been a while since I've read this file, but if that is true, then it sounds like the existence check + creation is idempotent and there is no race. What am I missing?

@marc-hb
Copy link
Collaborator Author

marc-hb commented Apr 3, 2019

What am I missing?

Two things:

  • The last item in the list in the commit message.
  • The official and documented guarantee that all filesystems on all supported operating systems handle such concurrent writes properly. Who knows, maybe sanitycheck will run even on Windows some day.

Easier to just play it safe, just a few lines change. Elusive race conditions tend to be very time-consuming.

@mbolivar
Copy link
Contributor

mbolivar commented Apr 3, 2019

The last item in the list in the commit message.

I think you're saying this is the race (I find it helpful to provide explicit sequences like this that demonstrate the bad behavior):

  1. Process A creates the file
  2. Process B checks for the file's existence and reads its contents, which are empty
  3. Process A writes the byte to the file

If that's possible, I agree this fixes the issue.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Apr 3, 2019

(a small variation of) the race you described is most certainly possible with a "large" file.

It may or may not be possible with a small file, however it's much more productive to just avoid this race rather than researching the question for every filesystem in every operating system.

@SebastianBoe
Copy link
Collaborator

SebastianBoe commented Apr 3, 2019

For future reference:

before we had two processes writing the same single byte to the same file. After this change we will have two processes renaming two differently named single-byte-files (containing the same byte) to the same file. E.g. we believe that

echo "0" > tmp_${RANDOM}_foo.txt
mv tmp_${RANDOM}_foo.txt foo.txt

is more threadsafe than

echo "0" > foo.txt

I don't know enough about filesystems to question this, so this is OK by me. But i'd prefer it was merged outside of the stabilization period in case there is some unintended side-effect that we have not foreseen.

@nashif nashif merged commit f098b44 into zephyrproject-rtos:master Apr 17, 2019
@marc-hb marc-hb deleted the cmake-cache-race branch April 20, 2019 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Build System area: Sanitycheck Sanitycheck has been renamed to Twister bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants