-
Notifications
You must be signed in to change notification settings - Fork 778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Write validator definitions atomically #2338
[Merged by Bors] - Write validator definitions atomically #2338
Conversation
9d60924
to
75da0ed
Compare
MVP for reproducing the bug: use std::fs::File;
use std::io::Write;
fn main() {
let mut f1 = File::create("/lol/good_file.txt").expect("creating good file failed");
let mut f2 = File::create("/lol/junk_file.txt").expect("creating bad file failed");
let megabyte_of_crap = vec![0xff; 1024 * 1024];
f2.write_all(&megabyte_of_crap)
.expect("writing bad file failed");
f1.write_all(b"This is going to go missing :(")
.expect("writing good file failed");
} For use with a 1MB tempfs mounted on
If you create I used the same tempfs setup to simulate a validator datadir running out of space, and was not able to reproduce the bug using the updated code, implying that it is fixed. |
I've pushed commit 9d60924 to the Pyrmont nodes and the VCs restarted twice without issue (once from old->new, once from new->new). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good, but I have a comment regarding permissions :)
create_with_600_perms(&temp_path, &bytes).map_err(Error::UnableToWriteTempFile)?; | ||
|
||
// With the temporary file created, perform an atomic rename. | ||
fs::rename(&temp_path, &config_path).map_err(Error::UnableToRenameFile)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I haven't tested directly, I think this will technically be a breaking change since we will replace the current perms of the non-temp file with 600.
In effect, Lighthouse will continuously force its own file permissions upon that file. I'm not sure this is necessarily a bad thing, but perhaps it might hurt people with fancy setups?
Copying the permissions of the non-temp file to the temp file seems like the least-breaky approach, but I imagine that might get complex in a cross-platform world. Perhaps you've considered this already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. I'll look into this further tomorrow.
@ethDreamer if you have a sec and would like to chime in on the windows situation that would be great (but no worries if you're busy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've found a good solution for the permissions that will work cross platform, but doesn't require us to block this PR on Windows support. The function write_file_via_temporary
uses fs::copy
to copy from primary file to temp (which copies permissions), writes to the temp file, and then moves it into place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, nice fix!
There's a potential for some weirdness during concurrent execution, but that's not introduced by this PR nor do I think it poses a material risk.
Merge at will!
Thank you! bors r+ |
## Issue Addressed Closes #2159 ## Proposed Changes Rather than trying to write the validator definitions to disk directly, use a temporary file called `.validator_defintions.yml.tmp` and then atomically rename it to `validator_definitions.yml`. This avoids truncating the primary file, which can cause permanent damage when the disk is full. The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords. ## Additional Info * `File::create` truncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.create * `fs::rename` uses `rename` on UNIX and `MoveFileEx` on Windows: https://doc.rust-lang.org/std/fs/fn.rename.html * UNIX `rename` call is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fs * Windows `MoveFileEx` is _not_ atomic in general, and Windows lacks any clear API for atomic file renames :( https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows ## Further Work * Consider whether we want to try a different Windows syscall as part of #2333. The `rust-atomicwrites` crate seems promising, but actually uses the same syscall under the hood presently: untitaker/rust-atomicwrites#27.
Pull request successfully merged into unstable. Build succeeded: |
Issue Addressed
Closes #2159
Proposed Changes
Rather than trying to write the validator definitions to disk directly, use a temporary file called
.validator_defintions.yml.tmp
and then atomically rename it tovalidator_definitions.yml
. This avoids truncating the primary file, which can cause permanent damage when the disk is full.The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords.
Additional Info
File::create
truncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.createfs::rename
usesrename
on UNIX andMoveFileEx
on Windows: https://doc.rust-lang.org/std/fs/fn.rename.htmlrename
call is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fsMoveFileEx
is not atomic in general, and Windows lacks any clear API for atomic file renames :(https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows
Further Work
rust-atomicwrites
crate seems promising, but actually uses the same syscall under the hood presently: MoveFileEx might be not atomic untitaker/rust-atomicwrites#27.