-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: impl rename_noreplace with std::fs::hard_link by default #621
Conversation
All tests are passing, including Python on old glibc (💯). I'm throwing out a fair amount of code in this though, so I'd welcome any critical feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea @wjones127! My current thinking on this is we could use hardlink as the fallback when native atomic rename is not available. Here is my reasoning for keeping renameat2:
- Hardlink followed by remove is not atomic, so in very rare edge cases, we could leave the old path behind
- It is less efficient compared to renameat2 because we need to issue 2 system calls instead of just one.
Both of the above shortcomings probably won't matter much in most of the production workloads because deltalake is designed for low frequency commits, but I think the tradeoff is worth it in this case because the added complexity is manageable (about 200+ LoC). Happy to hear your thought on this as well.
rust/src/storage/file/rename.rs
Outdated
Ok(()) | ||
}) | ||
.await | ||
.unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we propagate the io error back to the caller instead?
This is already true for the S3 and GCS, right? So only Azure and local filesystems support this operation with that kind of atomicity. I think object_store_rs is aiming to support operations that are common across object stores, with similar guarantees across them. Why enforce that atomicity for some but not others? And Windows is not atomic in the way we want right now; it does a check for a file and then runs replace, which could overwrite if there is a concurrent writer. So I think
True, but we only run this operation once per a commit, so it's doubtful there's a performance hit in the big picture. In object_store_rs, I implemented So maybe for time being, I'll do this just for Windows and old glibc Linux, and we'll think more about whether we really need this for other platforms. |
This is what I have in mind as well. Just to clarify, I wasn't suggesting we add this to object_store_rs. I was suggesting we keep it within delta-rs assuming the maintenance burden is low. If that's not the case, I am cool with us switching to hardlink everywhere. Like I mentioned in my previous comment, I don't think the overhead and correctness issue would matter in production. I am only suggested keeping it because I would like delta-rs to adhere to Rust and C++'s zero cost abstraction philosophy as much as we could. I am also not the one who is writing the code, so I won't be blocking the merge for minor nitpicks like this :D
100% agree, I was trying to suggest that we use renameat2 if it's available, otherwise fallback to hardlink+remove. The current windows implementation is not multi-writer safe as mentioned in the fs module's documentation.
The performance trade offs between PUT if absent and atomic rename is quite nuance. My current understanding of this is that for non-append only and single writer append only commits, PUT if absent is more performant and correct because it only requires a single API call. For highly concurrent append only commits, atomic rename is more performant because it avoids repetitive object writes on subsequent retries. Because delta lake is not designed for highly current write use-cases, I think we should go with PUT if absent whenever possible. This means even though both Azure and HDFS support atomic rename, we probably shouldn't be using it. Therefore, I think we should probably rewrite the commit loop in delta-rs to use put if absent instead. |
Good points there; I haven't looked to far into that yet, but I might try that when writing the PR to switch to object_store_rs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Description
I've been working in influxdata/object_store_rs, with the goal of switching to that object store implementation in delta-rs (see #610). It was suggested to simply use hard_link to implement
rename_no_replace()
easily cross platform.Since this is causing issues in the Python writer right now (see #620), I'm thinking maybe I should patch this change immediately.
This PR provides a default implementation of
rename_noreplace()
usingstd::fs::hard_link
, which should be supported on all platforms. It also enables Python writer tests for all platforms.Related Issue(s)
Documentation