-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ObjectStore] Add append
API impl for LocalFileSystem
#3824
Conversation
append
API impl for LocalFileSystem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @metesynnada
I also verified that the semantics match what is described on
arrow-rs/object_store/src/lib.rs
Lines 315 to 334 in fb35d26
/// Returns an [`AsyncWrite`] that can be used to append to the object at `location` | |
/// | |
/// A new object will be created if it doesn't already exist, otherwise it will be | |
/// opened, with subsequent writes appended to the end. | |
/// | |
/// This operation cannot be supported by all stores, most use-cases should prefer | |
/// [`ObjectStore::put`] and [`ObjectStore::put_multipart`] for better portability | |
/// and stronger guarantees | |
/// | |
/// This API is not guaranteed to be atomic, in particular | |
/// | |
/// * On error, `location` may contain partial data | |
/// * Concurrent calls to [`ObjectStore::list`] may return partially written objects | |
/// * Concurrent calls to [`ObjectStore::get`] may return partially written data | |
/// * Concurrent calls to [`ObjectStore::put`] may result in data loss / corruption | |
/// * Concurrent calls to [`ObjectStore::append`] may result in data loss / corruption | |
/// | |
/// Additionally some stores, such as Azure, may only support appending to objects created | |
/// with [`ObjectStore::append`], and not with [`ObjectStore::put`], [`ObjectStore::copy`], or | |
/// [`ObjectStore::put_multipart`] |
cc @tustvold
@@ -194,7 +194,7 @@ impl<W: Write> Writer<W> { | |||
} | |||
|
|||
/// A CSV writer builder | |||
#[derive(Debug)] | |||
#[derive(Clone, Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My experience of target-specific feature selection has been that they cause very strange and hard to debug problems, the ahash getrandom nonsense is a frequent source of pain.
I wonder if we could instead add a default-disabled top-level feature, maybe append_local
that gates this functionality? This would also help with flagging the functionality as not part of the standard object store contract.
IMO the target-specific selection in this case is straightforward/surgical and I don't immediately see any potential headaches related to this in the future. Given that the only target this doesn't support this is WASM, keeping the selection seems OK to me. BTW, I am not categorically against the @alamb, @tustvold: Let us know what you think and we can finalize accordingly. Thanks for the reviews! |
I am not sure about the WASM usecase in general for object_store (as in I don't really understand why it is an an important target to support for object_store at all). If we are looking for simplifications maybe we can remove that feature support instead of making a new feature flag 🤔 I agree with @ozankabak that making a new feature flag for append seems overly complicated. I think this would be fine to merge as is. What do you think @tustvold ? |
Fine with me, was just a suggestion. |
Thanks everyone! |
Benchmark runs are scheduled for baseline = c96274a and contender = 9ce0ebb. 9ce0ebb is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
* Append Push API * wasm is not enabled.
* Append Push API * wasm is not enabled.
Which issue does this PR close?
Closes #3740 and closes #3742 since we go down a different path.
Rationale for this change
This PR adds the
LocalFileSystem
implementation for the "append" API.What changes are included in this PR?
LocalFileSystem
.However, to simplify things, we used the
tokio::fs
feature which is not supported onwasm32
. There is another implementation on synnada-ai#5 that alters theLocalUpload
struct to support both staged and unstaged file writings.Are there any user-facing changes?
No.