Zarr DirectoryStore with temporary directory-based transaction #247
Comments
I guess you saw this already from the recent PRs, but writing an item is
already atomic, in the sense that the content is written to a temporary
file then moved into place.
Currently the temporary files are created in the same directory as the
destination files. This was to ensure that the move would always be cheap,
a simple rename. If the temporary file was in a different directory then
this could be on a different file system and so move would not be cheap.
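To make the pattern described above concrete, here is a minimal sketch of a write that goes through a temporary file in the destination's own directory. It is not the actual DirectoryStore code; the function name `atomic_write` and the `.partial` suffix are assumptions made up for illustration.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to `path` via a temporary file in the same directory.

    Hypothetical helper for illustration; not part of Zarr.
    """
    dest_dir = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the destination so that the final
    # move stays on one file system and is a plain rename.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Replaces any existing file at `path` in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if the write or the move failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader of `path` sees either the old content or the new content, never a half-written file. If the temporary file lived in another directory, possibly on another file system, the final step could turn into a copy and that guarantee would no longer hold.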
We could consider allowing the user to specify a "partial dir" where
temporary files are written, but I'd like to have a motivating use case.
For deleting items, I'm surprised that deletion is not an atomic OS
operation. I thought deletion meant just unlinking the file? Not an expert
in these things tho.
On 22 Mar 2018 6:00 pm, "jakirkham" wrote:
On some file systems operations like writing and deleting can be a bit
slow and can sometimes fail. It would be nice in these cases to use a
temporary directory for intermediate steps. For instance writing of chunks
could occur in a temporary directory with each chunk only getting moved in
once the operation completes. Also deleting can simply move the content
into a temporary directory and perform the deletion. As operations like
rename are atomic on POSIX systems, this ensures at the end of operations
like writing and deleting content that the DirectoryStore is never in an
incomplete state. Further operations like deletion could be more easily
performed in parallel as the content already appears to be deleted even
though it merely got moved to a temporary directory somewhere where it is
still getting cleaned up.
Yeah, on NFS deletion is really slow. At least that has been my experience. Currently we move stuff to a temporary directory on the same filesystem and submit separate jobs to do the deletion in the background to work around this issue. It's unfortunately actually that bad. Was coming around to the same idea of having a Zarr local temporary directory (e.g.
A use-case for this feature has come up for us when writing to Zarr from jobs that may occasionally be pre-empted, e.g., using pre-emptible VMs on Google Cloud. This is generally very cost effective, but does mean that jobs need to be robust to being restarted at any time. Atomic writes for individual files are helpful, but ideally we would also like to either write the full Zarr store, or nothing at all. This might be done with context managers, e.g.,

    with zarr.transaction():
        dataset.to_zarr(...)  # using xarray

Or alternatively, perhaps via a custom store class, e.g.,

    store = AtomicWriteStore(...)
    dataset.to_zarr(store)
    store.commit()
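For a directory-backed store, one way the all-or-nothing behaviour could work is to write everything into a staging directory and rename it into place at the end. The sketch below uses only the standard library. The name `directory_transaction` and the `.staging-` prefix are hypothetical, and it is not an existing Zarr or xarray API. It assumes the final path does not exist yet and that the staging directory sits on the same file system.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def directory_transaction(final_path):
    """Yield a staging directory; rename it to `final_path` only on success.

    Hypothetical helper for illustration; assumes `final_path` does not
    already exist.
    """
    final_path = os.path.abspath(final_path)
    parent = os.path.dirname(final_path)
    # Stage next to the final location so the commit is a same-file-system rename.
    staging = tempfile.mkdtemp(dir=parent, prefix=".staging-")
    try:
        yield staging
        # Commit: the store appears under its final name in one step.
        os.rename(staging, final_path)
    except BaseException:
        # Roll back: discard everything written so far.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

With xarray this might be used as `with directory_transaction("out.zarr") as staging: dataset.to_zarr(staging)`. A job that is pre-empted hard (killed without running cleanup) would leave a `.staging-*` directory behind, but never a partial store under the final name.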
On some file systems operations like writing and deleting can be a bit slow and can sometimes fail. It would be nice in these cases to use a temporary directory for intermediate steps. For instance writing of chunks could occur in a temporary directory with each chunk only getting moved in once the operation completes. Also deleting can simply move the content into a temporary directory and perform the deletion. As operations like rename are atomic on POSIX systems, this ensures at the end of operations like writing and deleting content that the DirectoryStore is never in an incomplete state. Further operations like deletion could be more easily performed in parallel as the content already appears to be deleted even though it merely got moved to a temporary directory somewhere where it is still getting cleaned up.
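The deletion half of this idea can be sketched roughly as follows: rename the content into a trash directory on the same file system, then clean it up in the background. The function name `fast_delete` and the `trash_dir` argument are made up for illustration. A background thread stands in here for whatever does the real cleanup (for example a separate job).

```python
import os
import shutil
import tempfile
import threading


def fast_delete(path, trash_dir):
    """Rename `path` into `trash_dir`, then remove it in a background thread.

    Hypothetical helper for illustration; `trash_dir` is assumed to be on
    the same file system as `path` so that the rename is cheap.
    """
    os.makedirs(trash_dir, exist_ok=True)
    # A unique directory inside the trash avoids name clashes between deletions.
    doomed = tempfile.mkdtemp(dir=trash_dir)
    # After this rename the content no longer appears in the store.
    os.rename(path, os.path.join(doomed, "item"))
    # The slow recursive removal happens off the caller's critical path.
    worker = threading.Thread(
        target=shutil.rmtree, args=(doomed,), kwargs={"ignore_errors": True}
    )
    worker.start()
    return worker
```

The caller can carry on as soon as the rename returns and only needs to `join()` the returned thread if it wants to wait for the space to be reclaimed.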