refactor: perform bulk deletes during metadata cleanup #1763
Conversation
force-pushed from 07ff2e1 to 217e6bd
@cmackenzie1 I looked this over and thought "coo coo", but I didn't like the intermediate iterators, so I put my thinking cap on and wondered if we could just feed streams into streams (sup dawg). I haven't done any testing other than running the existing test suite, but I think this works. If you don't like it, feel free to revert my commit and we can discuss more 😄
force-pushed from 7d1140d to 1847b90
Looks good! Thanks!
rust/src/protocol/checkpoints.rs (Outdated)

```rust
.list(Some(storage.log_path()))
.await?
// Pass along only the Ok results from storage.list
.filter(|res| futures::future::ready(res.is_ok()))
```
This was the only reason I didn't take this approach first. If we are ok with not handling the error here that's fine with me!
Given the nature of the function, I don't think we would want to halt execution on a single error, would we?
Assuming that's correct, I will modify the code to log errors but continue processing.
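The log-and-continue behaviour described here can be sketched with a std-only stand-in (a plain iterator in place of the async stream; the function name and error type are illustrative, not the delta-rs API):

```rust
/// Illustrative stand-in for filtering the listing stream: keep the Ok
/// paths, log each Err, and keep processing instead of halting on the
/// first error.
fn ok_paths(results: Vec<Result<String, String>>) -> Vec<String> {
    results
        .into_iter()
        .filter_map(|res| match res {
            Ok(path) => Some(path),
            Err(e) => {
                // The real code would go through the project's logging
                // facade; eprintln! keeps this sketch dependency-free.
                eprintln!("skipping listing error: {e}");
                None
            }
        })
        .collect()
}
```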
In addition to doing bulk deletes, I removed what seems like (at least to me) unnecessary code. At its core, files are considered up for deletion when their last_modified time is older than the cutoff time AND the version is less than the specified version (usually the latest version).
…aning up expired logs: This change builds on @cmackenzie1's work and feeds the list stream directly into the delete_stream with a predicate function to identify paths for deletion.
force-pushed from 63f3f4c to 22afcc7
Since I contributed here, I would like @wjones127 or @roeap to tap this one into main after a review.
Looks good 👍
Description
In addition to doing bulk deletes, I removed what seems like (at least to me) unnecessary code. At its core, files are considered up for deletion when their last_modified time is older than the cutoff time AND the version is less than the specified version (usually the latest version).
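The deletion rule above can be written as a single predicate. This is a sketch only; the parameter names and millisecond timestamps are assumptions for illustration, not the actual signature in checkpoints.rs:

```rust
/// A log file is eligible for deletion only when BOTH conditions hold:
/// its last_modified time is older than the cutoff, AND its version is
/// below the retained version (usually the latest version).
fn is_deletable(last_modified_ms: i64, version: i64, cutoff_ms: i64, keep_version: i64) -> bool {
    last_modified_ms < cutoff_ms && version < keep_version
}
```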
Related Issue(s)
Documentation