-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[repo depot 2/n] sled agent APIs to manage update artifact storage #6764
Merged
Merged
Changes from 4 commits
Commits
Show all changes
33 commits
Select commit
Hold shift + click to select a range
a5052d0
sled agent APIs to manage update artifact storage
iliana ce1bc42
fn datasets -> fn dataset_mountpoints
iliana 5ad16a1
be more resilient in the face of io errors
iliana 485ee40
clean up temporary files on startup
iliana 26f4107
naming consistency
iliana 893980e
log.cleanup_successful();
iliana 03a51c7
Merge remote-tracking branch 'origin/main' into iliana/tuf-repo-depot
iliana efcfb92
document ArtifactStore
iliana e8b2673
fn put -> put_impl
iliana 3b15f3b
copy_from_depot should take a URL
iliana c909649
reduce semantic satiation
iliana 4325077
remove default type parameter
iliana 86b8047
StorageBackend -> DatasetsManager; attempt clean up
iliana 599089a
create reqwest client at startup, not on first use
iliana f624e5d
don't embed source error strings
iliana b44d06b
fewer contextless errors
iliana 013e67f
another docstring
iliana 5eefb6e
add the repo depot API to api-manifest.toml
iliana aebfe32
add list artifacts operation
iliana d743727
Merge remote-tracking branch 'origin/main' into iliana/tuf-repo-depot
iliana 2ae3743
create an update artifact dataset on both M.2s
iliana 45c4cac
PUT artifacts to all artifact datasets
iliana 3c2f866
change list API to return a count of each artifact
iliana 743a67b
make copy_from_depot create a task and return
iliana 58b9cbe
ls-apis expectorate
iliana 6906679
Merge remote-tracking branch 'origin/main' into iliana/tuf-repo-depot
iliana cd7dc7b
review comments
iliana e967f02
propagate non-fatal write errors to `finalize()`
iliana 508861d
expectoraaaaate
iliana 9f96f71
Merge remote-tracking branch 'origin/main' into iliana/tuf-repo-depot
iliana 32afac5
improved API responses for PUT/POST
iliana 4d00c16
document ArtifactPutResponse fields
iliana 9b3691a
Merge remote-tracking branch 'origin/main' into iliana/tuf-repo-depot
iliana File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, to clarify end-to-end behavior for the
PUT
operation from Nexus:From Nexus's perspective, it seems like we can't really distinguish between these cases through the PUT API. Do you think this matters?
My concerns is mostly "do we keep operating successfully, even in a scenario where we have reduced M.2 capacity".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW one possible solution here would be to propagate a result back through the API, that indicates:
Then this decision is punted up to Nexus, and Nexus could decide "at least one write count as success, but I'll log the error and keep moving".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nexus could call the
list
API to see if there was a partial success, I suppose; or maybe it's worth instead returning something like{"datasets": 2, "successful_writes": 1}
as a non-4xx error?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the actual answer to this question depends on the exact design of how Nexus is going to replicate artifacts across sleds. If Nexus is able to try another sled, maybe this current design is fine. But if it's a saga where the sleds are picked out in advance and there's no retry flow (this seems like a poor design) then it would be better to return OK here; at least there's a copy on this one sled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a hypothetical world where we have all sleds operating with one M.2 - shouldn't this be able to succeed? We are operating at reduced redundancy, but we do have a copy that got successfully written.
(Agreed that we could make all
PUT
calls query the list API afterwards? But if that's our inclination, this also seems like it should be part of the error result)basically, I think it's critical for Nexus to be able to distinguish between the cases of:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll implement the
{"datasets": 2, "successful_writes": 1}
200 OK response so that Nexus can make a decision with that information. I don't think it matters to return the error since sled agent is logging all I/O errors it runs into.