-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[reconfigurator] SledEditor: be more strict about decommissioned sleds #7234
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a net improvement but I share your ambivalence.
I like making set_sled_state()
more specific to decommissioning. If we don't support other transitions today, then it seems like if/when we do, we'll want to think through what other updates need to be made. That might be easy to forget if somebody can just call set_sled_state(SomeOtherState)
.
nexus/reconfigurator/planning/src/blueprint_editor/sled_editor.rs
Outdated
Show resolved
Hide resolved
// We can't take ownership of `editor` from the `&mut self` | ||
// reference we have, and we need ownership to call | ||
// `finalize()`. Steal the contents via `mem::swap()` with an | ||
// empty editor. This isn't panic safe (i.e., if we panic | ||
// between the `mem::swap()` and the reassignment to `self.0` | ||
// below, we'll be left in the active state with an empty sled | ||
// editor), but omicron in general is not panic safe and aborts | ||
// on panic. Plus `finalize()` should never panic. | ||
let mut stolen = ActiveSledEditor::new_empty( | ||
DatasetIdsBackfillFromDb::empty(), | ||
); | ||
mem::swap(editor, &mut stolen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of all this, what about accepting an owned SledEditor
here and returning an owned one back? I think this means the caller would have to remove this sled editor from sled_editors
and insert the one it got back. But that seems okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I think this makes failures super awkward? If we take self
and return Result<SledEditor, SledEditError>
, then on failure our caller doesn't get an editor back. I guess we could return either (SledEditor, Result<(), SledEditError>)
or Result<SledEditor, (SledEditor, SledEditError)>
but either of those seems worse to me than the slightly-messy internal details here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough!
) -> impl Iterator<Item = &BlueprintPhysicalDiskConfig> { | ||
match &self.0 { | ||
InnerSledEditor::Active(editor) => { | ||
Either::Left(editor.disks(filter)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen Either
before. Is this just a way to have a function return either of two different iterators that otherwise would have different types? I think I usually use Box<dyn Iterator<...>>
instead. This is neat.
(I think I'm ambivalent between both approaches but it's cool to know this exists.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! This is from itertools. This one doesn't require boxing but does mean there's a match on Left/Right on every call to Iterator::next()
. I doubt either choice makes any meaningful difference in this case.
// TODO-john The disks and datasets checks below don't pass what the | ||
// planner does currently to decommission sleds: if a sled is expunged, | ||
// we'll omit its disks and datasets from the outgoing blueprint | ||
// entirely without setting them all to the `Expunged` disposition. | ||
// Fixing this will conflict with ongoing disk work, so for now these | ||
// checks are commented out. | ||
/* | ||
// Check that all disks are expunged... | ||
if let Some(disk) = | ||
self.disks(DiskFilter::All).find(|disk| match disk.disposition { | ||
BlueprintPhysicalDiskDisposition::InService => true, | ||
BlueprintPhysicalDiskDisposition::Expunged => false, | ||
}) | ||
{ | ||
return Err(SledEditError::NonDecommissionableDiskInService { | ||
disk_id: disk.id, | ||
zpool_id: disk.pool_id, | ||
}); | ||
} | ||
|
||
// ... and all datasets are expunged ... | ||
if let Some(dataset) = | ||
self.datasets(BlueprintDatasetFilter::All).find(|dataset| { | ||
match dataset.disposition { | ||
BlueprintDatasetDisposition::InService => true, | ||
BlueprintDatasetDisposition::Expunged => false, | ||
} | ||
}) | ||
{ | ||
return Err(SledEditError::NonDecommissionableDatasetInService { | ||
dataset_id: dataset.id, | ||
kind: dataset.kind.clone(), | ||
}); | ||
} | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You knew I was going to suggest putting this into an issue instead. 😄 It could be in a comment for an existing issue that covers the work that would introduce these checks or a standalone issue we track under Reconfigurator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of my comments is a blocker.
This is a followup from #7204 (comment) and makes two changes, neither of which should affect behavior:
SledEditor
will now fail if a caller attempts to make changes to a decommissioned sled (internally, this is statically enforced by a type state enum - a sled in the decommissioned state does not have any methods that support editing, so we're forced to return an error)SledEditor::set_state()
is nowSledEditor::decommission()
, and it performs some checks that the sled looks decommissionableThe second bullet is more questionable than I expected it to be:
SledEditor
shouldn't do any checks here; in particular, it doesn't have the full context (e.g., any checks on "should we decommission this sled" that depend on thePlanningInput
can't be performed here, becauseSledEditor
intentionally doesn't have access toPlanningInput
).I don't feel super strongly about the checks in
decommission()
or even this PR as a whole; if this doesn't look like a useful direction, I'd be fine with discarding it. Please review with a pretty critical eye.