Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define a saga for instance start #3873

Merged
merged 2 commits into from
Aug 13, 2023
Merged

Conversation

gjcolombo
Copy link
Contributor

Create a saga that starts instances. This has the following immediate benefits:

  • It's no longer possible to leak an instance registration during start; previously this could happen if Nexus crashed while handling a start call.
  • The saga synchronizes properly with concurrent attempts to delete an instance; the existing start routine may not be handling this correctly (it can look up an instance and decide it's OK to start, then start talking to sled agent about it while a deletion saga runs concurrently and deletes the instance).
  • The saga establishes networking state (Dendrite NAT entries, OPTE V2P mappings) for a newly started instance if it wasn't previously established. This is a stopgap measure to ensure that this state exists when restarting an instance after a cluster is restarted. It should eventually be replaced by a step that triggers the appropriate networking RPW(s).

This saga can also be used, at least in theory, as a subsaga of the instance create saga to replace that saga's logic for starting a newly-created instance. This work isn't done in this PR, though. (The change isn't trivial because the new start saga expects a prior instance record as a parameter, and the create saga can't construct a priori the instance record it intends to insert into CRDB.)

Tested via assorted new cargo tests and by launching a dev cluster with the changes, stopping an instance, restarting it, and verifying that the instance restarted correctly and that Nexus logs contained the expected log lines.

Fixes #2824. Fixes #3813.

Create a saga that starts instances. This has some immediate benefits:

- It's no longer possible to leak an instance registration during start;
  previously this could happen if Nexus crashed while handling a start call.
- The saga synchronizes properly with concurrent attempts to delete an instance;
  the existing start routine may not be handling this correctly (it can look up
  an instance and decide it's OK to start, then start talking to sled agent
  about it while a deletion saga runs concurrently and deletes the instance).
- The saga establishes networking state (Dendrite NAT entries, OPTE V2P
  mappings) for a newly started instance if it wasn't previously established.
  This is a stopgap measure to ensure that this state exists when restarting
  an instance after a cluster is restarted. It should eventually be replaced by
  a step that triggers the appropriate networking RPW(s).

This saga can be used, at least in theory, as a subsaga of the instance create
saga to replace that saga's logic for starting a newly-created instance. This
work isn't done in this PR, though. (The change isn't trivial because the new
start saga expects a prior instance record as a parameter, and the create saga
can't construct *a priori* the instance record it intends to insert into CRDB.)
@gjcolombo gjcolombo requested a review from smklein August 11, 2023 00:19
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for these tests!

Comment on lines +490 to +502
match db_instance.runtime_state.state.0 {
InstanceState::Starting | InstanceState::Running => {
return Ok(db_instance)
}
InstanceState::Stopped => {}
_ => {
return Err(Error::conflict(&format!(
"instance is in state {} but must be {} to be started",
db_instance.runtime_state.state.0,
InstanceState::Stopped
)))
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any concern of TOCTTOU here? Kinda seems like this state checking could/should be part of the saga?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be covered here by the record's state generation: if the state changes, the saga will fail to transition from Stopped to Starting (because its generation number will be outdated) and will bail. This still isn't a perfect scheme, since it's possible for the saga to hit the "oops the state changed" condition even if the state has come to rest at Stopped. To completely fix all this I think we need to fix another bug first (bet you can guess which one...).

I'll add a comment here about the possible TOCTTOU and how we avoid it. There's another comment in the saga explaining the other remaining problems with the synchronization scheme in this PR.

InstanceState::Starting | InstanceState::Running => {
return Ok(db_instance)
}
InstanceState::Stopped => {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The saga itself seems fine accepting instances in the Creating state -- is that not acceptable here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to be careful about that here. There's a small window in the create saga where the instance record is in CRDB (and has the Creating state) while the objects that support it (external IPs, attached disks, etc.) are still being set up, and we don't want to allow a start request to go through while we're in that state.

The saga accepts Creating as a prior state so that it can be used as a subsaga from within the instance create saga (even though that's not hooked up yet). I could buy, though, that a better approach is to have the create saga fully create the instance, then unconditionally move it to Stopped, and then have it invoke the start saga if needed. That would make everything very consistent at the cost of having a Creating instance (where the create request has start: true) briefly go to Stopped before entering Starting. That seems to me like a smallish price to pay (and arguably it's strictly better because it's more accurate). I'll mull this over, but WDYT of this approach?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that approach. I was considering other options, but they all kinda seems like they incur an equivalent CRDB write to make the instance transition from "being constructed, not visible" to "constructed enough that we can correctly start it", which is basically what you're also proposing with the intermediate stopped state.

Copy link
Contributor Author

@gjcolombo gjcolombo Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, having slept on it, I'm comfortable saying the value of having a uniform start path outweighs the benefit of going directly from Creating to Starting without very briefly passing through Stopped first. Filed #3883 to track the remaining work here.

nexus/src/app/sagas/instance_start.rs Show resolved Hide resolved
nexus/src/app/sagas/instance_start.rs Show resolved Hide resolved
@gjcolombo gjcolombo enabled auto-merge (squash) August 13, 2023 03:11
@gjcolombo
Copy link
Contributor Author

Thanks as always for the review! I've added some comments in d59b4cf to cover the topics discussed above.

I'll address the remaining open question about whether the instance create saga should move instances to Stopped before starting them in whatever follow-on PR connects the create saga to the start saga.

@gjcolombo gjcolombo merged commit b50907c into main Aug 13, 2023
@gjcolombo gjcolombo deleted the gjcolombo/instance-start-saga branch August 13, 2023 04:30
gjcolombo added a commit that referenced this pull request Oct 13, 2023
Finding boundary switches in the instance start saga requires fleet
query access. Use the Nexus instance allocation context to get it
instead of the saga initiator's operation context. (This was introduced
in #3873; it wasn't previously a problem because the instance create
saga set up instance NAT state, and that saga received a boundary switch
list as a parameter, which parameter was generated by using the instance
allocation context. #4194 made this worse by making instance create use
the start saga to start instances instead of using its own saga nodes.)

Update the instance-in-silo integration test to make sure that instances
created by a silo collaborator actually start. This is unfortunately not
very elegant. The `instance_simulate` function family normally uses
`OpContext::for_tests` to get an operation context, but that context is
associated with a user that isn't in the test silo. To get around this,
add some simulation interfaces that take an explicit `OpContext` and
then generate one corresponding to the silo user for the test case in
question. It seems like it'd be nicer to give helper routines like
`instance_simulate` access to a context that is omnipotent across all
silos, but I wasn't sure how best to do this. I'm definitely open to
suggestions here.

Tested via cargo tests.

Fixes #4272.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Instance start does not independently ensure v2p/dpd state exists Instance start needs to be a saga
2 participants