Define a saga for instance start #3873

Merged (2 commits), Aug 13, 2023
nexus/src/app/instance.rs: 40 additions, 27 deletions
```diff
@@ -475,41 +475,54 @@ impl super::Nexus {
         self.db_datastore.instance_refetch(opctx, &authz_instance).await
     }
 
-    /// Make sure the given Instance is running.
+    /// Attempts to start an instance if it is currently stopped.
     pub async fn instance_start(
-        &self,
+        self: &Arc<Self>,
         opctx: &OpContext,
         instance_lookup: &lookup::Instance<'_>,
     ) -> UpdateResult<db::model::Instance> {
-        // TODO(#2824): This needs to be a saga for crash resiliency
-        // purposes (otherwise the instance can be leaked if Nexus crashes
-        // between registration and instance start).
-        let (.., authz_instance, mut db_instance) =
-            instance_lookup.fetch().await?;
-
-        // The instance is not really being "created" (it already exists from
-        // the caller's perspective), but if it does not exist on its sled, the
-        // target sled agent will populate its instance manager with the
-        // contents of this modified record, and that record needs to allow a
-        // transition to the Starting state.
-        //
-        // If the instance does exist on this sled, this initial runtime state
-        // is ignored.
-        let initial_runtime = nexus_db_model::InstanceRuntimeState {
-            state: nexus_db_model::InstanceState(InstanceState::Creating),
-            ..db_instance.runtime_state
+        let (.., authz_instance, db_instance) =
+            instance_lookup.fetch_for(authz::Action::Modify).await?;
+
+        // If the instance is already starting or running, succeed immediately
+        // for idempotency. If the instance is stopped, try to start it. In all
+        // other cases return an error describing the state conflict.
+        //
+        // The "Creating" state is not permitted here (even though a request to
+        // create can include a request to start the instance) because an
+        // instance that is still being created may not be ready to start yet
+        // (e.g. its disks may not yet be attached).
+        //
+        // If the instance is stopped, the start saga will try to change the
+        // instance's state to Starting and increment the instance's state
+        // generation number. If this increment fails (because someone else has
+        // changed the state), the saga fails. See the saga comments for more
+        // details on how this synchronization works.
+        match db_instance.runtime_state.state.0 {
+            InstanceState::Starting | InstanceState::Running => {
+                return Ok(db_instance)
+            }
+            InstanceState::Stopped => {}
```
Collaborator: The saga itself seems fine accepting instances in the Creating state -- is that not acceptable here?

Contributor Author: I think we have to be careful about that here. There's a small window in the create saga where the instance record is in CRDB (and has the Creating state) while the objects that support it (external IPs, attached disks, etc.) are still being set up, and we don't want to allow a start request to go through while we're in that state.

The saga accepts Creating as a prior state so that it can be used as a subsaga from within the instance create saga (even though that's not hooked up yet). I could buy, though, that a better approach is to have the create saga fully create the instance, then unconditionally move it to Stopped, and then have it invoke the start saga if needed. That would make everything very consistent at the cost of having a Creating instance (where the create request has start: true) briefly go to Stopped before entering Starting. That seems to me like a smallish price to pay (and arguably it's strictly better because it's more accurate). I'll mull this over, but WDYT of this approach?

Collaborator: I like that approach. I was considering other options, but they all kinda seem like they incur an equivalent CRDB write to make the instance transition from "being constructed, not visible" to "constructed enough that we can correctly start it", which is basically what you're also proposing with the intermediate stopped state.

Contributor Author (gjcolombo), Aug 14, 2023:

Yeah, having slept on it, I'm comfortable saying the value of having a uniform start path outweighs the benefit of going directly from Creating to Starting without very briefly passing through Stopped first. Filed #3883 to track the remaining work here.

```diff
+            _ => {
+                return Err(Error::conflict(&format!(
+                    "instance is in state {} but must be {} to be started",
+                    db_instance.runtime_state.state.0,
+                    InstanceState::Stopped
+                )))
+            }
+        }
```
Comment on lines +501 to +513
Collaborator: Any concern about TOCTTOU here? Kinda seems like this state checking could/should be part of the saga?

Contributor Author: We should be covered here by the record's state generation: if the state changes, the saga will fail to transition from Stopped to Starting (because its generation number will be outdated) and will bail. This still isn't a perfect scheme, since it's possible for the saga to hit the "oops the state changed" condition even if the state has come to rest at Stopped. To completely fix all this I think we need to fix another bug first (bet you can guess which one...).

I'll add a comment here about the possible TOCTTOU and how we avoid it. There's another comment in the saga explaining the other remaining problems with the synchronization scheme in this PR.


```diff
-        db_instance.runtime_state = initial_runtime;
-        self.instance_ensure_registered(opctx, &authz_instance, &db_instance)
-            .await?;
-
-        self.instance_request_state(
-            opctx,
-            &authz_instance,
-            &db_instance,
-            InstanceStateRequested::Running,
+        let saga_params = sagas::instance_start::Params {
+            serialized_authn: authn::saga::Serialized::for_opctx(opctx),
+            instance: db_instance,
+            ensure_network: true,
+        };
+
+        self.execute_saga::<sagas::instance_start::SagaInstanceStart>(
+            saga_params,
         )
         .await?;
 
         self.db_datastore.instance_refetch(opctx, &authz_instance).await
     }
```

```diff
@@ -1220,7 +1233,7 @@ impl super::Nexus {
     }
 
     // Switches with uplinks configured and boundary services enabled
-    async fn boundary_switches(
+    pub async fn boundary_switches(
         &self,
         opctx: &OpContext,
     ) -> Result<HashSet<SwitchLocation>, Error> {
```