Commit
Define a saga for instance start (#3873)
Create a saga that starts instances. This has the following immediate benefits:

- It's no longer possible to leak an instance registration during start; previously this could happen if Nexus crashed while handling a start call.
- The saga synchronizes properly with concurrent attempts to delete an instance; the existing start routine may not be handling this correctly (it can look up an instance and decide it's OK to start, then start talking to sled agent about it while a deletion saga runs concurrently and deletes the instance).
- The saga establishes networking state (Dendrite NAT entries, OPTE V2P mappings) for a newly started instance if it wasn't previously established. This is a stopgap measure to ensure that this state exists when restarting an instance after a cluster is restarted. It should eventually be replaced by a step that triggers the appropriate networking RPW(s).

This saga can also be used, at least in theory, as a subsaga of the instance create saga to replace that saga's logic for starting a newly-created instance. This work isn't done in this PR, though. (The change isn't trivial because the new start saga expects a prior instance record as a parameter, and the create saga can't construct *a priori* the instance record it intends to insert into CRDB.)

Tested via assorted new cargo tests and by launching a dev cluster with the changes, stopping an instance, restarting it, and verifying that the instance restarted correctly and that Nexus logs contained the expected log lines.

Fixes #2824. Fixes #3813.
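
For readers unfamiliar with the pattern, the "no leaked registration" property can be illustrated with a toy model. This is a minimal sketch, not the code in this PR: `World`, `Step`, `run_saga`, `start_steps`, and the individual step functions are all hypothetical names, and the step list is an assumption about the general shape of a start saga. It shows only the core idea that each forward action is paired with an undo action, and that a failure unwinds the completed steps in reverse order. A real saga executor also persists a log so that unwinding can resume after a crash; that part is not modeled here.

```rust
// Toy saga model. Every name below is hypothetical and exists only for
// illustration; none of it is taken from Nexus.

/// Stand-in for the external state a start operation touches.
#[derive(Debug, Default)]
struct World {
    sled_agent_up: bool,   // test knob: whether the final step can succeed
    sled_registered: bool, // instance registered with a sled agent
    networking_ready: bool, // NAT entries / V2P mappings established
    running: bool,
}

/// One saga step: a forward action plus the undo that reverses it.
struct Step {
    name: &'static str,
    action: fn(&mut World) -> Result<(), String>,
    undo: fn(&mut World),
}

fn register(w: &mut World) -> Result<(), String> {
    w.sled_registered = true;
    Ok(())
}
fn unregister(w: &mut World) {
    w.sled_registered = false;
}

fn ensure_networking(w: &mut World) -> Result<(), String> {
    w.networking_ready = true;
    Ok(())
}
fn clear_networking(w: &mut World) {
    w.networking_ready = false;
}

fn ensure_running(w: &mut World) -> Result<(), String> {
    if !w.sled_agent_up {
        return Err("sled agent unreachable".to_string());
    }
    w.running = true;
    Ok(())
}
fn stop(w: &mut World) {
    w.running = false;
}

/// Assumed step order for the sketch.
fn start_steps() -> Vec<Step> {
    vec![
        Step { name: "register", action: register, undo: unregister },
        Step { name: "networking", action: ensure_networking, undo: clear_networking },
        Step { name: "ensure_running", action: ensure_running, undo: stop },
    ]
}

/// Run steps in order; on failure, undo the completed steps in reverse.
fn run_saga(steps: &[Step], world: &mut World) -> Result<(), String> {
    for (i, step) in steps.iter().enumerate() {
        if let Err(e) = (step.action)(world) {
            for done in steps[..i].iter().rev() {
                (done.undo)(world);
            }
            return Err(format!("step '{}' failed: {}", step.name, e));
        }
    }
    Ok(())
}
```

In this sketch, running `start_steps()` against a `World` whose `sled_agent_up` is `false` returns an error and leaves `sled_registered` cleared, which is the toy analogue of not leaking a registration when start fails partway through.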