-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[nexus] Add tests for instance_create unwind safety #1843
Conversation
common/Cargo.toml
Outdated
@@ -25,7 +25,7 @@ serde_json = "1.0" | |||
serde_with = "2.0.1" | |||
slog = { version = "2.5", features = [ "max_level_trace", "release_max_level_debug" ] } | |||
smf = "0.2" | |||
steno = "0.2" | |||
steno = { git = "https://github.com/oxidecomputer/steno", branch = "node-access" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, I'd use a new rev of steno instead before merging. Same elsewhere in this PR
let params = new_test_params(&opctx, project_id); | ||
let dag = create_saga_dag::<SagaInstanceCreate>(params).unwrap(); | ||
|
||
for i in 0..dag.get_node_count() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the "more automated" version of what I tried doing in #1835 - I can definitely consolidate the testing here, and create a "saga unwind tester", if that would be useful.
I basically see three phases:
- Setup the saga. Create any orgs / projects / disks / etc needed to run the saga.
- Iterate over every single node in the DAG, injecting failures at each one. Log which ones we're working on, to make it clear what went wrong if an error occurs. Run the saga.
- Assert that no detritus remains, and that the system is in the same state before / after saga execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this looks great. I look forward to this going back so I can do the same things
with the volume sagas.
Not actually. There is definitely still a dependency, but the "test setup" part can be re-used with Nexus tests without causing any real circular dependency issues. See #1835 (comment) for an example of one such issue. Frankly the rest of nexus-test-utils probably *should* continue to be refactored such that there is no real dependency on Nexus; doing so will reduce the risk of "mis-matched versions of Nexus" classes of errors in the future. This required the following changes: - Move `nexus`'s config information into `common/src/nexus_config.rs`, so that it can be used by `nexus-test-utils`. - Make `nexus-test-utils` operate on a generic version of `NexusServer`, exposed by a new crate named `nexus-test-interface`. This crate can contain any information used to operate Nexus during shared test setup, without actually requiring a full dependency on `nexus`. - Note, adding this generic required making `ControlPlaneTestContext` generic, which ended up bubbling up to a lot of test setup functions, including the one used in the `#[nexus_test]` macro... - ... So, make it possible to customize which version of `NexusServer` is used by `#[nexus_test]`. This means callers can supply their *own* version of `NexusServer`, either from their own crate (see: Nexus unit tests) or the version from the `omicron_nexus` crate (see: Nexus integration tests).
Builds on #1835 and oxidecomputer/steno#67
Adds a test for the "instance create" saga's unwind safety by: