Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge integration tests together #497

Merged
merged 3 commits into from
Dec 9, 2021
Merged

Conversation

davepacheco
Copy link
Collaborator

This change moves all 19 of the existing integration tests into individual modules within a new "integration_test" module and includes that module from a single top-level integration test. This has two main benefits:

  1. The build process is faster because we only build one integration test crate instead of 19. Prior to [nexus] Refactor test-utilities to helper crate, add test benchmarks #492, these share a fair bit of code that gets built 19 times. Even with [nexus] Refactor test-utilities to helper crate, add test benchmarks #492, building each of these takes some time.
  2. The test process is faster because Cargo runs individual integration tests sequentially, but each test is run with many threads for concurrency. So before, all 19 tests were sequential. Now, they're parallelized. This ought to be a huge win for tests (or parts of tests) that are I/O bound. At least 13 of the tests today set up a local CockroachDB cluster, which takes several seconds prior to Optimize CockroachDB setup for tests (96% reduction) #493. Even with that fix, there's still potentially a lot of room for parallelism here.

Here are some numbers from my test machine ivanova, which is an 8-core (16-thread) Matisse with 64 GiB of memory running helios-1.0.20742. I measured two things:

  • "full test suite": Time for a full cargo test after first cargo clean, then cargo test. (That is, I threw the first cargo test away.) This is measuring the test suite time, ignoring build time.
  • "nexus tests only": Time for a cargo test -p omicron-nexus. This is after a throwaway run, since the first one of these will rebuild a bunch of deps due to different feature unification behavior when building an individual package.

For "main", I used commit 0e9e790, which was the tip when I started.

The tip of "main" when I started was commit 0e9e790. I measured my changes on commit 46203b0.

Full test suite on "main":

$ time cargo test; echo $?
real    2m40.627s
user    2m31.468s
sys     0m38.541s
0

Full test suite with my change:

$ time cargo test; echo $?
real    1m14.012s
user    2m30.997s
sys     0m42.062s
0

Nexus-only test suite on "main":

$ time cargo test -p omicron-nexus; echo $?
real    2m4.570s
user    2m3.419s
sys     0m27.980s
0

Nexus-only test suite with my changes:

$ time cargo test -p omicron-nexus
real    0m46.063s
user    2m1.738s
sys     0m33.486s
$ echo $?
0

This is more than a 50% reduction in the test suite time. The observed benefit will depend on available cores. The user and system CPU time hasn't changed much, which I believe reflects that a lot of the serialization was on I/O.

I don't think there's much downside to this. The tests are still in separate files so we don't lose the separation in code structure.

@davepacheco davepacheco changed the title Merge integration tests Merge integration tests together Dec 9, 2021
Copy link
Contributor

@david-crespo david-crespo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was way simpler than what I pictured. If you create a new file and forget to add it to the list, how do you find out? Will it fail to compile and run because super isn't anything?

@smklein
Copy link
Collaborator

smklein commented Dec 9, 2021

That was way simpler than what I pictured. If you create a new file and forget to add it to the list, how do you find out? Will it fail to compile and run because super isn't anything?

I believe cargo actually emits an error for .rs files not mod-ed into a crate from anywhere

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, love those speed up numbers

@davepacheco
Copy link
Collaborator Author

Thanks for the quick review! I'm going to go ahead and land this. It will conflict a bit with #492 -- I'm happy to help with that merge.

@davepacheco davepacheco merged commit a628762 into main Dec 9, 2021
@davepacheco davepacheco deleted the merge-integration-tests branch December 9, 2021 20:49
@rcgoodfellow
Copy link
Contributor

I think we should revisit this. Today, an incremental compile of the integration test suite takes ~10 minutes. Because everything is now in a single translation unit, any update to tests or code under test results in 10 minutes in the penalty box. Even for changes to integration tests that just use Nexus via the API with no actual changes to Nexus itself, and using -p omicron-nexus, results in this 10-minute loss of time.

@davepacheco
Copy link
Collaborator Author

Hmm, yeah, those times suck.

If you want to be doing cargo nextest run -p omicron-nexus --test=some_specific_integration_test, then I can see how splitting up the integration tests could improve the lap time when you've modified just one integration test or some code in Nexus. I'd be curious how much of a speedup you'd get because you'd still have to rebuild all of Nexus (in the second case, anyway) and link it against one integration test binary.

I think if we separated them, these would get worse (potentially by quite a lot but I don't know):

  • incremental builds of the whole Nexus test suite (because you'd have to build a few dozen crates instead of just one)
  • full builds (same reason)
  • any run of the whole Nexus test suite (because cargo runs these tests sequentially, whereas now we should be getting a good deal of parallelism)

I think this change would be pretty easy to reverse and measure the impact?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants