ci: attempt to shard huge test targets more #98834
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
@@ -3376,45 +3376,59 @@ test_suite(
)

test_suite(
name = "small_tests",
name = "ccl_tests",
Why not maintain the small, medium, large, enormous categories for CCL tests?
Good question. We don't use them, though we can easily add them if we need to.
@@ -12,8 +12,11 @@ go_test(
"//pkg/sql/logictest:testdata", # keep
"//pkg/sql/opt/exec/execbuilder:testdata", # keep
],
shard_count = 16,
tags = ["cpu:2"],
shard_count = 48,
A few questions:
- How does shard count interact with `cpu:2` (or `cpu:1` in other cases)?
- Why change some to 48 but not others?
> Why change some to 48 but not others?
We upload a Bazel trace profile to artifacts on each unit test run. I looked at some runs, found targets that had large shards or were bottlenecking the process, and sharded them more: mainly logictests, backupccl, schemachangerccl, and kvserver. If a logictests package has fewer than 48 tests, it gets n shards, where n is the number of tests.
> How does shard count interact with `cpu:2` (or `cpu:1` in other cases)?
Good question. Looking.
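The shard-count rule described in the reply above (at most 48 shards, and never more shards than tests) could be sketched roughly as follows. This is an illustration only, not code from the PR: `shardCount` and `maxShards` are hypothetical names, and returning 1 for an empty package is an assumption made so the function always yields a valid count.

```go
package main

import "fmt"

// maxShards is the cap mentioned in the discussion (48).
// The constant name is hypothetical.
const maxShards = 48

// shardCount returns one shard per test, capped at maxShards.
// Assumption: a package with no tests still gets a single shard.
func shardCount(numTests int) int {
	if numTests < 1 {
		return 1
	}
	if numTests < maxShards {
		return numTests
	}
	return maxShards
}

func main() {
	for _, n := range []int{5, 48, 200} {
		fmt.Printf("%d tests -> %d shards\n", n, shardCount(n))
	}
}
```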
> How does shard count interact with `cpu:2` (or `cpu:1` in other cases)?
Each shard gets n cores if `cpu:n` is used. I tried adjusting it but didn't notice any difference, so I preferred to leave it unchanged.
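To make the interaction concrete: if each shard reserves n cores, the number of shards that can run at the same time on one machine is bounded by the machine's core count divided by n. The Go sketch below assumes the scheduler does nothing more than that integer division and always lets at least one shard run; the function name and the 16-core machine are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// concurrentShards estimates how many shards can run at once when each
// shard reserves cpusPerShard cores out of totalCPUs.
// Assumptions: plain integer division, no other resource limits, and at
// least one shard is always allowed to run.
func concurrentShards(totalCPUs, cpusPerShard, shardCount int) int {
	if cpusPerShard < 1 {
		cpusPerShard = 1
	}
	slots := totalCPUs / cpusPerShard
	if slots < 1 {
		slots = 1
	}
	if slots > shardCount {
		return shardCount
	}
	return slots
}

func main() {
	const totalCPUs = 16 // hypothetical machine size
	for _, shards := range []int{16, 48} {
		fmt.Printf("shard_count=%d, cpu:2 -> up to %d shards at once\n",
			shards, concurrentShards(totalCPUs, 2, shards))
	}
}
```

Under these assumptions, raising the shard count does not raise the number of shards running concurrently once the cores are saturated; it only makes each shard smaller.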
Looks fine to me.
pkg/cmd/generate-bazel-extra/main.go
Outdated
tags = [
"-broken_in_bazel",
"-flaky",
"-integration",
"%[1]s",
"-ccl_test"
Nit: Keep the list alphabetized? (Put `-ccl_test` after `-broken_in_bazel`.)
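For context on the `"%[1]s"` entry in that list: in Go's `fmt` package, `%[1]s` is an explicit argument index, so the same argument can be substituted at several places in one format string. The sketch below shows a simplified template with `-ccl_test` placed as the nit suggests. The template text, the `name = "%[1]s_tests"` line, and the helper names are assumptions for illustration; the actual template in `main.go` is not shown in full in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// suiteTemplate is a hypothetical, simplified stand-in for the generator's
// template. %[1]s is replaced by the first argument everywhere it appears.
const suiteTemplate = `test_suite(
    name = "%[1]s_tests",
    tags = [
        "-broken_in_bazel",
        "-ccl_test",
        "-flaky",
        "-integration",
        "%[1]s",
    ],
)
`

// renderSuite fills the template for one size category (e.g. "small").
func renderSuite(size string) string {
	return fmt.Sprintf(suiteTemplate, size)
}

// containsTag reports whether the rendered text lists the given tag
// as a quoted entry followed by a comma.
func containsTag(rendered, tag string) bool {
	return strings.Contains(rendered, fmt.Sprintf("%q,", tag))
}

func main() {
	out := renderSuite("small")
	fmt.Print(out)
	fmt.Println("excludes ccl tests:", containsTag(out, "-ccl_test"))
}
```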
LGTM for backupccl
This code change attempts to shard some test targets that could benefit from more sharding. It also splits unit tests in TeamCity into two runs: ccl unit tests and non-ccl unit tests. The current build config will be used to run non-ccl unit tests. The new build config will be used for ccl tests (those under `pkg/ccl`). This cuts unit test wall time by half while keeping machine time almost the same.

Release note: None
Epic: none
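The ccl / non-ccl split described above is based on whether a package lives under `pkg/ccl`. A minimal Go sketch of that classification follows, assuming package paths are plain strings with an optional leading `//`; `isCCL` and `partition` are hypothetical names, and the PR itself implements the split through Bazel tags and build configs rather than code like this.

```go
package main

import (
	"fmt"
	"strings"
)

// isCCL reports whether a package path is pkg/ccl or lives beneath it.
// A leading "//" (Bazel label style) is ignored.
func isCCL(pkgPath string) bool {
	p := strings.TrimPrefix(pkgPath, "//")
	return p == "pkg/ccl" || strings.HasPrefix(p, "pkg/ccl/")
}

// partition splits package paths into the two groups that would be run
// by the two build configs.
func partition(pkgs []string) (ccl, nonCCL []string) {
	for _, p := range pkgs {
		if isCCL(p) {
			ccl = append(ccl, p)
		} else {
			nonCCL = append(nonCCL, p)
		}
	}
	return ccl, nonCCL
}

func main() {
	// Illustrative package paths only.
	pkgs := []string{"//pkg/ccl/backupccl", "//pkg/sql/logictest", "//pkg/kv/kvserver"}
	ccl, nonCCL := partition(pkgs)
	fmt.Println("ccl run:    ", ccl)
	fmt.Println("non-ccl run:", nonCCL)
}
```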
TFTRs!

bors r=rickystewart

Build succeeded.