-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cicd: set test-threads to 16 and add retries to reduce flaky failures #1507
Conversation
ca336c9
to
78b6e11
Compare
# cancel redundant builds | ||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.ref }} | ||
cancel-in-progress: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
canceling redundant builds to reduce load on CI
@@ -11,6 +11,12 @@ on: | |||
|
|||
env: | |||
HAS_BUILDPULSE_SECRETS: ${{ secrets.BUILDPULSE_ACCESS_KEY_ID != '' && secrets.BUILDPULSE_SECRET_ACCESS_KEY != '' }} | |||
CARGO_INCREMENTAL: "0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slightly faster builds and smaller cache sizes when disabling incremental compilation as described here https://matklad.github.io/2021/09/04/fast-rust-builds.html
774e851
to
c09d2d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @geekflyer! Question: what's the reason for removing the partition argument?
the partition argument was useless, since it partitioned the testsuite into a single partition and then ran that single partition. The whole idea of the partition argument is to partition the test suite into multiple partitions and then have different test workers / machines run one of the partitions. In other words |
/land |
Forge run: https://github.com/aptos-labs/aptos-core/actions/runs/2525806940 |
This adjusts these settings to reduce the chances of flaky tests failing the test build.
Previously tests were implicitly running with test-threads=60 since that's the number of CPU cores our test runners have.
These settings are result of a lots of try and error and thousands of test runs, both on GHA, Circleci and Cirrus-CI.
Note that this essentially just a workaround. Longer term we should find out why certain tests fail when there is a higher number of test threads.