Action inputs to dispatch n-runs of a single test in CI #6297
…poral into cdf/rerun-functional-test
Can we create a separate workflow file for this purpose?
A separate workflow file that we run only rarely will rot and is likely to be broken when someone wants to use it. Integrating it into the main one makes it much more likely to be maintained and working.
## What changed?

Add workflow dispatch options to the functional tests GitHub Action so that we can run n iterations of a single functional test with a configurable timeout.

There is also an option to run n iterations of a single unit test, although it may be faster to run that locally.

WARNING: For functional tests, this will definitely be OOM-killed for n>=100, and likely for n>=50 too. I suggest starting with n=20 to see how memory usage goes and increasing from there. Different DBs may also use different amounts of RAM.

## Why?

To aid in the diagnosis and treatment of flaky tests.

## How did you test it?

Tested in GitHub Actions.

Here is the action run normally, with no new input parameters: https://github.com/temporalio/temporal/actions/runs/11261156403

Here is the action run on one test multiple times: https://github.com/temporalio/temporal/actions/runs/11261236304

While we still have Buildkite, the test results will be uploaded there. You can find them by going to https://buildkite.com/organizations/temporal/analytics/suites/temporal-public/runs?branch=all+branches and looking for a recent run with `"job: functional-test"` and the commit hash you used.

Here is the Buildkite output of the run above: https://buildkite.com/organizations/temporal/analytics/suites/temporal-public/runs/39901afd-35f1-8171-9257-e0bada374824

## Potential risks

Our functional test pipeline could be broken by this PR, but we would notice that almost immediately.

## Documentation

How to run it yourself

1. Go to https://github.com/temporalio/temporal/actions/workflows/run-tests.yml
2. Click "Run workflow" on the upper right-hand side
5. Set Commit SHA to the latest commit on the branch
6. Select your desired options
7. Click the green "Run workflow" button

## Is hotfix candidate?

<!-- Is this PR a hotfix candidate or does it require a notification to be sent to the broader community? (Yes/No) -->
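For readers who want to reproduce "n iterations of a single test with a configurable timeout" locally, here is a minimal sketch that assembles the equivalent `go test` invocation using the standard `-run`, `-count`, and `-timeout` flags. It is not taken from this PR's workflow file: the helper name `build_go_test_command`, the default package pattern `./...`, and the example test name `TestExample` are assumptions made for illustration only.

```python
import re
import shlex


def build_go_test_command(test_name: str, count: int, timeout: str,
                          package: str = "./...") -> list[str]:
    """Return an argv list that runs one Go test `count` times.

    `package` is a hypothetical default; point it at the package that
    actually contains the test.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Anchor the pattern so only the named test matches, not every
    # test that happens to share its prefix.
    pattern = "^" + re.escape(test_name) + "$"
    return [
        "go", "test", package,
        "-run", pattern,          # select the single test
        "-count", str(count),     # repeat it n times, bypassing the test cache
        "-timeout", timeout,      # overall limit, e.g. "30m"
    ]


if __name__ == "__main__":
    # Start small (n=20), per the memory warning in the PR description.
    print(shlex.join(build_go_test_command("TestExample", 20, "30m")))
```

Starting at n=20 mirrors the suggestion above; the count can be raised once memory usage looks safe.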
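The "Run workflow" button corresponds to a `workflow_dispatch` event, which can also be triggered through GitHub's REST API endpoint `POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches`. The sketch below only builds that request. The endpoint and the `ref`/`inputs` body shape are GitHub's; every input name in the usage example (`commit`, `test_name`, `n_runs`, `timeout_minutes`) is a hypothetical placeholder, because the real input names are defined in `run-tests.yml` and are not listed in this PR description.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_dispatch_request(token: str, ref: str, inputs: dict,
                           owner: str = "temporalio",
                           repo: str = "temporal",
                           workflow: str = "run-tests.yml") -> urllib.request.Request:
    """Build (but do not send) a workflow_dispatch request."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/actions/workflows/{workflow}/dispatches"
    body = {
        "ref": ref,  # branch or tag whose copy of the workflow file is used
        # Values are sent as strings here for simplicity.
        "inputs": {key: str(value) for key, value in inputs.items()},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical input names; check run-tests.yml for the real ones.
    req = build_dispatch_request(
        token="<your token>",
        ref="main",
        inputs={"commit": "<sha>", "test_name": "TestExample",
                "n_runs": 20, "timeout_minutes": 30},
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The request is left unsent so that the payload can be inspected first; the token needs permission to run Actions workflows on the repository.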