Action inputs to dispatch n-runs of a single test in CI #6297

Merged: carlydf merged 81 commits into main from cdf/rerun-functional-test on Oct 9, 2024

@carlydf carlydf commented Jul 16, 2024

What changed?

Add workflow dispatch options to the functional tests GitHub Action so that we can run n iterations of a single functional test with a configurable timeout.

There is also an option to run n-iterations of a single unit test, although it may be faster to run that locally.

WARNING: For functional tests, the run will definitely be OOM-killed for n>=100, and likely for n>=50 too. I suggest starting with n=20 to see how memory usage looks and then increasing from there. Different DBs may also use different amounts of RAM.
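
For readers unfamiliar with workflow dispatch inputs, here is a minimal sketch of how such options could be wired up. It is not the actual contents of run-tests.yml: the input names (commit, test_name, n_runs, timeout_minutes), the job name, and the ./tests/... package path are all assumptions made for illustration. The only parts taken from the description above are the idea of n iterations of one test, a configurable timeout, and a suggested starting point of n=20.

```yaml
# Hypothetical sketch only: input names, job name, and package path are
# assumptions, not the real run-tests.yml from this PR.
name: Run single test repeatedly (sketch)

on:
  workflow_dispatch:
    inputs:
      commit:
        description: "Commit SHA to check out"
        required: true
        type: string
      test_name:
        description: "Test name or regex, passed to go test -run"
        required: true
        type: string
      n_runs:
        description: "Number of iterations, passed to go test -count"
        type: number
        default: 20   # start small; large values risk an OOM kill
      timeout_minutes:
        description: "Timeout for the whole go test invocation, in minutes"
        type: number
        default: 30

jobs:
  repeat-single-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ inputs.commit }}
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
      - name: Run the selected test n times
        env:
          # Pass inputs through env vars rather than interpolating them
          # directly into the shell script.
          TEST_NAME: ${{ inputs.test_name }}
          N_RUNS: ${{ inputs.n_runs }}
          TIMEOUT_MINUTES: ${{ inputs.timeout_minutes }}
        run: |
          go test -v -run "${TEST_NAME}" -count "${N_RUNS}" \
            -timeout "${TIMEOUT_MINUTES}m" ./tests/...
```

In this sketch, -count repeats the selected test inside a single go test process, which would be consistent with memory growing as n increases, and -timeout bounds the whole invocation rather than each iteration.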

Why?

To aid in the diagnosis and treatment of flaky tests.

How did you test it?

Tested in GitHub Actions.
Here is the action run normally, with no new input parameters: https://github.com/temporalio/temporal/actions/runs/11261156403
Here is the action run on one test multiple times: https://github.com/temporalio/temporal/actions/runs/11261236304

While we still have Buildkite, the test results will also be uploaded there. You can find them by going to https://buildkite.com/organizations/temporal/analytics/suites/temporal-public/runs?branch=all+branches and looking for a recent run with "job: functional-test" and the commit hash you used.
Here is the Buildkite output of the run above: https://buildkite.com/organizations/temporal/analytics/suites/temporal-public/runs/39901afd-35f1-8171-9257-e0bada374824

Potential risks

Our functional test pipeline could be broken by this PR, but we would notice that almost immediately.

Documentation

How to run it yourself

  1. Go to https://github.com/temporalio/temporal/actions/workflows/run-tests.yml
  2. Click "Run workflow" on the upper right-hand side
  3. Set Commit SHA to the latest commit on the branch
  4. Select your desired options
  5. Click the green "Run workflow" button

Is hotfix candidate?

@rodrigozhou

Can we create a separate workflow file for this purpose?
run-tests.yml is complicated enough that adding this seems like unnecessary extra complexity.
Also, as you noted, running misc-checks is also unnecessary when running a single test.

dnr commented Jul 20, 2024

> Can we create a separate workflow file for this purpose? run-tests.yml is complicated enough that adding this seems like unnecessary extra complexity. Also, as you noted, running misc-checks is also unnecessary when running a single test.

A separate workflow file that we run only rarely will rot and is likely to be broken by the time someone wants to use it. Integrating it into the main one makes it much more likely to be maintained and working.

@carlydf carlydf changed the title Action inputs to dispatch n-runs of a single functional test Action inputs to dispatch n-runs of a single test in CI Oct 4, 2024
@carlydf carlydf enabled auto-merge (squash) October 9, 2024 19:06
@carlydf carlydf merged commit 65a58d9 into main Oct 9, 2024
59 of 60 checks passed
@carlydf carlydf deleted the cdf/rerun-functional-test branch October 9, 2024 19:17
xwduan pushed a commit that referenced this pull request Oct 18, 2024