Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add test that verifies subsequent tasks are NEVER_ATTEMPTED when previous action is canceled #437

Merged
merged 2 commits into from
Oct 29, 2024

Conversation

YutongLi291
Copy link
Contributor

@YutongLi291 YutongLi291 commented Oct 10, 2024

What was the problem/requirement? (What/Why)

The scheduler doesn’t pipeline more than one task run in a session until the first task run in that session has completed.

We should verify that this behaviour is as expected, and that if the task run is not completed, other tasks should not have been attempted.

What was the solution? (How)

Add an E2E test that verifies that when a task is canceled during its run, the subsequent tasks are in NEVER_ATTEMPTED status.

Also edited an existing test that verifies CANCELED actions that other subsequent actions are NEVER_ATTEMPTED

What is the impact of this change?

Better testing to verify worker behaviour is as expected when task is canceled, that following tasks are NEVER_ATTEMPTED

How was this change tested?

# Linux
source .e2e_linux_infra.sh
hatch run e2e-test

# Windows
source .e2e_windows_infra.sh
hatch run e2e-test

Was this change documented?

No

Is this a breaking change?

No

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@YutongLi291 YutongLi291 requested a review from a team as a code owner October 10, 2024 21:41
@YutongLi291 YutongLi291 changed the title test: add test that verifies subsequent tasks are NEVER_ATTEMPTED when task is canceled test: add test that verifies subsequent tasks are NEVER_ATTEMPTED when previous action is canceled Oct 11, 2024
test/e2e/test_job_submissions.py Fixed Show resolved Hide resolved
@YutongLi291 YutongLi291 requested a review from leongdl October 21, 2024 16:52
@YutongLi291 YutongLi291 enabled auto-merge (squash) October 24, 2024 00:31
assert is_job_started_with_sessions(job)

# Wait some time for the second step (which sleeps for 2 minutes) to start
time.sleep(20)
Copy link

@hindleym hindleym Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of sleeping here, it may be worth just waiting on the second step to switch to RUNNING? That way if it starts sooner, the test can complete faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but that's also more API calls. There's a tradeoff, and I believe that 20 seconds is not a big time wait to justify the extraneous API calls.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the sleep command more as a last resort. It guarantees to slow down all testing, especially when testing new changes before PR. Between these two sleeps, you're adding thirty seconds to every full test attempt, which already takes an extremely long time to complete. If I'm not mistaken, most people make sure their changes work with the full suite more than once before creating a PR, that's at least a minute to every future change people try to merge.

@@ -406,6 +406,9 @@ def sessions_exist(current_job: Job) -> bool:

LOG.info(f"Job result: {job}")

# Wait until the envExit runs as well
time.sleep(10)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is anything actually happening on exit that we need to sleep this long? I see that it waits for the job to complete above. Should that function not already make sure that the job has finished it's calls before returning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job is considered completed even without running the envExit and envExit is not required to finish for the job to finish.

Copy link

@hindleym hindleym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid forcing tests to sleep as much as possible.

@YutongLi291 YutongLi291 requested a review from hindleym October 26, 2024 20:54
auto-merge was automatically disabled October 28, 2024 21:22

Pull request was closed

…n previous action is canceled

Signed-off-by: Yutong Li <[email protected]>
@YutongLi291 YutongLi291 reopened this Oct 28, 2024
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
11.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@YutongLi291 YutongLi291 enabled auto-merge (squash) October 29, 2024 02:54
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
11.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@YutongLi291 YutongLi291 merged commit add5dce into aws-deadline:mainline Oct 29, 2024
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants