This repository has been archived by the owner on Oct 9, 2023. It is now read-only.
The status of the AWS batch job should become failed once the retry limit exceeded #291
Merged
Signed-off-by: Kevin Su <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #291 +/- ##
==========================================
- Coverage 63.32% 62.27% -1.05%
==========================================
Files 145 145
Lines 9311 11511 +2200
==========================================
+ Hits 5896 7169 +1273
- Misses 2872 3797 +925
- Partials 543 545 +2
pingsutw changed the title from "Turn PhaseRetryableFailure into PhaseRetryLimitExceededFailure once the retry limit exceeded" to "The status of the AWS batch job should become failed once the retry limit exceeded" on Oct 17, 2022
hamersaw reviewed on Nov 14, 2022
hamersaw reviewed on Nov 23, 2022
hamersaw approved these changes on Dec 1, 2022
eapolinario pushed a commit that referenced this pull request on Sep 6, 2023
…imit exceeded (#291) * Turn PhaseRetryableFailure into PhaseRetryLimitExceededFailure Signed-off-by: Kevin Su <[email protected]> * nit Signed-off-by: Kevin Su <[email protected]> * update Signed-off-by: Kevin Su <[email protected]> * update test Signed-off-by: Kevin Su <[email protected]> * lint Signed-off-by: Kevin Su <[email protected]> * update Signed-off-by: Kevin Su <[email protected]> * update tests Signed-off-by: Kevin Su <[email protected]> * lint Signed-off-by: Kevin Su <[email protected]> * wip Signed-off-by: Kevin Su <[email protected]> * udpate Signed-off-by: Kevin Su <[email protected]> * address comment Signed-off-by: Kevin Su <[email protected]> * nit Signed-off-by: Kevin Su <[email protected]> * fix tests Signed-off-by: Kevin Su <[email protected]> * nit Signed-off-by: Kevin Su <[email protected]> Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su [email protected]
TL;DR
The task status always stays running if the task failure is retryable. In other words, propeller keeps waiting no matter how many times the task is re-run. Take a look at this code. If all the sub-tasks fail and all the failures are retryable, the state of the task never becomes PhaseWriteToDiscoveryThenFail (totalRetryableFailures is equal to minSuccesses here).
In this PR, we keep track of how many times each sub-task has been retried and count the total number of sub-tasks that have exceeded the retry limit. The task is stopped when
totalRunning+totalSuccesses+totalWaitingForResources+totalRetryableFailures-totalRetryLimitExceeded < minSuccesses
is true.
Type
Are all requirements met?
Complete description
Tracking Issue
flyteorg/flyte#2979
Follow-up issue
NA
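The stop condition from the TL;DR can be sketched in Go as follows. This is a minimal illustration, not the code merged in this PR: the `subTaskCounts` struct and the `shouldFail` function are hypothetical names, and only the counter names and the inequality come from the description above. It assumes, as the subtraction in the formula suggests, that sub-tasks which exceeded the retry limit are still counted in the retryable failures.

```go
package main

import "fmt"

// subTaskCounts is a hypothetical container for the counters named in the
// PR description; the real plugin may track them differently.
type subTaskCounts struct {
	Running, Successes, WaitingForResources, RetryableFailures, RetryLimitExceeded int64
}

// shouldFail reports whether the task can no longer reach minSuccesses.
// Sub-tasks that are running, succeeded, waiting for resources, or failed
// with retries left can still contribute a success; sub-tasks that ran out
// of retries are subtracted because they cannot.
func shouldFail(c subTaskCounts, minSuccesses int64) bool {
	stillViable := c.Running + c.Successes + c.WaitingForResources +
		c.RetryableFailures - c.RetryLimitExceeded
	return stillViable < minSuccesses
}

func main() {
	// Four sub-tasks all failed retryably and none is out of retries yet,
	// so the task keeps waiting.
	fmt.Println(shouldFail(subTaskCounts{RetryableFailures: 4}, 4))

	// One of the four has now exhausted its retries, so minSuccesses = 4
	// is unreachable and the task should stop.
	fmt.Println(shouldFail(subTaskCounts{RetryableFailures: 4, RetryLimitExceeded: 1}, 4))
}
```

Without the `RetryLimitExceeded` term, the first and second cases would be indistinguishable, which is the "always running" behavior the PR describes.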