-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add script to terminate stale jobs #497
Conversation
Signed-off-by: Rishabh Singh <[email protected]>
//Fetch all builds for the job based on look up time provided | ||
def builds = currentJob.getBuilds().byTimestamp(startMillis,endMillis) | ||
for (build in builds) { | ||
if (build.isBuilding() && currentBuildNumber > build.number && currentBuildDescription == build.description) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we terminating purely based on build description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK when a gradle-check is triggered on a pull request the build description remains the same for each new build of the job. Let me know if I can have a check on any other attribute?
println "Aborted build #${build.number} for ${build.description}" | ||
} | ||
catch (Exception e) { | ||
if (build.result == Result.ABORTED) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add any checks for failed/already stopped builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the build.isBuilding()
in if condition only iterates over in-progress builds. This try/catch is to catch edge condition where two jobs are executing this block at the same time and then one might fail due to job already terminated by other one. We don't want to fail the build job if this block fails to process due to race condition.
This block just catches this condition without stopping the flow.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #497 +/- ##
=========================================
Coverage 84.32% 84.32%
Complexity 80 80
=========================================
Files 108 108
Lines 523 523
Branches 61 61
=========================================
Hits 441 441
Misses 26 26
Partials 56 56 ☔ View full report in Codecov by Sentry. |
//Fetch all builds for the job based on look up time provided | ||
def builds = currentJob.getBuilds().byTimestamp(startMillis,endMillis) | ||
for (build in builds) { | ||
if (build.isBuilding() && currentBuildNumber > build.number && currentBuildDescription == build.description) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question.
for (build in builds) { | ||
if (build.isBuilding() && currentBuildNumber > build.number && currentBuildDescription == build.description) { | ||
try { | ||
build.doStop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is doStop enough for our case as these might take long to stop.
Do we want to immediately stop it since the agents are deleted anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a graceful stop which also takes care of stopping the underlying running process.
If we use doKill
then it will hard stop the pipeline and there is no guarantee that underlying gradle process would be stopped on the host.
If the agents are deleted for aborted or killed pipelines then yes we can use doKill
instead of stop. Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes each agent will only run one build, then delete after.
Tho probably the existing graceful stop is enough for the time being.
Co-authored-by: Sayali Gaikawad <[email protected]> Signed-off-by: Rishabh Singh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rishabh6788 . We will use this method for now and please update the version.
We can think about later if there is a way for github actions to send out a call to Jenkins api to cancel, before they themselves get canceled than using an additional script to clean up jenkins.
Thanks.
Signed-off-by: Rishabh Singh <[email protected]> Signed-off-by: Rishabh Singh <[email protected]> Co-authored-by: Sayali Gaikawad <[email protected]> (cherry picked from commit c37f893) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Add script to terminate stale jobs.
This will help in checking and aborting redundant builds for jobs.
There are no tests due to lack of mocking support for Jenkins instance methods used in the script.
Issues Resolved
opensearch-project/opensearch-build#5008
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.