Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Honor SCM checkout retry count #110

Merged
merged 3 commits into from
Mar 15, 2024
Merged

Conversation

topikachu
Copy link

@topikachu topikachu commented Dec 9, 2022

Honor SCM checkout retry count
Inspired by jenkinsci/workflow-cps-plugin#147

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@topikachu
Copy link
Author

@jglick Could you take a look?

@jglick jglick requested a review from a team December 13, 2022 00:26
throw new AbortException("Maximum checkout retry attempts reached, aborting");

listener.getLogger().println("Retrying after 10 seconds");
Thread.sleep(10000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not look right to me, should probably do something like what the sleep step does instead of blocking the thread here. Though I am not sure atm how a SynchronousNonBlockingStepExecution is supposed to behave in these situations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, a background thread will sleep. Not terrible, under the assumption this code path is rarely used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I keep this change? Or, do I need to rewrite the whole StepExecutionImpl?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably good enough as is. This step used SynchronousNonBlockingStepExecution (or the equivalent) from the beginning since the SCM interface in Jenkins does not support durable execution, so unlike

sh 'git clone https://github.com/myorg/myrepo'

a slow checkout cannot survive a controller restart. As of jenkinsci/workflow-basic-steps-plugin#203 this matters less anyway since you can use the new retry system with the nonresumable condition to handle that case.

Comment on lines 151 to 152
if (retryCount == 0) // all attempts failed
throw new AbortException("Maximum checkout retry attempts reached, aborting");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we would throw up the original exception if this is the last attempt, rather than constructing a fresh exception.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

throw new AbortException("Maximum checkout retry attempts reached, aborting");

listener.getLogger().println("Retrying after 10 seconds");
Thread.sleep(10000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably good enough as is. This step used SynchronousNonBlockingStepExecution (or the equivalent) from the beginning since the SCM interface in Jenkins does not support durable execution, so unlike

sh 'git clone https://github.com/myorg/myrepo'

a slow checkout cannot survive a controller restart. As of jenkinsci/workflow-basic-steps-plugin#203 this matters less anyway since you can use the new retry system with the nonresumable condition to handle that case.

@felipecrs
Copy link

@topikachu mind rebasing this PR? I rebased here locally and tests are still passing. Also, the failure in CI was due to some dependency download issue.

@felipecrs
Copy link

felipecrs commented Mar 13, 2024

@rsandell @jglick would you accept this PR as is? I can rebase and send as another (or you can do as well, there's no conflict and tests are passing).

This feature is of extreme importance for me and saves countless hours of rebuilding pipelines that had just failed at the very beginning at cloning due to some intermittency.

I am running this hpi in my Jenkins for a week now and everything seems ok.

@jglick jglick requested a review from a team March 14, 2024 17:23
@jglick jglick added the bug label Mar 14, 2024
@rsandell rsandell merged commit 4ca6512 into jenkinsci:master Mar 15, 2024
14 checks passed
@felipecrs
Copy link

Thanks a ton!

@felipecrs
Copy link

A small note, I noticed the retry functionality is catching the FlowInterruptedException.

It should not, otherwise when we press the Abort button for the build and the build is still cloning, the abort will be ignored, the retry will resume the clone operation, and the build will continue as if it was not aborted.

@jglick
Copy link
Member

jglick commented Mar 27, 2024

break;
} catch (InterruptedIOException e) {
throw e;
} catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should include:

if (e instanceof FlowInterruptedException && ((FlowInterruptedException)e).isActualInterruption()) {
    throw e;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants