-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-39194] Implement SCM retry count #147
Conversation
Honors the retry count of the Jenkins server. Only has effect in non-lightweight cases.
@jglick I was having trouble with the custom retry. The job config save doesn't go through the descriptor's configure, so ensuring that the jenkins backend doesn't always set the retry count to 0 for a job was impossible. To unblock for now I just removed the advanced custom retry setting. |
Not sure what you mean by that.
Sorry, what? |
@jglick So if you look at the way that the config-retry.jelly is implemented, it wants to call hasCustomScmCheckoutRetryCount() on the plugin to determine the checkbox, then binds the value of the text box to scmCheckoutRetryCount. scmCheckoutRetryCount is an Integer which is supposed to be null if the box isn't checked. This is accomplished for AbstractProject by overriding submit() on AbstractBuild and determining whether the checkbox was checked, otherwise it resets scmCheckoutRetryCount to null. In our case, if you don't implement similar logic, then when you save the job, scmCheckoutRetryCount gets assigned a default value, which ends up being 0, rather than null. The right way to fix this I think would be to override Descriptor's configure method, parse the json like AbstractBuild does, and then set the appropriate values. Saving the job doesn't appear to go through configure() in this case. Anyways, I could be totally wrong on how to implement the custom retry count UI. I played around with it for a while but wasn't able to get it to work. If you have some suggestions, let me know. |
@@ -127,15 +134,38 @@ public boolean isLightweight() { | |||
delegate.setPoll(true); | |||
delegate.setChangelog(true); | |||
try (WorkspaceList.Lease lease = computer.getWorkspaceList().acquire(dir)) { | |||
delegate.checkout(build, dir, listener, node.createLauncher(listener)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -24,7 +24,7 @@ THE SOFTWARE. | |||
--> | |||
|
|||
<?jelly escape-by-default='true'?> | |||
<j:jelly xmlns:j="jelly:core" xmlns:f="/lib/form" xmlns:st="jelly:stapler"> | |||
<j:jelly xmlns:j="jelly:core" xmlns:f="/lib/form" xmlns:st="jelly:stapler" xmlns:p="/lib/hudson/project"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Revert if unused.
@@ -79,6 +82,10 @@ public String getScriptPath() { | |||
public boolean isLightweight() { | |||
return lightweight; | |||
} | |||
|
|||
public int getScmCheckoutRetryCount() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently used only internally, so better inlined.
} | ||
|
||
if (retryCount == 0) // all attempts failed | ||
throw new RunnerAbortedException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentally I had to rearrange this into something like
for (int x = 0; x <= getScmCheckoutRetryCount(); x++) {
// …
script = scriptFile.readToString();
break;
// …
listener.getLogger().println("Retrying after 10 seconds");
Thread.sleep(10000);
}
if (script == null {
throw …;
}
to follow the logic, though I think it is right as written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the same form as AbstractBuild: https://github.com/jenkinsci/jenkins/blob/d13b1361e1650494528a7b46f82fe25d7fb5e3c3/core/src/main/java/hudson/model/AbstractBuild.java#L589
I can change it, just want to make sure there is a good reason to do so.
} | ||
|
||
if (retryCount == 0) // all attempts failed | ||
throw new RunnerAbortedException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use AbortException
not RunnerAbortedException
. Can probably rearrange the logic a bit so that if we are on the last iteration (the normal case), the original AbortException
or whatever is rethrown unmodified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
listener.error(e.getMessage()); | ||
} | ||
} catch (InterruptedIOException e) { | ||
throw (InterruptedException)new InterruptedException().initCause(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wrap it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the same form as AbstractBuild: https://github.com/jenkinsci/jenkins/blob/d13b1361e1650494528a7b46f82fe25d7fb5e3c3/core/src/main/java/hudson/model/AbstractBuild.java#L589
I think this is because InterruptedIOException is IOException derived, and InterruptedException is Exception derived. Maybe plain InterruptedException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this method throws Exception
, so there is no need for a wrapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay sounds good, will remove.
p.setDefinition(def); | ||
// Initially check for the global Jenkins retry count | ||
assertEquals(r.jenkins.getScmCheckoutRetryCount(), def.getScmCheckoutRetryCount()); | ||
// Set the jenkins global retry count to 3 and ensure that they are still equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh, 1?
assertEquals(r.jenkins.getScmCheckoutRetryCount(), def.getScmCheckoutRetryCount()); | ||
WorkflowRun b = r.assertBuildStatus(Result.FAILURE, p.scheduleBuild2(0)); | ||
r.assertLogContains("Could not read from remote repository", b); | ||
r.assertLogContains("Retrying after 10 seconds", b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I suppose use waitForMessage
, initialize the repository, then assert then the build completes.
So what's missing for this to get a merge, besides resolving conflicts? |
Can we get a merge here? It's a big issue in certain cases (like GitHub being flaky) |
@jglick Anything else? |
Close/re-open to try to retrigger CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor issues but basically looks right.
@@ -83,7 +86,7 @@ public boolean isLightweight() { | |||
@DataBoundSetter public void setLightweight(boolean lightweight) { | |||
this.lightweight = lightweight; | |||
} | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW please configure your text editor to not make gratuitous whitespace changes to otherwise unmodified lines.
listener.error(e.getMessage()); | ||
} | ||
} catch (InterruptedIOException e) { | ||
throw (InterruptedException)new InterruptedException().initCause(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this method throws Exception
, so there is no need for a wrapper.
throw (InterruptedException)new InterruptedException().initCause(e); | ||
} catch (IOException e) { | ||
// checkout error not yet reported | ||
listener.error(e.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be something like
listener.error("Checkout failed").println(Functions.printThrowable(e).trim()); // TODO 2.43+ use Functions.printStackTrace
} | ||
|
||
if (retryCount == 0) // all attempts failed | ||
throw new AbortException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should specify a message.
def.setLightweight(false); | ||
p.setDefinition(def); | ||
r.jenkins.setScmCheckoutRetryCount(1); | ||
WorkflowRun b = r.assertBuildStatus(Result.FAILURE, p.scheduleBuild2(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to extend this test to run a second build, this time using r.waitForMessage("Retrying after 10 seconds", b)
and then initializing the repo and then asserting that the build does succeed with the second checkout.
Looks like CI issues?
|
Why is this PR not merged? |
@mmitche Can you resolve the merge conflict, please? |
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me, and thank you for the contribution -- once CI shows green, I'll merge and it'll get shipped with the next release (should be in the next week or two).
Anything else @svanoort? |
Nope, merging now, thanks for the contribution @mmitche (and the reminder now that it is passing) |
Honors the retry count of the Jenkins server, and also allows for a custom retry if desired. Only has effect in non-lightweight cases.