Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-39194] Implement SCM retry count #147

Merged
merged 11 commits into from
Jan 25, 2018
Merged

Conversation

mmitche
Copy link
Contributor

@mmitche mmitche commented Jun 20, 2017

Honors the retry count of the Jenkins server, and also allows for a custom retry if desired. Only has effect in non-lightweight cases.

@mmitche
Copy link
Contributor Author

mmitche commented Jun 20, 2017

Honors the retry count of the Jenkins server.  Only has effect in non-lightweight cases.
@mmitche
Copy link
Contributor Author

mmitche commented Jun 21, 2017

@jglick I was having trouble with the custom retry. The job config save doesn't go through the descriptor's configure, so ensuring that the jenkins backend doesn't always set the retry count to 0 for a job was impossible.

To unblock for now I just removed the advanced custom retry setting.

@jglick
Copy link
Member

jglick commented Jun 21, 2017

The job config save doesn't go through the descriptor's configure

Not sure what you mean by that.

the jenkins backend doesn't always set the retry count to 0

Sorry, what?

@mmitche
Copy link
Contributor Author

mmitche commented Jun 21, 2017

@jglick So if you look at the way that the config-retry.jelly is implemented, it wants to call hasCustomScmCheckoutRetryCount() on the plugin to determine the checkbox, then binds the value of the text box to scmCheckoutRetryCount. scmCheckoutRetryCount is an Integer which is supposed to be null if the box isn't checked. This is accomplished for AbstractProject by overriding submit() on AbstractBuild and determining whether the checkbox was checked, otherwise it resets scmCheckoutRetryCount to null.

In our case, if you don't implement similar logic, then when you save the job, scmCheckoutRetryCount gets assigned a default value, which ends up being 0, rather than null. The right way to fix this I think would be to override Descriptor's configure method, parse the json like AbstractBuild does, and then set the appropriate values. Saving the job doesn't appear to go through configure() in this case.

Anyways, I could be totally wrong on how to implement the custom retry count UI. I played around with it for a while but wasn't able to get it to work. If you have some suggestions, let me know.

@@ -127,15 +134,38 @@ public boolean isLightweight() {
delegate.setPoll(true);
delegate.setChangelog(true);
try (WorkspaceList.Lease lease = computer.getWorkspaceList().acquire(dir)) {
delegate.checkout(build, dir, listener, node.createLauncher(listener));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -24,7 +24,7 @@ THE SOFTWARE.
-->

<?jelly escape-by-default='true'?>
<j:jelly xmlns:j="jelly:core" xmlns:f="/lib/form" xmlns:st="jelly:stapler">
<j:jelly xmlns:j="jelly:core" xmlns:f="/lib/form" xmlns:st="jelly:stapler" xmlns:p="/lib/hudson/project">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Revert if unused.

@@ -79,6 +82,10 @@ public String getScriptPath() {
public boolean isLightweight() {
return lightweight;
}

public int getScmCheckoutRetryCount() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently used only internally, so better inlined.

}

if (retryCount == 0) // all attempts failed
throw new RunnerAbortedException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentally I had to rearrange this into something like

for (int x = 0; x <= getScmCheckoutRetryCount(); x++) {
    // …
    script = scriptFile.readToString();
    break;
    // …
    listener.getLogger().println("Retrying after 10 seconds");
    Thread.sleep(10000);
}
if (script == null {
    throw …;
}

to follow the logic, though I think it is right as written.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the same form as AbstractBuild: https://github.com/jenkinsci/jenkins/blob/d13b1361e1650494528a7b46f82fe25d7fb5e3c3/core/src/main/java/hudson/model/AbstractBuild.java#L589

I can change it, just want to make sure there is a good reason to do so.

}

if (retryCount == 0) // all attempts failed
throw new RunnerAbortedException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use AbortException not RunnerAbortedException. Can probably rearrange the logic a bit so that if we are on the last iteration (the normal case), the original AbortException or whatever is rethrown unmodified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

listener.error(e.getMessage());
}
} catch (InterruptedIOException e) {
throw (InterruptedException)new InterruptedException().initCause(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wrap it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the same form as AbstractBuild: https://github.com/jenkinsci/jenkins/blob/d13b1361e1650494528a7b46f82fe25d7fb5e3c3/core/src/main/java/hudson/model/AbstractBuild.java#L589

I think this is because InterruptedIOException is IOException derived, and InterruptedException is Exception derived. Maybe plain InterruptedException

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this method throws Exception, so there is no need for a wrapper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sounds good, will remove.

p.setDefinition(def);
// Initially check for the global Jenkins retry count
assertEquals(r.jenkins.getScmCheckoutRetryCount(), def.getScmCheckoutRetryCount());
// Set the jenkins global retry count to 3 and ensure that they are still equal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, 1?

assertEquals(r.jenkins.getScmCheckoutRetryCount(), def.getScmCheckoutRetryCount());
WorkflowRun b = r.assertBuildStatus(Result.FAILURE, p.scheduleBuild2(0));
r.assertLogContains("Could not read from remote repository", b);
r.assertLogContains("Retrying after 10 seconds", b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I suppose use waitForMessage, initialize the repository, then assert then the build completes.

@SpencerMalone
Copy link

So what's missing for this to get a merge, besides resolving conflicts?

@mmitche
Copy link
Contributor Author

mmitche commented Sep 1, 2017

Can we get a merge here? It's a big issue in certain cases (like GitHub being flaky)

@mmitche
Copy link
Contributor Author

mmitche commented Sep 18, 2017

@jglick Anything else?

@svanoort svanoort closed this Sep 27, 2017
@svanoort svanoort reopened this Sep 27, 2017
@svanoort
Copy link
Member

Close/re-open to try to retrigger CI

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor issues but basically looks right.

@@ -83,7 +86,7 @@ public boolean isLightweight() {
@DataBoundSetter public void setLightweight(boolean lightweight) {
this.lightweight = lightweight;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW please configure your text editor to not make gratuitous whitespace changes to otherwise unmodified lines.

listener.error(e.getMessage());
}
} catch (InterruptedIOException e) {
throw (InterruptedException)new InterruptedException().initCause(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this method throws Exception, so there is no need for a wrapper.

throw (InterruptedException)new InterruptedException().initCause(e);
} catch (IOException e) {
// checkout error not yet reported
listener.error(e.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be something like

listener.error("Checkout failed").println(Functions.printThrowable(e).trim()); // TODO 2.43+ use Functions.printStackTrace

}

if (retryCount == 0) // all attempts failed
throw new AbortException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should specify a message.

def.setLightweight(false);
p.setDefinition(def);
r.jenkins.setScmCheckoutRetryCount(1);
WorkflowRun b = r.assertBuildStatus(Result.FAILURE, p.scheduleBuild2(0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to extend this test to run a second build, this time using r.waitForMessage("Retrying after 10 seconds", b) and then initializing the repo and then asserting that the build does succeed with the second checkout.

@jglick jglick changed the title Implement SCM retry count [JENKINS-39194] Implement SCM retry count Sep 27, 2017
@svanoort svanoort closed this Sep 27, 2017
@svanoort svanoort reopened this Sep 27, 2017
@svanoort
Copy link
Member

Looks like CI issues?

Installing /home/jenkins/tools/hudson.model.JDK/jdk8/jdk.sh
Can not call launch on a dummy launcher.

@kofwang
Copy link

kofwang commented Dec 1, 2017

Why is this PR not merged?

@abayer
Copy link
Member

abayer commented Jan 12, 2018

@mmitche Can you resolve the merge conflict, please?

@mmitche
Copy link
Contributor Author

mmitche commented Jan 12, 2018

Done

Copy link
Member

@svanoort svanoort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, and thank you for the contribution -- once CI shows green, I'll merge and it'll get shipped with the next release (should be in the next week or two).

@mmitche
Copy link
Contributor Author

mmitche commented Jan 25, 2018

Anything else @svanoort?

@svanoort svanoort merged commit d91d14d into jenkinsci:master Jan 25, 2018
@svanoort
Copy link
Member

Nope, merging now, thanks for the contribution @mmitche (and the reminder now that it is passing)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants