Be quieter handling CpsFlowExecution.owner == null in suspendAll #788
@@ -1638,25 +1638,23 @@ public void pause(final boolean v) throws IOException {
@Restricted(DoNotUse.class)
@Terminator(attains = FlowExecutionList.EXECUTIONS_SUSPENDED)
public static void suspendAll() {
CpsFlowExecution exec = null;
Outermost catch clause seemed redundant. (Timeout.close does not throw exceptions, so it was not for that.)
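To illustrate the reasoning about Timeout.close: when a resource's close() declares no checked exception, a try-with-resources block compiles without any catch clause, so an outer catch is not needed on the resource's account. The sketch below uses a hypothetical stand-in class (SketchTimeout), not the real Timeout from the plugin.

```java
// Hypothetical stand-in for a timeout resource; NOT the real plugin class.
class SketchTimeout implements AutoCloseable {
    boolean closed;

    // Overrides AutoCloseable.close() without a throws clause,
    // so callers are not forced to catch anything.
    @Override
    public void close() {
        closed = true;
    }
}

class TimeoutCloseSketch {
    // Runs an empty body under the resource and reports whether it was closed.
    static boolean runWithTimeout() {
        SketchTimeout timeout = new SketchTimeout();
        try (SketchTimeout ignored = timeout) {
            // Work that must finish before the timeout would go here.
        } // No catch clause required: close() declares no checked exception.
        return timeout.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed = " + runWithTimeout());
    }
}
```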
try {
if (execution instanceof CpsFlowExecution) {
CpsFlowExecution cpsExec = (CpsFlowExecution)execution;
if (execution instanceof CpsFlowExecution) {
CpsFlowExecution cpsExec = (CpsFlowExecution) execution;
try {
cpsExec.checkAndAbortNonresumableBuild();

LOGGER.log(Level.FINE, "waiting to suspend {0}", execution);
exec = (CpsFlowExecution) execution;
// Like waitForSuspension but with a timeout:
if (exec.programPromise != null) {
if (cpsExec.programPromise != null) {
simplifying
@@ -1671,15 +1669,15 @@ public static void suspendAll() {
}
});
}
cpsExec.getOwner().getListener().getLogger().close();
if (cpsExec.owner != null) {
main fix
Any idea how this would be reachable? Corruption during resumption or something? Perhaps jenkinsci/workflow-api-plugin#304 has made it possible? If I understand right, in this case FlowExecutionList.runningTasks contains a FlowExecutionOwner which successfully returns a FlowExecution from .get(), but for which FlowExecution.owner on the returned object is null, which seems problematic and unexpected.
I am afraid I do not know—just saw this in a log.
I suppose [Workflow]Run.reload deserializes execution, but then WorkflowRun.onLoad fails before calling getExecution, or getExecution fails before calling fetchedExecution.onLoad(new Owner(this)), or unmarshal fails partway through, etc. Presumably the build was badly corrupted somehow. The point of this PR is just to avoid unnecessary stack traces after that.
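The guard being discussed can be sketched roughly as follows. This is a minimal illustration with hypothetical stand-in types (SketchOwner, SketchExecution), not the actual CpsFlowExecution code: when the owner was never set because loading failed partway, the log close is skipped with a FINE-level message instead of raising a NullPointerException and printing a stack trace.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the execution owner; NOT the real plugin type.
class SketchOwner {
    private final PrintStream logger;

    SketchOwner(PrintStream logger) {
        this.logger = logger;
    }

    PrintStream getLogger() {
        return logger;
    }
}

// Hypothetical stand-in for an execution whose owner may be unset.
class SketchExecution {
    final SketchOwner owner; // null if loading failed before the owner was attached

    SketchExecution(SketchOwner owner) {
        this.owner = owner;
    }
}

class NullOwnerGuardSketch {
    private static final Logger LOGGER = Logger.getLogger(NullOwnerGuardSketch.class.getName());

    // Closes the build log if there is an owner; returns true if it was closed.
    static boolean closeLogQuietly(SketchExecution exec) {
        if (exec.owner == null) {
            // Quiet path: record the oddity at FINE level instead of throwing.
            LOGGER.log(Level.FINE, "not closing log of {0}: owner is null", exec);
            return false;
        }
        exec.owner.getLogger().close();
        return true;
    }

    public static void main(String[] args) {
        SketchExecution healthy =
                new SketchExecution(new SketchOwner(new PrintStream(new ByteArrayOutputStream())));
        SketchExecution corrupted = new SketchExecution(null);
        System.out.println("healthy closed: " + closeLogQuietly(healthy));
        System.out.println("corrupted closed: " + closeLogQuietly(corrupted));
    }
}
```

The sketch only suppresses the noise; it does not attempt to diagnose or repair whatever left the owner unset.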
For what it's worth though I saw another case of a CpsFlowExecution with a null owner recently, which is why I am wondering if something has changed things:
java.lang.IllegalStateException: List of flow heads unset for CpsFlowExecution[null]
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getCurrentHeads(CpsFlowExecution.java:1018)
... insignificant, just a user loading some build page that accessed the heads for an execution ...
I looked at the XML for that execution on disk (based on the thread name) and head was non-null and pointed to a FlowEndNode that did exist in workflow/, so it wasn't obvious what might have gone wrong.
@jtnord I believe it is unrelated.
See #669 (comment). I did in fact observe