-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent ForkScanner from skipping nodes when scanning incomplete parallel steps that only have a single branch #302
Conversation
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/ForkScannerTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine to start by merging as tests
with @Ignore
.
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/ForkScannerTest.java
Show resolved
Hide resolved
…e it has less than 2 branches
Fork split(@NonNull HashMap<FlowNode, FlowPiece> nodeMapping, @NonNull BlockStartNode joinPoint, @NonNull FlowPiece joiningBranch) { | ||
int index = visited.lastIndexOf(joinPoint); // Fork will be closer to end, so this is better than indexOf | ||
Fork newFork = new Fork(joinPoint); | ||
|
||
if (index < 0) { | ||
throw new IllegalStateException("Tried to split a segment where the node doesn't exist in this segment"); | ||
} else if (index == this.visited.size()-1) { // We forked just off the most recent node | ||
newFork.following.add(this); | ||
newFork.following.add(joiningBranch); | ||
this.visited.remove(index); | ||
} else if (index == 0) { | ||
throw new IllegalStateException("We have a cyclic graph or heads that are not separate branches!"); | ||
} else { // Splitting at some midpoint within the segment, everything before becomes part of the following | ||
// Execute the split: create a new fork at the fork point, and shuffle the part of the flow after it | ||
// to a new segment and add that to the fork. | ||
|
||
FlowSegment newSegment = new FlowSegment(); | ||
newSegment.after = this.after; | ||
newSegment.visited.addAll(this.visited.subList(0, index)); | ||
newFork.following.add(newSegment); | ||
newFork.following.add(joiningBranch); | ||
this.after = newFork; | ||
this.isLeaf = false; | ||
|
||
// Remove the part before the fork point | ||
this.visited.subList(0, index+1).clear(); | ||
for (FlowNode n : newSegment.visited) { | ||
nodeMapping.put(n, newSegment); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic previously had branches to account for being called in various ways. I changed the way it is called making all but one of the branches unreachable.
@@ -235,7 +235,6 @@ interface FlowPiece { // Mostly a marker | |||
// TODO see if this can be replaced with a FlowChunk acting as a container class for a list of FlowNodes | |||
static class FlowSegment implements FlowPiece { | |||
ArrayList<FlowNode> visited = new ArrayList<>(); | |||
FlowPiece after; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was not ever used apart from test assertions and my simplifications left it always null
, making it completely pointless.
if (isParallelStart(nextBlockStart)) { | ||
// We make a fork for every parallel step preemptively in case it has less than 2 branches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the fix, essentially instead of waiting until we have a piece that needs to be merged with an existing segment, we create the Fork
objects right away as soon as we see a parallel
step.
} else { | ||
throw new IllegalStateException("Unexpected join on " + existingPiece); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be unreachable now, since all merges should hit the above condition 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out this is still reachable in at least some cases, see com.cloudbees.workflow.rest.endpoints.ParallelStepTest.test_stages_order
in pipeline-rest-api
java.lang.IllegalStateException: Unexpected join on org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner$FlowSegment@428b6b34
at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.leastCommonAncestor(ForkScanner.java:414)
at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.setHeads(ForkScanner.java:453)
at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:140)
at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:151)
at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.visitSimpleChunks(ForkScanner.java:630)
at com.cloudbees.workflow.rest.external.RunExt.createNew(RunExt.java:323)
at com.cloudbees.workflow.rest.external.RunExt.create(RunExt.java:311)
at com.cloudbees.workflow.rest.external.JobExt.create(JobExt.java:143)
at com.cloudbees.workflow.rest.endpoints.JobAPI.doRuns(JobAPI.java:69)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been able to reproduce the failure yet. The test in question starts a build that uses sleep
in some parallel branches and just checks it as soon as it has a Run
, so the flow execution is an an unknown and possibly changing state when ForkScanner
runs.
(EDIT: In fact I think the test only passes because it normally checks the branch order before the Pipeline finishes the sleep in branch 2.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally reproduced it, seems to be quite rare (fails about 1 in 40 times for me), so I'm glad it happened to fail in the PCT run!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problematic case seems to be where the parallel step start node is itself a head. That means it gets left out of blockStartIterator
for all heads, and we end up trying to join two heads on whatever segment has the first parent of the parallel step. It seems to be possible to get things working by restoring some of the old cases in FlowSegment.split
, but that seems undesirable, because we end up with a Fork
whose start node is not actually a parallel step, which is bizarre.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a fix for this, but as I add new test assertions I keep finding new things that don't work and need to compare against the previous behavior to understand if they ever worked, so it is slow going.
ForkScanner.Fork forked = mainBranch.split(nodeMap, (BlockStartNode)exec.getNode("4"), sideBranch); | ||
ForkScanner.FlowSegment splitSegment = (ForkScanner.FlowSegment)nodeMap.get(BRANCH1_END); // New branch | ||
Assert.assertNull(splitSegment.after); | ||
FlowTestUtils.assertNodeOrder("Branch 1 split after fork", splitSegment.visited, 9, 8, 6); | ||
|
||
// Just the single node before the fork | ||
Assert.assertEquals(forked, mainBranch.after); | ||
FlowTestUtils.assertNodeOrder("Head of flow, pre-fork", mainBranch.visited, 3); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was using split
directly in a way that is no longer supported and not done by production code, so I tried to adapt the tests.
src/test/java/org/jenkinsci/plugins/workflow/graphanalysis/ForkScannerTest.java
Show resolved
Hide resolved
I am going to test this a bit more, and I want to look into whether we can make the iteration order for incomplete parallel steps match the order of the closures map in the Pipeline script (the order for complete parallel steps already matches the script order). |
This fix has subtle problems in various cases that are difficult to fix comprehensively. For now I am closing this, see #306. |
Looking into more issues along the lines of #287. Note that I did verify that this particular issue occurs both before and after that PR.
Testing done
Submitter checklist