Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent ForkScanner from skipping nodes when scanning incomplete parallel steps that only have a single branch #302

Closed

Conversation

dwnusbaum
Copy link
Member

@dwnusbaum dwnusbaum commented Aug 24, 2023

Looking into more issues along the lines of #287. Note that I did verify that this particular issue occurs both before and after that PR.

Testing done

Submitter checklist

Preview Give feedback

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine to start by merging as tests with @Ignore.

@dwnusbaum dwnusbaum changed the title Add test demonstrating that ForkScanner skips outer parallel branches in some cases Prevent ForkScanner from skipping nodes when scanning incomplete parallel branches that only have a single branch Aug 24, 2023
Comment on lines -255 to -283
Fork split(@NonNull HashMap<FlowNode, FlowPiece> nodeMapping, @NonNull BlockStartNode joinPoint, @NonNull FlowPiece joiningBranch) {
int index = visited.lastIndexOf(joinPoint); // Fork will be closer to end, so this is better than indexOf
Fork newFork = new Fork(joinPoint);

if (index < 0) {
throw new IllegalStateException("Tried to split a segment where the node doesn't exist in this segment");
} else if (index == this.visited.size()-1) { // We forked just off the most recent node
newFork.following.add(this);
newFork.following.add(joiningBranch);
this.visited.remove(index);
} else if (index == 0) {
throw new IllegalStateException("We have a cyclic graph or heads that are not separate branches!");
} else { // Splitting at some midpoint within the segment, everything before becomes part of the following
// Execute the split: create a new fork at the fork point, and shuffle the part of the flow after it
// to a new segment and add that to the fork.

FlowSegment newSegment = new FlowSegment();
newSegment.after = this.after;
newSegment.visited.addAll(this.visited.subList(0, index));
newFork.following.add(newSegment);
newFork.following.add(joiningBranch);
this.after = newFork;
this.isLeaf = false;

// Remove the part before the fork point
this.visited.subList(0, index+1).clear();
for (FlowNode n : newSegment.visited) {
nodeMapping.put(n, newSegment);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic previously had branches to account for being called in various ways. I changed the way it is called making all but one of the branches unreachable.

@@ -235,7 +235,6 @@ interface FlowPiece { // Mostly a marker
// TODO see if this can be replaced with a FlowChunk acting as a container class for a list of FlowNodes
static class FlowSegment implements FlowPiece {
ArrayList<FlowNode> visited = new ArrayList<>();
FlowPiece after;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was not ever used apart from test assertions and my simplifications left it always null, making it completely pointless.

Comment on lines +385 to +386
if (isParallelStart(nextBlockStart)) {
// We make a fork for every parallel step preemptively in case it has less than 2 branches.
Copy link
Member Author

@dwnusbaum dwnusbaum Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix, essentially instead of waiting until we have a piece that needs to be merged with an existing segment, we create the Fork objects right away as soon as we see a parallel step.

Comment on lines +407 to +408
} else {
throw new IllegalStateException("Unexpected join on " + existingPiece);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be unreachable now, since all merges should hit the above condition 🤞

Copy link
Member Author

@dwnusbaum dwnusbaum Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this is still reachable in at least some cases, see com.cloudbees.workflow.rest.endpoints.ParallelStepTest.test_stages_order in pipeline-rest-api

java.lang.IllegalStateException: Unexpected join on org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner$FlowSegment@428b6b34
	at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.leastCommonAncestor(ForkScanner.java:414)
	at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.setHeads(ForkScanner.java:453)
	at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:140)
	at org.jenkinsci.plugins.workflow.graphanalysis.AbstractFlowScanner.setup(AbstractFlowScanner.java:151)
	at org.jenkinsci.plugins.workflow.graphanalysis.ForkScanner.visitSimpleChunks(ForkScanner.java:630)
	at com.cloudbees.workflow.rest.external.RunExt.createNew(RunExt.java:323)
	at com.cloudbees.workflow.rest.external.RunExt.create(RunExt.java:311)
	at com.cloudbees.workflow.rest.external.JobExt.create(JobExt.java:143)
	at com.cloudbees.workflow.rest.endpoints.JobAPI.doRuns(JobAPI.java:69)

Copy link
Member Author

@dwnusbaum dwnusbaum Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to reproduce the failure yet. The test in question starts a build that uses sleep in some parallel branches and just checks it as soon as it has a Run, so the flow execution is an an unknown and possibly changing state when ForkScanner runs.

(EDIT: In fact I think the test only passes because it normally checks the branch order before the Pipeline finishes the sleep in branch 2.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally reproduced it, seems to be quite rare (fails about 1 in 40 times for me), so I'm glad it happened to fail in the PCT run!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problematic case seems to be where the parallel step start node is itself a head. That means it gets left out of blockStartIterator for all heads, and we end up trying to join two heads on whatever segment has the first parent of the parallel step. It seems to be possible to get things working by restoring some of the old cases in FlowSegment.split, but that seems undesirable, because we end up with a Fork whose start node is not actually a parallel step, which is bizarre.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a fix for this, but as I add new test assertions I keep finding new things that don't work and need to compare against the previous behavior to understand if they ever worked, so it is slow going.

Comment on lines -375 to -383
ForkScanner.Fork forked = mainBranch.split(nodeMap, (BlockStartNode)exec.getNode("4"), sideBranch);
ForkScanner.FlowSegment splitSegment = (ForkScanner.FlowSegment)nodeMap.get(BRANCH1_END); // New branch
Assert.assertNull(splitSegment.after);
FlowTestUtils.assertNodeOrder("Branch 1 split after fork", splitSegment.visited, 9, 8, 6);

// Just the single node before the fork
Assert.assertEquals(forked, mainBranch.after);
FlowTestUtils.assertNodeOrder("Head of flow, pre-fork", mainBranch.visited, 3);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was using split directly in a way that is no longer supported and not done by production code, so I tried to adapt the tests.

@dwnusbaum
Copy link
Member Author

I am going to test this a bit more, and I want to look into whether we can make the iteration order for incomplete parallel steps match the order of the closures map in the Pipeline script (the order for complete parallel steps already matches the script order).

@dwnusbaum dwnusbaum added the bug label Aug 25, 2023
@dwnusbaum dwnusbaum changed the title Prevent ForkScanner from skipping nodes when scanning incomplete parallel branches that only have a single branch Prevent ForkScanner from skipping nodes when scanning incomplete parallel steps that only have a single branch Aug 25, 2023
@dwnusbaum
Copy link
Member Author

This fix has subtle problems in various cases that are difficult to fix comprehensively. For now I am closing this, see #306.

@dwnusbaum dwnusbaum closed this Aug 29, 2023
@dwnusbaum dwnusbaum deleted the more-forkscanner-problems branch August 29, 2023 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants