Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent ForkScanner from visiting nodes more than once in some cases for in-progress builds with nested parallelism #287

Merged
merged 3 commits into from
Jun 26, 2023

Conversation

dwnusbaum
Copy link
Member

@dwnusbaum dwnusbaum commented Jun 22, 2023

In-progress pipelines with nested parallel steps sometimes end up visiting the outer parallel step start node and all of its parents more than once. In my particular case this breaks a visualization tool that uses ForkScanner.

Testing done

Automated regression test added.

I tested manually against the following plugins to make sure their tests are not affected by the change. I assume though that all uses of ForkScanner in these plugins were subject to the same potential problem, and so this PR may affect them even if their tests do not cover in-progress builds with nested parallelism:

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@dwnusbaum dwnusbaum changed the title Add test demonstrating that ForkScanner visits parallel step start nodes more than once in some cases Add test demonstrating that ForkScanner visits nodes more than once in some cases for in-progress builds with nested parallelism Jun 22, 2023
SemaphoreStep.waitForStart("outerA/1", b);
SemaphoreStep.waitForStart("innerA/1", b);
ForkScanner scanner = new ForkScanner();
sanityTestIterationAndVisiter(b.getExecution().getCurrentHeads());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the parallel step start node (5), as well as the stage body start node (4), the stage step start node (3), and the the flow start node (2) were visited twice.

With the right structure you could end up visiting almost the whole Pipeline twice.

@@ -443,8 +444,15 @@ ArrayDeque<ParallelBlockStart> leastCommonAncestor(@NonNull final Set<FlowNode>
throw new IllegalStateException("No least common ancestor found from " + heads);
}

// If we hit issues with the ordering of blocks by depth, apply a sorting to the parallels by depth
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpreted this as a kind of TODO comment, and debugging showed that in existing test cases with nested parallelism, the result was ordered from innermost to outermost, but the ordering was opposite for my new test, and making the order stable fixes my test, so...

To be clear, I do not fully understand the specifics of this method or really this class, so there could still be other problems, but the current behavior is pretty bad and the new behavior should at least be better in some cases.

@dwnusbaum dwnusbaum changed the title Add test demonstrating that ForkScanner visits nodes more than once in some cases for in-progress builds with nested parallelism Prevent ForkScanner from visiting nodes more than once in some cases for in-progress builds with nested parallelism Jun 22, 2023
@dwnusbaum dwnusbaum added the bug label Jun 22, 2023
@dwnusbaum dwnusbaum marked this pull request as ready for review June 22, 2023 20:05
@dwnusbaum dwnusbaum requested a review from a team June 22, 2023 20:07
@dwnusbaum dwnusbaum merged commit 05cd837 into jenkinsci:master Jun 26, 2023
@dwnusbaum dwnusbaum deleted the forkscanner-double-visit branch June 26, 2023 13:16
Comment on lines +448 to +454
return convertForksToBlockStarts(parallelForks).stream().sorted((pbs1, pbs2) -> {
if (pbs1.forkStart.getEnclosingBlocks().contains(pbs2.forkStart)) {
return -1;
} else if (pbs2.forkStart.getEnclosingBlocks().contains(pbs1.forkStart)) {
return 1;
}
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Comparator.html this is not “consistent with equals”. That is generally harmless when sorting a list (and I suppose also a stream) but I do not think you can reliably implement a topological sort this way. For example in jshell

import java.util.stream.*;
Stream.of("axx a ax".split(" ")).sorted((x, y) -> x.startsWith(y) ? 1 : y.startsWith(x) ? -1 : 0).collect(Collectors.joining(" "))

works (a ax axx) but on the input axx b a ax you get back the same list, whereas you might expect a b ax axx or similar.

Whether there is any observable bug in this function as a result is another question.

Copy link
Member Author

@dwnusbaum dwnusbaum Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yes, I can look into this.

Whether there is any observable bug in this function as a result is another question.

Perhaps, if you have more than 3 levels of nested parallel steps, or if you have multiple parallel step children under a single parallel step, although I wonder if the latter case would work regardless of this change because of other issues.

In most cases though this list should just have one or two elements, which should work ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From some quick testing, I have only been able to reproduce cases where the outermost parallel step is the second to last element before the sort, when it needs to always be the final element. Other than that, the pre-sort order is fine. Perhaps a more targeted fix could be made further up in the method, and it seems plausible that with some really unusual Pipelines there is some reachable case that I was not able to trigger, but I am not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend at least leaving in a code comment warning that this logic would not work for exotic cases which may or may not be reachable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #288.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants