-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure discovered descriptor graphs are acyclic #1451
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1451 +/- ##
============================================
- Coverage 91.77% 91.77% -0.01%
- Complexity 3202 3208 +6
============================================
Files 295 296 +1
Lines 7710 7729 +19
Branches 654 658 +4
============================================
+ Hits 7076 7093 +17
- Misses 472 475 +3
+ Partials 162 161 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good to me! I love that you caught out that cycles may happen, so well done on that!
Just three minor comments which I think should be addressed, then as an external observer I think this PR would be good to go. See below. :)
static boolean isAcyclic(TestDescriptor root) { | ||
Set<UniqueId> visited = new HashSet<>(); | ||
visited.add(root.getUniqueId()); | ||
Stack<TestDescriptor> stack = new Stack<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit: Could be Deque<TestDescriptor> stack = new ArrayDeque<>
to increase performance slightly. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any link to a proof? Are we talking 10%, 1% or 0.001% here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any link to a proof?
Yep! In the Java 8 API docs for ArrayDeque
, the class javadocs say,
This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.
Are we talking 10%, 1% or 0.001% here?
I don't know, actually! And I don't actually know how to use JMH or another benchmarking tool to show how much... My bad. :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe, okay. Taking the likely as solid hint. (-:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you were discussing Stack vs Array, let me add my one cent. Stack is actually a subclass of Vector which is synchronized and in essence is part of the 'legacy' classes preexisting the java 2 introduced collections framework. In Vector's javadoc we can read:
As of the Java 2 platform v1.2, this class was retrofitted to implement the List interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Vector is synchronized. If a thread-safe implementation is not needed, it is recommended to use ArrayList in place of Vector.
Also in Stack's javadoc we can see another hint that Stack is legacy, reading:
A more complete and consistent set of LIFO stack operations is provided by the Deque interface and its implementations, which should be used in preference to this class.
On the other hand when we want a stack
it is natural to use Stack
... why would someone want to use a Deque
..? It makes no sense... 😳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gaganis I know, it's very weird and the complete opposite of intuitive. This is one of the many drawbacks of Java's policy on maintaining backwards compatibility as much as possible.
I don't know why they originally implemented Stack by extending Vector all those years ago, or why they made it a concrete class instead of an interface which would have allowed alternative stack implementations. However, as I think you've noticed, due to those historical decisions and how Stack consequently uses synchronized
internally, Stack does perform not as well as ArrayDeque as a stack in single-thread situations.
As for why we're only provided with a Deque
interface and not also a super-interface that provides just push()
and pop()
, I've no idea as to why it was done that way. Seems a bit silly to me, personally.
...Not sure what else to say. :)
@@ -69,6 +70,27 @@ | |||
return testEngines; | |||
} | |||
|
|||
/** | |||
* @return {@code true} if the graph does _not_ contain a cycle; else {@code false}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the _not_
should be replaced with <em>not</em>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
} | ||
} | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sormuras I'm not clear on what algorithm isAcyclic()
uses to check for cycles. Can you place a comment towards the top of the body so that it's clear, so something like the following?
static boolean isAcyclic() {
// <name-of-algorithm-here>
. . .
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, never invented a name for some lines of code before. stack-based-traversing-with-bread-crumb-checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha!
I was hoping that your algorithm had an official name like Kahn's algorithm for topological sorting or Tarjan's algorithm for finding strongly-connected components, as I understand from these StackOverflow answers that those algorithms can be used to find cycles in graphs.
If alternatively your algorithm is home-grown and it was inspired by an existing graph traversal algorithm like depth-first search or breadth-first search, then it would help clarify things (for me, at least) if that was noted down somewhere, along with what change(s) were made to the traversal algorithm to allow cycles to be detected.
But maybe I'm asking for way too much here, in which case I'd be happy to slowly read through the code in my own time later today to figure out what it's doing. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we don't actually have a directed graph. Rather, we have a tree.
So we should probably change the title of the issue and PR to avoid confusion. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And that's a breadth-first traversal of the tree of test descriptors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I'd still refer to it as a tree in this context. 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And that's a breadth-first traversal of the tree of test descriptors.
@sbrannen Cool, thank you for the clarification/confirmation! 😀👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, rereading the code with the knowledge that it uses breadth-first search, I understand what it's doing now: it's using a stack
and continuous calls to stack.pop().getChildren()
to iterate over the entire tree in breath-first order, and it keeps track of visited
nodes to ensure there are no cycles. Got it! 👍
I'd personally love it if a minor comment were placed at the top of the body like // based on breath-first search
, but I won't insist. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if @sormuras describes it as "breath-first [sic]", I would find that totally amusing...
Take a deep breath first... and then dive deep! 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha! facepalm
This is what I get for typing things with my phone. :P
16dedb9
to
2ba57d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing and improving, @sormuras!
return false; // id already known: cycle detected! | ||
} | ||
if (child.isContainer()) { | ||
stack.add(child); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using pop()
and add()
will reverse the iteration order. Shouldn't hurt, but might make debugging harder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcphillip Your saying that has prompted me to think more, and I think you're almost right; AFAICT, it will actually cause the tree to be traversed in depth-first order rather than as well as reversing the iteration order. As you said, it wouldn't hurt (except for debugging), but I think swapping the stack out for a Queue<> queue = new ArrayDeque<>()
should fix things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You‘re right, it‘s indeed depth-first. However since we‘re visiting all descriptors anyway it doesn’t really matter. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Queue
. So be it.
@@ -132,6 +155,9 @@ private Root discoverRoot(LauncherDiscoveryRequest discoveryRequest, String phas | |||
() -> String.format( | |||
"The discover() method for TestEngine with ID '%s' must return a non-null root TestDescriptor.", | |||
testEngine.getId())); | |||
Preconditions.condition(isAcyclic(engineRoot), | |||
() -> String.format("The discover() method for TestEngine with ID '%s' returned a cyclic graph.", | |||
testEngine.getId())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should extract these checks into a separate EngineDiscoveryResultValidator
class or sth. similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move also the not-null
-check?
Prior this commit only a null-check was performed by the DefaultLauncher against the root test descriptor returned by an engine. Other properties where not checked, especially if the graph of that root descriptor contained cycles it could lead to stack overflow errors when traversing the graph for execution. This commit addresses this issue by explicitly checking the returned graph of the root test descriptor for cycles. If a cycle is detected a PreconditionViolationException is thrown. Fixes: #1447
2ba57d1
to
9f662e5
Compare
Overview
Prior this commit only a null-check was performed by the
DefaultLauncher
against the root test descriptor returned by an engine. Other properties where not checked, especially if the graph of that root descriptor contained cycles it could lead to stack overflow errors when traversingthe graph for execution.
This commit addresses this issue by explicitly checking the returned graph of the root test descriptor for cycles. If a cycle is detected a
PreconditionViolationException
is thrown.Fixes: #1447