-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor KibanaThreadPoolIT/SystemIndexThreadPoolTestCase for resiliency #108463
Refactor KibanaThreadPoolIT/SystemIndexThreadPoolTestCase for resiliency #108463
Conversation
the in SystemIndexThreadPoolTestCase#testUserThreadPoolsAreBlocked sometimes was blocked instead of throwing an expected exception (due to a queue on a threadpool being full). This commit adds an extra debug & assertion to make sure the queue is indeed full and threadpool is full. relates elastic#107625
Pinging @elastic/es-core-infra (Team:Core/Infra) |
modules/kibana/src/internalClusterTest/java/org/elasticsearch/kibana/KibanaThreadPoolIT.java
Outdated
Show resolved
Hide resolved
modules/kibana/src/internalClusterTest/java/org/elasticsearch/kibana/KibanaThreadPoolIT.java
Outdated
Show resolved
Hide resolved
phaser.register(); | ||
|
||
blockThreadPool(phaser); | ||
phaser.arriveAndAwaitAdvance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is a new version of making sure all threadpools are blocked before the queues are being filled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the key to fix the test was to block all, waiting for all to be blocked, and only then fill the queues (and proceed with the test), correct?
If so, maybe it's worth a couple of comments (to be honest, method names are already good, but since this was an edge case a comment on that would be a good addition)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 🥇
* <p>We can block thread pools by setting them to one thread and no queue, then submitting | ||
* threads that wait on a countdown latch. This lets us verify that operations on system indices | ||
* are being directed to other thread pools.</p> | ||
* | ||
* <p>When implementing this class, don't forget to override {@link ESIntegTestCase#nodePlugins()} if | ||
* the relevant system index is defined in a plugin.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Javadoc isn't accurate anymore, better to remove
phaser.register(); | ||
|
||
blockThreadPool(phaser); | ||
phaser.arriveAndAwaitAdvance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch 🎉
@elasticmachine run elasticsearch-ci/part-2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments, but looks good overall!
ThreadPool.Info info = threadPool.info(threadPoolName); | ||
|
||
Runnable waitAction = () -> { | ||
phaser.arriveAndAwaitAdvance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe add a comment to indicate which step each of the arriveAndAwait guards
phaser.register(); | ||
|
||
blockThreadPool(phaser); | ||
phaser.arriveAndAwaitAdvance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the key to fix the test was to block all, waiting for all to be blocked, and only then fill the queues (and proceed with the test), correct?
If so, maybe it's worth a couple of comments (to be honest, method names are already good, but since this was an edge case a comment on that would be a good addition)
@@ -61,4 +97,108 @@ public void testKibanaThreadPool() { | |||
} | |||
}); | |||
} | |||
|
|||
public void testBlockedThreadPoolsRejectUserRequests() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is this the right place for this test? This one is not Kibana specific, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in a way this is a 'negative test case' for what KibanaThreadPool test is doing.
previously it was part of a super class in a testframework, not sure if it is worthy at the moment to refactor it into its own test class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
the in SystemIndexThreadPoolTestCase#testUserThreadPoolsAreBlocked sometimes was blocked instead of throwing an expected exception (due to a queue on a threadpool being full).
IT happens that submitting a busyTask does not guarantee that it will be executed immediately by a threadpool. It might be that some other task was executing at the time
This commit refactors the way threadpool is populated and makes sure that before the queue are filled, all the busyTasks are executed on threadpools
based on the test failure -> #107625 thread pool's threads were busy, but I cannot tell if a queue was full before the search request was submitted.