-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve master service batching queues #92021
Improve master service batching queues #92021
Conversation
Today the master's pending task queue is just the `PriorityBlockingQueue<Runnable>` belonging to the underlying `ThreadPoolExecutor`. The reasons for this date back a long way but it doesn't really reflect the structure of the queue as it exists today. In particular, we must keep track of batches independently of the queue itself, and must do various bits of unchecked casting to process multiple items of the same type at once. This commit introduces an new queueing mechanism, independent of the executor's queue, which better represents the conceptual structure of the master's pending tasks: * Today we use a priority queue to allow important tasks to preempt less-important ones. However there are only a small number of priority levels, so it is simpler to maintain a queue for each priority, effectively replacing the sorting within the priority queue with a radix sort. * Today when a task is submitted we perform a map lookup to see if it can be added to an existing batch or not. With this change we allow client code to create its own dedicated queue of tasks. The entries in the per-priority-level queues are themselves queues, one for each executor, representing the batches to be run. * Today each task in the queue holds a reference to its executor, but the executor used to run a task may belong to a different task in the same batch. In practice we know they're the same executor (that's how batches are defined) but we cannot express this knowledge in the type system so we have to do a bunch of unchecked casting to work around it. With this change we associate each per-executor queue directly with its executor, avoiding the need to do all this unchecked casting. * Today the master service must block its thread while waiting for each task to complete, because otherwise the executor would start to process the next task in the queue. This makes testing using a `DeterministicTaskQueue` harder (see `FakeThreadPoolMasterService`). This change avoids enqueueing tasks on the `ThreadPoolExecutor` unless there is genuinely work to do, although it leaves the removal of the actual blocking to a followup.
Hi @DaveCTurner, I've created a changelog YAML for you. |
Pinging @elastic/es-distributed (Team:Distributed) |
Hi @DaveCTurner, I've updated the changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meaningful part of this change is in MasterService
and its tests.
Almost all of the other changes here are mechanically replacing clusterService.submitStateUpdateTask
with taskQueue.submitTask
on a suitably-constructed taskQueue
.
It's (slightly) nicer if you ignore whitespace changes.
I've left some other notes for reviewers inline.
server/src/internalClusterTest/java/org/elasticsearch/cluster/service/ClusterServiceIT.java
Outdated
Show resolved
Hide resolved
|| e instanceof FailedToCommitClusterStateException | ||
&& e.getCause()instanceof EsRejectedExecutionException esre | ||
&& esre.isExecutorShutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today's MasterService
silently drops stuff when it's shut down. With this change we become stricter about rejecting work explicitly in this case.
@@ -46,12 +45,10 @@ public PrioritizedEsThreadPoolExecutor( | |||
TimeUnit unit, | |||
ThreadFactory threadFactory, | |||
ThreadContext contextHolder, | |||
ScheduledExecutorService timer, | |||
StarvationWatcher starvationWatcher |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The starvation-watching mechanism is now built into the MasterService
itself, not the executor, so this parameter is unnecessary and removed everywhere.
@elasticmachine update branch |
@elasticmachine update branch |
Deferring this until after elastic#92021 when there will be more typesafety and better tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did an initial read of the MasterService
and wanted to relay my comments now to ensure alignment before going in details and reading tests.
Overall, this change looks great.
server/src/main/java/org/elasticsearch/cluster/service/ClusterService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/service/MasterService.java
Show resolved
Hide resolved
The changes introduced in elastic#92021 mean that there is no need for the master service to block its thread while waiting for each publication to complete. This commit removes the now-unnecessary blocking. This commit also removes the now-unnecessary fake blocking in the `FakeThreadPoolMasterService` used in tests, bringing the implementation covered by tests closer to the production implementation.
The changes introduced in elastic#92021 mean that the master service no longer needs to use a prioritized executor. Prioritized executors are weird, for instance they don't propagate rejections to `AbstractRunnable` tasks properly. This commit moves to using a regular scaling executor and removes some of the now-unnecessary workarounds for handling the prioritized executor's weirdness.
The changes introduced in elastic#92021 mean that the master service no longer needs to use a prioritized executor. Prioritized executors are weird, for instance they don't propagate rejections to `AbstractRunnable` tasks properly. This commit moves to using a regular scaling executor and removes some of the now-unnecessary workarounds for handling the prioritized executor's weirdness.
The changes introduced in #92021 mean that the master service no longer needs to use a prioritized executor. Prioritized executors are weird, for instance they don't propagate rejections to `AbstractRunnable` tasks properly. This commit moves to using a regular scaling executor and removes some of the now-unnecessary workarounds for handling the prioritized executor's weirdness.
The changes introduced in elastic#92021 mean that there is no need for the master service to block its thread while waiting for each publication to complete. This commit removes the now-unnecessary blocking. This commit also removes the now-unnecessary fake blocking in the `FakeThreadPoolMasterService` used in tests, bringing the implementation covered by tests closer to the production implementation.
The changes introduced in #92021 mean that there is no need for the master service to block its thread while waiting for each publication to complete. This commit removes the now-unnecessary blocking. This commit also removes the now-unnecessary fake blocking in the `FakeThreadPoolMasterService` used in tests, bringing the implementation covered by tests closer to the production implementation.
Today the master's pending task queue is just the
PriorityBlockingQueue<Runnable>
belonging to the underlyingThreadPoolExecutor
. The reasons for this date back a long way but it doesn't really reflect the structure of the queue as it exists today. In particular, we must keep track of batches independently of the queue itself, and must do various bits of unchecked casting to process multiple items of the same type at once.This commit introduces an new queueing mechanism, independent of the executor's queue, which better represents the conceptual structure of the master's pending tasks:
Today we use a priority queue to allow important tasks to preempt less-important ones. However there are only a small number of priority levels, so it is simpler to maintain a queue for each priority, effectively replacing the sorting within the priority queue with a radix sort.
Today when a task is submitted we perform a map lookup to see if it can be added to an existing batch or not. With this change we allow client code to create its own dedicated queue of tasks. The entries in the per-priority-level queues are themselves queues, one for each executor, representing the batches to be run.
Today each task in the queue holds a reference to its executor, but the executor used to run a task may belong to a different task in the same batch. In practice we know they're the same executor (that's how batches are defined) but we cannot express this knowledge in the type system so we have to do a bunch of unchecked casting to work around it. With this change we associate each per-executor queue directly with its executor, avoiding the need to do all this unchecked casting.
Today the master service must block its thread while waiting for each task to complete, because otherwise the executor would start to process the next task in the queue. This makes testing using a
DeterministicTaskQueue
harder (seeFakeThreadPoolMasterService
). This change avoids enqueueing tasks on theThreadPoolExecutor
unless there is genuinely work to do, although it leaves the removal of the actual blocking to a followup.Closes #81626