Update default value of `druid.indexer.tasklock.batchAllocationWaitTime` to zero #16578
Conversation
Changes LGTM! Can you please complete the release notes in the PR description?
```diff
@@ -1120,7 +1120,7 @@ These Overlord static configurations can be defined in the `overlord/runtime.pro
 |`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`|
 |`druid.indexer.tasklock.forceTimeChunkLock`|_**Setting this to false is still experimental**_<br/> If set, all tasks are enforced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the [task context](../ingestion/tasks.md#context). See [Task Locking & Priority](../ingestion/tasks.md#context) for more details about locking in tasks.|true|
 |`druid.indexer.tasklock.batchSegmentAllocation`| If set to true, Druid performs segment allocate actions in batches to improve throughput and reduce the average `task/action/run/time`. See [batching `segmentAllocate` actions](../ingestion/tasks.md#batching-segmentallocate-actions) for details.|true|
-|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|500|
+|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|0|
```
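For reference, a minimal Overlord configuration using the new default might look like the following (a sketch only; the exact config file location depends on your deployment):

```properties
# Batch segment allocate actions together (already the default).
druid.indexer.tasklock.batchSegmentAllocation=true

# New default: 0 ms, i.e. a batch is executed immediately unless
# another batch is already being processed.
druid.indexer.tasklock.batchAllocationWaitTime=0
```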
Should we specify the behavior when this value is set to 0, as done in the description of the PR?
I feel it is self-explanatory: a value of zero means no waiting, unless something else is already being processed. If you still feel it warrants clarification, I can add a line to this effect.
Sounds good to me
Thanks for the review, @LakshSingla! I have updated the release note. Please let me know if it seems adequate.
Thanks for the changes!
Thanks for the change, @kfaraz
Description
The current default `batchAllocationWaitTime` is 500 ms. This means that the `SegmentAllocationQueue` waits for 500 ms before a batch is processed, to allow it to accumulate more requests.
In some cases, particularly MSQ tasks, this adds an unnecessary penalty to the task duration.
Changes
With batch segment allocation enabled and `batchAllocationWaitTime=0`, the segment allocation queue becomes more adaptive and exhibits the following behaviour:
- A segment allocation request is processed immediately if no other requests are already queued before it.
- Requests that arrive while a batch is being processed wait in the queue and may get clubbed together into the next batch.
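The behaviour above can be sketched with a toy model (this is an illustration of the zero-wait semantics, not Druid's actual `SegmentAllocationQueue` implementation):

```python
class AllocationQueue:
    """Toy model of segment allocation with batchAllocationWaitTime=0:
    an idle queue processes a request immediately, while requests that
    arrive during an in-flight batch are clubbed into the next batch."""

    def __init__(self):
        self.pending = []   # requests waiting for the next batch
        self.busy = False   # True while a batch is being processed
        self.batches = []   # record of processed batches, for inspection

    def submit(self, request):
        self.pending.append(request)
        if not self.busy:
            # Zero wait time: run the batch as soon as the queue is free.
            self.run_pending_batch()

    def run_pending_batch(self):
        self.busy = True
        batch, self.pending = self.pending, []
        self.batches.append(batch)  # stand-in for the metadata-store call
        self.busy = False

q = AllocationQueue()
q.submit("r1")                   # idle queue: processed immediately
q.busy = True                    # simulate a batch already in flight
q.submit("r2"); q.submit("r3")   # these wait and get clubbed together
q.busy = False
q.run_pending_batch()
print(q.batches)                 # [['r1'], ['r2', 'r3']]
```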
Testing
We have had `batchAllocationWaitTime=0` in several of our production clusters for a couple of months now.
It has particularly helped with MSQ tasks, which don't really benefit from batching as segments are allocated serially.
Waiting 500 ms for every allocation increased MSQ task durations, leading to unnecessary compute costs.
Release Note
Update default value of `druid.indexer.tasklock.batchAllocationWaitTime` to `0`.
Thus, a segment allocation request is processed immediately unless there are already some requests queued before this one. While in queue, a segment allocation request may get clubbed together with other similar requests into a batch to reduce load on the metadata store.
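The "clubbing" of similar requests can be pictured as grouping queued requests that target the same datasource and interval, so each batch needs only one metadata-store round trip. A minimal sketch (field names here are illustrative, not Druid's actual API):

```python
from collections import defaultdict

def club_requests(queued):
    """Group queued allocation requests by (datasource, interval) so
    that similar requests are served by a single batch."""
    batches = defaultdict(list)
    for req in queued:
        batches[(req["datasource"], req["interval"])].append(req)
    return list(batches.values())

queued = [
    {"datasource": "wiki", "interval": "2024-01-01/P1D", "task": "t1"},
    {"datasource": "wiki", "interval": "2024-01-01/P1D", "task": "t2"},
    {"datasource": "logs", "interval": "2024-01-01/P1D", "task": "t3"},
]
print(len(club_requests(queued)))  # 2: the two "wiki" requests are clubbed
```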