Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update default value of druid.indexer.tasklock.batchAllocationWaitTime to zero #16578

Merged
merged 2 commits into from
Jun 10, 2024

Conversation

kfaraz
Copy link
Contributor

@kfaraz kfaraz commented Jun 10, 2024

Description

The current default batchAllocationWaitTime is 500ms. This means that the SegmentAllocationQueue
waits for 500ms before a batch is processed to allow it to accumulate more requests.
In some cases, particularly MSQ tasks, this adds an unnecessary penalty to the task duration.

Changes

With batch segment allocation enabled and batchAllocationWaitTime=0,
the segment allocation queue becomes more adaptive and exhibits the following behaviour:

  • Process the first segment allocation request added to the queue immediately.
  • While the first request is being processed, let any subsequent requests accumulate in the queue.
  • Once the first request is processed, pick up the next batch in the queue for processing.

Testing

We have had batchAllocationWaitTime=0 in several of our production clusters for a couple of months now.
It has particularly helped with MSQ tasks, which don't really benefit from batching as segments are allocated serially.
Waiting 500ms for every allocation increased MSQ task durations leading to unnecessary compute costs.

Release Note

Update default value of druid.indexer.tasklock.batchAllocationWaitTime to 0.
Thus, a segment allocation request is processed immediately unless there are already some requests queued before this one. While in queue, a segment allocation request may get clubbed together with other similar requests into a batch to reduce load on the metadata store.

docs/configuration/index.md Outdated Show resolved Hide resolved
@kfaraz kfaraz changed the title Set default value of druid.indexer.tasklock.batchAllocationWaitTime to zero Update default value of druid.indexer.tasklock.batchAllocationWaitTime to zero Jun 10, 2024
@kfaraz kfaraz requested a review from cryptoe June 10, 2024 03:32
Copy link
Contributor

@LakshSingla LakshSingla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM! Can you please complete the release notes in the P description.

@@ -1120,7 +1120,7 @@ These Overlord static configurations can be defined in the `overlord/runtime.pro
|`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`|
|`druid.indexer.tasklock.forceTimeChunkLock`|_**Setting this to false is still experimental**_<br/> If set, all tasks are enforced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the [task context](../ingestion/tasks.md#context). See [Task Locking & Priority](../ingestion/tasks.md#context) for more details about locking in tasks.|true|
|`druid.indexer.tasklock.batchSegmentAllocation`| If set to true, Druid performs segment allocate actions in batches to improve throughput and reduce the average `task/action/run/time`. See [batching `segmentAllocate` actions](../ingestion/tasks.md#batching-segmentallocate-actions) for details.|true|
|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|500|
|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|0|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we specify the behavior when this value is set to 0, as done in the description of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is self-explanatory, a value of zero means no waiting unless something else is already being processed. If you still feel it warrants clarification, I can add a line to this effect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

@kfaraz
Copy link
Contributor Author

kfaraz commented Jun 10, 2024

Thanks for the review, @LakshSingla ! I have updated the release note. Please let me know if it seems adequate.

@LakshSingla
Copy link
Contributor

Thanks for the changes!

@LakshSingla LakshSingla merged commit e4fdf10 into apache:master Jun 10, 2024
87 checks passed
@cryptoe
Copy link
Contributor

cryptoe commented Jun 10, 2024

Thanks for the change @kfaraz

@kfaraz kfaraz deleted the zero_default_batch_alloc_wait_time branch July 2, 2024 10:20
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants