From 22d59ea3d626cd1ed5b00c3923e28584bab1b436 Mon Sep 17 00:00:00 2001
From: Kashif Faraz
Date: Mon, 10 Jun 2024 08:53:13 +0530
Subject: [PATCH 1/2] Set default segment batchAllocationWaitTime=0

---
 docs/configuration/index.md                                   | 4 ++--
 .../apache/druid/indexing/overlord/config/TaskLockConfig.java | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index f8583b958411..e750f477972b 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -1114,13 +1114,13 @@ These Overlord static configurations can be defined in the `overlord/runtime.pro
 ##### Overlord operations
 
 |Property|Description|Default|
-|--------|-----------|-------|
+|--------|-----------|-----|
 |`druid.indexer.runner.type`|Indicates whether tasks should be run locally using `local` or in a distributed environment using `remote`. The recommended option is `httpRemote`, which is similar to `remote` but uses HTTP to interact with Middle Managers instead of ZooKeeper.|`httpRemote`|
 |`druid.indexer.storage.type`|Indicates whether incoming tasks should be stored locally (in heap) or in metadata storage. One of `local` or `metadata`. `local` is mainly for internal testing while `metadata` is recommended in production because storing incoming tasks in metadata storage allows for tasks to be resumed if the Overlord should fail.|`local`|
 |`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`|
 |`druid.indexer.tasklock.forceTimeChunkLock`|_**Setting this to false is still experimental**_ If set, all tasks are enforced to use time chunk lock. If not set, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the [task context](../ingestion/tasks.md#context). See [Task Locking & Priority](../ingestion/tasks.md#context) for more details about locking in tasks.|true|
 |`druid.indexer.tasklock.batchSegmentAllocation`| If set to true, Druid performs segment allocate actions in batches to improve throughput and reduce the average `task/action/run/time`. See [batching `segmentAllocate` actions](../ingestion/tasks.md#batching-segmentallocate-actions) for details.|true|
-|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|500|
+|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|0|
 |`druid.indexer.task.default.context`|Default task context that is applied to all tasks submitted to the Overlord. Any default in this config does not override neither the context values the user provides nor `druid.indexer.tasklock.forceTimeChunkLock`.|empty context|
 |`druid.indexer.queue.maxSize`|Maximum number of active tasks at one time.|`Integer.MAX_VALUE`|
 |`druid.indexer.queue.startDelay`|Sleep this long before starting Overlord queue management. This can be useful to give a cluster time to re-orient itself (for example, after a widespread network issue).|`PT1M`|
diff --git a/indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/TaskLockConfig.java b/indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/TaskLockConfig.java
index e750da6c1358..2634c4328fec 100644
--- a/indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/TaskLockConfig.java
+++ b/indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/TaskLockConfig.java
@@ -34,7 +34,7 @@ public class TaskLockConfig
   private boolean batchSegmentAllocation = true;
 
   @JsonProperty
-  private long batchAllocationWaitTime = 500L;
+  private long batchAllocationWaitTime = 0L;
 
   public boolean isForceTimeChunkLock()
   {

From db55df1d921bee8645635a0c9072258d7c893175 Mon Sep 17 00:00:00 2001
From: Kashif Faraz
Date: Mon, 10 Jun 2024 09:02:24 +0530
Subject: [PATCH 2/2] Update docs/configuration/index.md

---
 docs/configuration/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index e750f477972b..1976657c41ec 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -1114,7 +1114,7 @@ These Overlord static configurations can be defined in the `overlord/runtime.pro
 ##### Overlord operations
 
 |Property|Description|Default|
-|--------|-----------|-----|
+|--------|-----------|-------|
 |`druid.indexer.runner.type`|Indicates whether tasks should be run locally using `local` or in a distributed environment using `remote`. The recommended option is `httpRemote`, which is similar to `remote` but uses HTTP to interact with Middle Managers instead of ZooKeeper.|`httpRemote`|
 |`druid.indexer.storage.type`|Indicates whether incoming tasks should be stored locally (in heap) or in metadata storage. One of `local` or `metadata`. `local` is mainly for internal testing while `metadata` is recommended in production because storing incoming tasks in metadata storage allows for tasks to be resumed if the Overlord should fail.|`local`|
 |`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`|
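
Note for operators: with this series applied, `batchAllocationWaitTime` defaults to 0 instead of 500. A minimal sketch of pinning the previous value explicitly, assuming the two properties are set in the Overlord runtime properties alongside the other settings in the table above (the value 500 is simply the old default shown in the diff, not a tuning recommendation):

```properties
# Keep batched segment allocation enabled (already the default per the docs table).
druid.indexer.tasklock.batchSegmentAllocation=true
# Restore the pre-patch wait of 500 ms before a batch executes.
druid.indexer.tasklock.batchAllocationWaitTime=500
```

Per the property description, the wait time takes effect only while `batchSegmentAllocation` is enabled, so the second line has no effect if the first is set to false.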