-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add query context parameter for segment load wait #15076
Add query context parameter for segment load wait #15076
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments.
Overall LGTM.
docs/multi-stage-query/reference.md
Outdated
@@ -246,6 +246,7 @@ The following table lists the context parameters for the MSQ task engine: | |||
| `durableShuffleStorage` | SELECT, INSERT, REPLACE <br /><br />Whether to use durable storage for shuffle mesh. To use this feature, configure the durable storage at the server level using `druid.msq.intermediate.storage.enable=true`). If these properties are not configured, any query with the context variable `durableShuffleStorage=true` fails with a configuration error. <br /><br /> | `false` | | |||
| `faultTolerance` | SELECT, INSERT, REPLACE<br /><br /> Whether to turn on fault tolerance mode or not. Failed workers are retried based on [Limits](#limits). Cannot be used when `durableShuffleStorage` is explicitly set to false. | `false` | | |||
| `selectDestination` | SELECT<br /><br /> Controls where the final result of the select query is written. <br />Use `taskReport`(the default) to write select results to the task report. <b> This is not scalable since task reports size explodes for large results </b> <br/>Use `durableStorage` to write results to durable storage location. <b>For large results sets, its recommended to use `durableStorage` </b>. To configure durable storage see [`this`](#durable-storage) section. | `taskReport` | | |||
| `segmentLoadWait` | INSERT, REPLACE<br /><br /> Whether the controller should wait for segments to be loaded before exiting. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query. | `false` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
segmentHandedOff ? How does this name sound ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or waitTillSegmentsLoad
@@ -159,29 +159,18 @@ public void waitForSegmentsToLoad() | |||
return; | |||
} | |||
|
|||
Iterator<String> iterator = versionsToAwait.iterator(); | |||
log.info( | |||
log.debug( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log.info this every minute ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And update the log message to say that even if the task is cancelled, the segments will continue to load.
{ | ||
Request request = brokerClient.makeRequest(HttpMethod.POST, "/druid/v2/sql/"); | ||
SqlQuery sqlQuery = new SqlQuery(StringUtils.format(LOAD_QUERY, datasource, version), | ||
SqlQuery sqlQuery = new SqlQuery(StringUtils.format(LOAD_QUERY, datasource, versionsInClauseString), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While running ut's I also saw
2023-10-05T11:21:28,837 WARN [query-2cd60052-3011-47ae-87be-ec47100f649d-segment-load-waiter-0] org.apache.druid.msq.exec.SegmentLoadStatusFetcher - Exception occurred while waiting for segments to load. Exiting.
java.lang.NullPointerException: null
at org.apache.druid.msq.exec.SegmentLoadStatusFetcher.fetchLoadStatusForVersion(SegmentLoadStatusFetcher.java:262) ~[classes/:?]
at org.apache.druid.msq.exec.SegmentLoadStatusFetcher.lambda$waitForSegmentsToLoad$0(SegmentLoadStatusFetcher.java:174) ~[classes/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) ~[guava-31.1-jre.jar:?]
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74) ~[guava-31.1-jre.jar:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) ~[guava-31.1-jre.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
2023-10-05T11:21:28,846 INFO [main] org.apache.druid.msq.exec.WorkerImpl - Stopping gracefully for taskId [query-2cd60052-3011-47ae-87be-ec47100f649d-worker0_0]
2023-10-05T11:21:28,848 INFO [main] org.apache.druid.msq.test.MSQTestBase - found generated segments: DataSegment{binaryVersion=9, id=foo1_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_test, loadSpec={type=>local, path=>/var/folders/sx/n0v16tnj6mvf6hd14l6j6dth0000gn/T/junit8066366800031570368/localsegments/foo1/-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z/test/0/index/}, dimensions=[dim_mv], metrics=[], shardSpec=NumberedShardSpec{partitionNum=0, partitions=100}, lastCompactionState=null, size=1037}
2023-10-05T11:21:28,859 INFO [main] org.apache.druid.msq.test.MSQTestBase - Found spec: {
"query" : {
Lets fix it in this PR as well.
docs/multi-stage-query/reference.md
Outdated
@@ -246,6 +246,7 @@ The following table lists the context parameters for the MSQ task engine: | |||
| `durableShuffleStorage` | SELECT, INSERT, REPLACE <br /><br />Whether to use durable storage for shuffle mesh. To use this feature, configure the durable storage at the server level using `druid.msq.intermediate.storage.enable=true`). If these properties are not configured, any query with the context variable `durableShuffleStorage=true` fails with a configuration error. <br /><br /> | `false` | | |||
| `faultTolerance` | SELECT, INSERT, REPLACE<br /><br /> Whether to turn on fault tolerance mode or not. Failed workers are retried based on [Limits](#limits). Cannot be used when `durableShuffleStorage` is explicitly set to false. | `false` | | |||
| `selectDestination` | SELECT<br /><br /> Controls where the final result of the select query is written. <br />Use `taskReport`(the default) to write select results to the task report. <b> This is not scalable since task reports size explodes for large results </b> <br/>Use `durableStorage` to write results to durable storage location. <b>For large results sets, its recommended to use `durableStorage` </b>. To configure durable storage see [`this`](#durable-storage) section. | `taskReport` | | |||
| `segmentLoadWait` | INSERT, REPLACE<br /><br /> Whether the controller should wait for segments to be loaded before exiting. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query. | `false` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems very geared toward developers, and not the users. We can remove the fact that it queries the broker (since that is an implementation detail).
WDYT of something along the lines:
| `segmentLoadWait` | INSERT, REPLACE<br /><br /> Whether the controller should wait for segments to be loaded before exiting. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query. | `false` | | |
| `segmentLoadWait` | INSERT, REPLACE<br /><br /> If set, the ingest query waits for the generated segment to be loaded before exiting, else the ingest query exits without waiting. The task and live reports contain the information (about ??) if this flag is set. The drawback is that the tasks stall till the segments are loaded, however ensures that... (advantage) | `false` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM!!
This relies on the work done in apache#14322 and apache#15076. It allows the user to set waitTillSegmentsLoad in the query context (if they want, else it defaults to true) and shows the results in the UI :
This relies on the work done in #14322 and #15076. It allows the user to set waitTillSegmentsLoad in the query context (if they want, else it defaults to true) and shows the results in the UI : Co-authored-by: Sébastien <[email protected]>
Add segmentLoadWait as a query context parameter. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query.
This relies on the work done in apache#14322 and apache#15076. It allows the user to set waitTillSegmentsLoad in the query context (if they want, else it defaults to true) and shows the results in the UI :
Add segmentLoadWait as a query context parameter. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query.
This relies on the work done in apache#14322 and apache#15076. It allows the user to set waitTillSegmentsLoad in the query context (if they want, else it defaults to true) and shows the results in the UI :
#14322 added the feature of allowing the waiting for segments to be loaded to the MSQ controller. This adds a context parameter to control if this behaviour should be used. It also optimizes the query to make fewer calls to the broker by clubbing all the versions into the same request.
Notes:
waitTillSegmentsLoad
as a query context parameter. If this is true, the controller queries the broker and waits till the segments created (if any) have been loaded by the load rules. The controller also provides this information in the live reports and task reports. If this is false, the controller exits immediately after finishing the query.This PR has: