Add code to wait for segments generated to be loaded on historicals #14322

adarshsanjeev · 2023-05-22T03:17:36Z

Currently, after an MSQ query, the web console is responsible for waiting for the segments to load. It does so by checking whether any segments are still loading into the datasource that was ingested into. This can cause issues: the console may wait indefinitely when the segments would never be loaded, or it may end up waiting on segments from other, unrelated ingestions as well.

This PR shifts this responsibility from the web console to the controller, which has the list of segments it created.
This PR also introduces a new field in the reports that provides a real-time update on the progress of the segment load wait.

The controller releases its locks before waiting for the segments so that it does not block any other tasks.
If an exception is thrown while waiting, it is only logged and does not fail the entire task.

"segmentLoadStatus": {
  "state": "SUCCESS",
  "dataSource": "kttm_simple",
  "startTime": "2022-09-14T23:12:09.266Z",
  "duration": 15,
  "totalSegments": 1,
  "usedSegments": 1,
  "precachedSegments": 0,
  "onDemandSegments": 0,
  "pendingSegments": 0,
  "unknownSegments": 0
}

server/src/test/java/org/apache/druid/discovery/BrokerClientTest.java

+    Injector injector = Initialization.makeInjectorWithModules(
+        GuiceInjectors.makeStartupInjector(), ImmutableList.<Module>of(
+            binder -> {
+              JsonConfigProvider.bindInstance(
+                  binder,
+                  Key.get(DruidNode.class, Self.class),
+                  node
+              );
+              binder.bind(Integer.class).annotatedWith(Names.named("port")).toInstance(node.getPlaintextPort());
+              binder.bind(JettyServerInitializer.class).to(DruidLeaderClientTest.TestJettyServerInitializer.class).in(
+                  LazySingleton.class);
+              Jerseys.addResource(binder, DruidLeaderClientTest.SimpleResource.class);
+              LifecycleModule.register(binder, Server.class);
+            }
+        )
+    );


LakshSingla

Commented a few line items on the PR. Also, can you please use DruidException class for errors that might get surfaced to the user, either due to their fault or ours? (i.e. all the exceptions that aren't caught and rethrown).

LakshSingla · 2023-08-10T04:31:39Z

server/src/main/java/org/apache/druid/discovery/BrokerClient.java

+  {
+    StringFullResponseHandler responseHandler = new StringFullResponseHandler(StandardCharsets.UTF_8);
+
+    for (int counter = 0; counter < MAX_RETRIES; counter++) {


I think we should wait for a while before retrying. This would prevent bombarding the Broker with requests in a short span of time, and also allow any transient failures to auto-resolve before sending another request. We should also have a back-off strategy here.

Consider refactoring it to RetryUtils.retry which does it for us.
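To make the suggestion concrete, here is a minimal, self-contained sketch of retrying with exponential back-off. It does not use or reproduce Druid's `RetryUtils`; the class `BackoffRetry`, its method names, and the choice to treat only `IOException` as transient are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Druid: retries a call with exponential back-off. */
final class BackoffRetry
{
  /** Delay before retry number {@code attempt} (0-based): doubles each time, capped at maxMillis. */
  static long delayMillis(int attempt, long baseMillis, long maxMillis)
  {
    // Cap the exponent so the shift stays small even for large attempt counts.
    long factor = 1L << Math.min(attempt, 20);
    return Math.min(maxMillis, baseMillis * factor);
  }

  static <T> T retry(Callable<T> call, int maxTries, long baseMillis, long maxMillis) throws Exception
  {
    if (maxTries < 1) {
      throw new IllegalArgumentException("maxTries must be at least 1");
    }
    IOException lastFailure = null;
    for (int attempt = 0; attempt < maxTries; attempt++) {
      try {
        return call.call();
      }
      catch (IOException e) {
        // Assumption: IOException is transient. Remember it and back off before the next try.
        lastFailure = e;
        if (attempt < maxTries - 1) {
          Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
        }
      }
    }
    // All tries are exhausted: surface the last failure once, not once per attempt.
    throw lastFailure;
  }
}
```

Any exception other than `IOException` propagates immediately in this sketch, which keeps non-transient errors from being retried.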

LakshSingla · 2023-08-10T04:34:54Z

server/src/main/java/org/apache/druid/discovery/BrokerClient.java

+      }
+      catch (IOException | ChannelException ex) {
+        // can happen if the node is stopped.
+        log.warn(ex, "Request [%s] failed.", request.getUrl());


This would log it after each retry. Since the retries happen in a short span, there's a high likelihood that we would be posting the same stack trace over and over. This should log once, after all the retries are exhausted. If you refactor it to RetryUtils, I think it also handles that for you.

LakshSingla · 2023-08-10T04:40:13Z

server/src/main/java/org/apache/druid/discovery/ClientUtils.java

+  public static String pickOneHost(DruidNodeDiscovery druidNodeDiscovery)
+  {
+    Iterator<DiscoveryDruidNode> iter = druidNodeDiscovery.getAllNodes().iterator();
+    if (iter.hasNext()) {


Seems to me that this would always pick the first broker node. It would be better to choose a node in a round-robin fashion or at random. Is there any pre-existing code that returns a server at random? This seems like a common use case.
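As an illustration of the random choice suggested here, a minimal sketch follows. It is not Druid code: `RandomPicker` and `pickOne` are hypothetical names, and the sketch works on any collection rather than on `DruidNodeDiscovery`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper, not Druid code: picks one element uniformly at random. */
final class RandomPicker
{
  static <T> Optional<T> pickOne(Collection<T> candidates)
  {
    // Copy first so the size check and the index lookup see the same elements.
    List<T> snapshot = new ArrayList<>(candidates);
    if (snapshot.isEmpty()) {
      return Optional.empty();
    }
    int index = ThreadLocalRandom.current().nextInt(snapshot.size());
    // Optional.of rejects null, so this sketch assumes the collection has no null elements.
    return Optional.of(snapshot.get(index));
  }
}
```

Returning an `Optional` leaves the "no broker available" case to the caller instead of hiding it behind a null.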

cryptoe

Thanks for the PR. Left some comments.

cryptoe · 2023-08-16T17:01:52Z

docs/api-reference/sql-ingestion-api.md

@@ -257,7 +257,15 @@ The response shows an example report for a query.
        "startTime": "2022-09-14T22:12:09.266Z",
        "durationMs": 28227,
        "pendingTasks": 0,
-        "runningTasks": 2
+        "runningTasks": 2,
+        "segmentLoadWaiterStatus": {


nit: segmentLoadStatus?
What is this start time?
How would segments that match a drop rule get communicated to the console?

startTime is the time at which we started checking the load status. That with duration would give us a clear idea of when it started and when it ended, and this is the structure for other MSQ stages.

cryptoe · 2023-08-16T17:04:24Z

...nsions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadWaiter.java

+  private static final long SLEEP_DURATION_MILLIS = TimeUnit.SECONDS.toMillis(5);
+  private static final long TIMEOUT_DURATION_MILLIS = TimeUnit.MINUTES.toMillis(10);
+  private static final String LOAD_QUERY = "SELECT COUNT(*) AS totalSegments,\n"
+                                           + "COUNT(*) FILTER (WHERE is_available = 0 AND is_published = 1 AND replication_factor != 0) AS loadingSegments\n"


Why is the replication factor filter needed here?

Replication factor is used to filter out cold segments. Even if a cold segment is unavailable, we don't wait for it.
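The filter condition in the quoted query can be restated as a small predicate. This is only a sketch of the logic in the FILTER clause above; `SegmentLoadFilter` and `isStillLoading` are hypothetical names, and the parameters stand in for the `is_available`, `is_published`, and `replication_factor` columns.

```java
/** Hypothetical helper mirroring the FILTER clause quoted above; not Druid code. */
final class SegmentLoadFilter
{
  /**
   * A segment counts as "still loading" only if it is published, not yet available,
   * and has a non-zero replication factor (a zero factor marks a cold segment).
   */
  static boolean isStillLoading(long isAvailable, long isPublished, long replicationFactor)
  {
    return isAvailable == 0 && isPublished == 1 && replicationFactor != 0;
  }
}
```

With this predicate, an unavailable cold segment (`replicationFactor == 0`) is never counted, so the wait does not block on it.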

cryptoe · 2023-08-17T10:07:19Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

        performSegmentPublish(
            context.taskActionClient(),
            SegmentTransactionalInsertAction.overwriteAction(null, null, segmentsWithTombstones)
        );
      }
    } else if (!segments.isEmpty()) {
+      Set<String> versionsToAwait = segments.stream().map(DataSegment::getVersion).collect(Collectors.toSet());


There would always be one version, right?
Can we add a check here?

I thought this initially as well, but looking into the code, we actually generate the version based on the lock we acquire for the segment. So if there are multiple intervals we are replacing into and therefore multiple locks, we would have more than one version.
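A small sketch of why the set can hold more than one entry: when segments in different intervals carry different versions, collecting the versions yields several values. The `Segment` class below is a hypothetical stand-in with only the fields needed here, not Druid's `DataSegment`, and the interval and version strings are made up.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class VersionsToAwait
{
  /** Hypothetical stand-in for a segment: just an interval and the version assigned to it. */
  static final class Segment
  {
    final String interval;
    final String version;

    Segment(String interval, String version)
    {
      this.interval = interval;
      this.version = version;
    }

    String getVersion()
    {
      return version;
    }
  }

  /** Collects the distinct versions, in the same way as the stream in the diff above. */
  static Set<String> versionsOf(List<Segment> segments)
  {
    return segments.stream().map(Segment::getVersion).collect(Collectors.toSet());
  }
}
```

Two segments in one interval share a version and collapse to a single entry, while a segment in a second interval with its own version adds another.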

Oh, I did not know that. Thanks for explaining the rationale. We could document this as well :)

cryptoe · 2023-08-17T10:15:39Z

...nsions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadWaiter.java

+   * <br>
+   * Only expected to be called from the main controller thread.
+   */
+  public void waitForSegmentsToLoad()


We would want the experience on the console to be real-time, i.e. like how counters currently work, so that the console can render the segment load progress to the end user.

With this approach, the main thread looks to be blocked until segment loading is complete, and the info is only included in the task report once the call returns?
You could add a method that checks whether segment loading is complete and, if not, gets the loading status and writes it to the task report.
We would want to do this until segment loading is complete.

Discussed offline: the controller thread updates the status periodically without blocking. This status is exposed through liveReports() in the controller. I have tested this as well, and the real-time information is available to the console.

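A minimal sketch of the pattern described in this reply: the waiting side refreshes a shared status on each poll, and the reporting side reads the latest value at any time without blocking. The names `LoadStatusTracker`, `pollOnce`, and `currentStatus`, and the `WAITING` state, are assumptions for illustration and are not taken from `SegmentLoadWaiter`.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.IntSupplier;

/** Hypothetical sketch, not Druid code: a status that one thread updates and another reads. */
final class LoadStatusTracker
{
  enum State { INIT, WAITING, SUCCESS }

  /** Immutable snapshot, so readers never observe a half-updated status. */
  static final class Status
  {
    final State state;
    final int pendingSegments;

    Status(State state, int pendingSegments)
    {
      this.state = state;
      this.pendingSegments = pendingSegments;
    }
  }

  private final AtomicReference<Status> status = new AtomicReference<>(new Status(State.INIT, 0));
  private final IntSupplier pendingCounter;

  /** pendingCounter stands in for whatever query reports how many segments are still loading. */
  LoadStatusTracker(IntSupplier pendingCounter)
  {
    this.pendingCounter = pendingCounter;
  }

  /** One polling step: refreshes the shared status and returns true once nothing is pending. */
  boolean pollOnce()
  {
    int pending = pendingCounter.getAsInt();
    status.set(new Status(pending == 0 ? State.SUCCESS : State.WAITING, pending));
    return pending == 0;
  }

  /** Called by the reporting side at any time; returns the latest snapshot without blocking. */
  Status currentStatus()
  {
    return status.get();
  }
}
```

Because each update replaces the whole snapshot, a live report built from `currentStatus()` always sees a consistent state and count.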
cryptoe

Left some comments. Overall LGTM!

cryptoe · 2023-08-27T13:05:48Z

docs/api-reference/sql-ingestion-api.md

+          "totalSegments": 1,
+          "usedSegments": 1,
+          "precachedSegments": 0,
+          "asyncOnlySegments": 0,


Suggested change

-          "asyncOnlySegments": 0,
+          "onDemandSegments": 0,

Async only seems weird to me since it ties the execution mode to the segment.

cryptoe · 2023-08-27T13:16:44Z

extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java

-        // If successful and there are segments created, segmentLoadWaiter should wait for them to become available.
-        segmentLoadWaiter.waitForSegmentsToLoad();
+    try {
+      final List<TaskLock> locks = context.taskActionClient().submit(new LockListAction());


I guess this can go in a separate method, with exception handling that clearly states that the locks could not be released.

cryptoe · 2023-08-27T13:19:23Z

...nsions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadWaiter.java

  private static final long SLEEP_DURATION_MILLIS = TimeUnit.SECONDS.toMillis(5);
  private static final long TIMEOUT_DURATION_MILLIS = TimeUnit.MINUTES.toMillis(10);
-  private static final String LOAD_QUERY = "SELECT COUNT(*) AS totalSegments,\n"
-                                           + "COUNT(*) FILTER (WHERE is_available = 0 AND is_published = 1 AND replication_factor != 0) AS loadingSegments\n"
+  private static final String LOAD_QUERY = "SELECT COUNT(*) AS usedSegments,\n"


A write-up here of what each replication_factor value means, with a link to #14403, would probably be helpful.

That PR might need an updated description, though.

cryptoe · 2023-09-04T07:25:56Z

...nsions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/SegmentLoadWaiter.java

+
+  public enum State
+  {
+    INIT,


Could you add more dev notes to this enum?

cryptoe · 2023-09-04T07:28:18Z

server/src/main/java/org/apache/druid/discovery/BrokerClient.java

+          HttpResponseStatus responseStatus = fullResponseHolder.getResponse().getStatus();
+          if (HttpResponseStatus.SERVICE_UNAVAILABLE.equals(responseStatus)
+              || HttpResponseStatus.GATEWAY_TIMEOUT.equals(responseStatus)) {
+            throw new IOE(StringUtils.format("Request to broker failed due to failed response status: [%s]", responseStatus));


DruidException defensive exceptions here?

Should these be defensive? These would be thrown if the broker is not reachable which is a condition that can happen without any bugs in the code, right?

…oad (#15000) With PR #14322, MSQ INSERT/REPLACE queries wait for segments to be loaded on the historicals before finishing. That patch introduced a bug: the main thread called Thread.sleep(), which could not be interrupted by cancel calls from the Overlord. This new patch addresses the problem by moving the Thread.sleep() into a thread of its own, so the main thread now waits on the Future of that execution. A cancel call can shut down the executor service through another method, unblocking the main thread so it can proceed.
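The mechanism described in this commit message can be sketched as follows. This is not the code from #15000: `CancellableWait`, `await`, and `cancel` are hypothetical names, and the sleeping task stands in for the real segment-load polling.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical sketch, not Druid code: the sleep runs on its own thread so it can be cancelled. */
final class CancellableWait
{
  private final ExecutorService exec = Executors.newSingleThreadExecutor();
  private volatile Future<?> future;

  /** Blocks until the sleeping task finishes (returns true) or cancel() cuts it short (returns false). */
  boolean await(long sleepMillis) throws InterruptedException
  {
    try {
      // The lambda returns a value, so it is submitted as a Callable and may throw InterruptedException.
      Future<?> submitted = exec.submit(() -> {
        Thread.sleep(sleepMillis);
        return null;
      });
      future = submitted;
      submitted.get();
      return true;
    }
    catch (CancellationException | ExecutionException | RejectedExecutionException e) {
      // Cancelled, interrupted while sleeping, or cancelled before the task was even submitted.
      return false;
    }
    finally {
      exec.shutdownNow();
    }
  }

  /** Called from another thread: interrupts the sleeping task, which unblocks await(). */
  void cancel()
  {
    Future<?> submitted = future;
    if (submitted != null) {
      submitted.cancel(true);
    }
    exec.shutdownNow();
  }
}
```

The caller of `await` never sleeps itself; it only waits on the `Future`, so a `cancel()` from another thread releases it promptly.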

This relies on the work done in #14322 and #15076. It allows the user to set waitTillSegmentsLoad in the query context (optional; it defaults to true) and shows the results in the UI.

Add code to wait for segments generated to be loaded on historicals

2539f86

github-actions bot added the Area - Documentation label May 22, 2023

Add broker client and use it to query broker for segment load status

92b432d

adarshsanjeev changed the title ~~[WIP][Do not merge] Add code to wait for segments generated to be loaded on historicals~~ Add code to wait for segments generated to be loaded on historicals Jun 13, 2023

adarshsanjeev added 3 commits June 14, 2023 10:15

Temp

68210ba

Merge remote-tracking branch 'origin/master' into controller-load-seg…

5059fba

…ments-wait

Update query with replication factor

52aef1d

cryptoe added the Needs web console change Backend API changes that would benefit from frontend support in the web console label Jul 8, 2023

adarshsanjeev added 2 commits July 27, 2023 14:04

Merge remote-tracking branch 'origin/master' into controller-load-seg…

a7de731

…ments-wait

Cleanup code

529a577

github-advanced-security bot found potential problems Jul 27, 2023

adarshsanjeev added 5 commits August 3, 2023 11:42

Merge remote-tracking branch 'origin/master' into controller-load-seg…

650f75b

…ments-wait

Merge remote-tracking branch 'origin/master' into controller-load-seg…

c49f871

…ments-wait

Code cleanup

f19c8f7

Improve coverage

d86ad22

Merge remote-tracking branch 'origin/master' into controller-load-seg…

5f79d68

…ments-wait

github-advanced-security bot found potential problems Aug 9, 2023

Resolve build failures

47da704

LakshSingla reviewed Aug 10, 2023

Address review comments

e2b1787

cryptoe reviewed Aug 17, 2023

adarshsanjeev added 8 commits August 22, 2023 11:26

Merge remote-tracking branch 'origin/master' into controller-load-seg…

b922b2f

…ments-wait

Address review comments

cc5456b

Address review comments

2d9db10

Address review comments

2401c0c

Update names

23c2f89

Update names

c323474

Increase coverage

6b99073

Fix spelling

7675c8c

cryptoe approved these changes Sep 4, 2023

adarshsanjeev added 3 commits September 5, 2023 10:08

Add java docs

f709e71

Update error message

41aa0a5

Merge remote-tracking branch 'origin/master' into controller-load-seg…

ca398b7

…ments-wait

cryptoe reviewed Sep 6, 2023

Update extensions-core/multi-stage-query/src/main/java/org/apache/dru…

7ca9465

…id/msq/exec/SegmentLoadWaiter.java

cryptoe merged commit 959148a into apache:master Sep 6, 2023
12 checks passed

cryptoe added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Sep 6, 2023

LakshSingla mentioned this pull request Sep 12, 2023

Delay reporting MSQ ingest success until segments are loaded #13770

Closed

cryptoe mentioned this pull request Sep 18, 2023

Allow cancellation of MSQ tasks if they are waiting for segments to load #15000

Merged

9 tasks

adarshsanjeev mentioned this pull request Oct 4, 2023

Add query context parameter for segment load wait #15076

Merged

10 tasks

lorem--ipsum mentioned this pull request Oct 9, 2023

Added UI support for waitTillSegmentsLoad #15110

Merged

LakshSingla added this to the 28.0 milestone Oct 12, 2023

LakshSingla mentioned this pull request Nov 4, 2023

[DRAFT] 28.0.0 release notes #15326

Closed

vogievetsky removed the Needs web console change Backend API changes that would benefit from frontend support in the web console label Dec 14, 2023

abhishekrb19 mentioned this pull request Jun 17, 2024

Fix retry logic in BrokerClient and flakey BrokerClientTest #16618

Merged

2 tasks
