Merge core CoordinatorClient with MSQ CoordinatorServiceClient. #14652

gianm · 2023-07-25T04:26:12Z

Continuing the work from #12696, this patch merges the MSQ CoordinatorServiceClient into the core CoordinatorClient, yielding a single interface that serves both needs and is based on the ServiceClient RPC system rather than DruidLeaderClient.

Release notes: This patch also removes the backwards-compatibility code for the handoff API in CoordinatorBasedSegmentHandoffNotifier, because the new API was added in 0.14.0. That is long enough ago that we no longer need backwards compatibility for rolling updates.

kfaraz

Looks good, minor queries.

kfaraz · 2023-07-25T13:39:41Z

...r/src/main/java/org/apache/druid/segment/handoff/CoordinatorBasedSegmentHandoffNotifier.java

@@ -123,7 +107,7 @@ void checkForSegmentHandoffs()
        }
      }
      if (!handOffCallbacks.isEmpty()) {
-        log.warn("Still waiting for Handoff for [%d] Segments", handOffCallbacks.size());
+        log.info("Still waiting for handoff for [%d] segments", handOffCallbacks.size());


Nit: "handoff of x segments"?

Should we raise an alert if handoff wait time exceeds a threshold?

> Nit: "handoff of x segments"?

Sure, that sounds nicer to me as well. I will change it if there is some other reason to make changes (like if the CI fails).

> Should we raise an alert if handoff wait time exceeds a threshold?

There's already an alert raised if the handoffConditionTimeout is exceeded, which post #14539 would default to 15 mins.
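To make the escalation discussed in this thread concrete, here is a minimal sketch of a periodic handoff check that logs at info level while waiting and raises an alert once a timeout is exceeded. This is not the Druid implementation: the class `HandoffWaitSketch`, its `check` method, the message prefixes, and the choice that a zero timeout means "wait indefinitely" are all assumptions made for illustration. Only the log wording and the 15-minute figure come from the conversation above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Druid code: decides on each periodic check whether to
 * keep waiting for handoff (info message) or to give up and alert (timeout exceeded).
 */
final class HandoffWaitSketch
{
  // Stands in for a handoffConditionTimeout-style setting. In this sketch,
  // a zero duration means "no timeout" (an assumption of the example).
  private final Duration timeout;

  // Messages are collected in a list instead of going to a real logger or alerting system.
  final List<String> messages = new ArrayList<>();

  HandoffWaitSketch(Duration timeout)
  {
    this.timeout = timeout;
  }

  /** Returns true if the caller should stop waiting, false if it should check again later. */
  boolean check(int pendingSegments, Duration waited)
  {
    if (pendingSegments == 0) {
      return true; // everything has been handed off
    }
    if (!timeout.isZero() && waited.compareTo(timeout) > 0) {
      messages.add(String.format("ALERT: handoff of [%d] segments timed out after [%s]", pendingSegments, waited));
      return true;
    }
    messages.add(String.format("INFO: Still waiting for handoff of [%d] segments", pendingSegments));
    return false;
  }

  public static void main(String[] args)
  {
    HandoffWaitSketch sketch = new HandoffWaitSketch(Duration.ofMinutes(15));
    sketch.check(3, Duration.ofMinutes(1));   // within the timeout: info message
    sketch.check(3, Duration.ofMinutes(16));  // past the timeout: alert message
    sketch.messages.forEach(System.out::println);
  }
}
```

With the alert reserved for the timeout case, the routine "still waiting" message can reasonably sit at info level, which matches the direction of the diff in this thread.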

kfaraz · 2023-07-26T08:08:02Z

indexing-service/src/main/java/org/apache/druid/indexing/input/DruidInputSource.java

-            Collections.singletonList(interval)
+        usedSegments = FutureUtils.getUnchecked(
+            coordinatorClient.fetchUsedSegments(dataSource, Collections.singletonList(interval)),
+            true


I wonder if we still need the retry logic in the catch block below. It would be handled by the new CoordinatorClient itself.

Good point, I'll remove this part.
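For readers following this thread, the sketch below illustrates why a caller-side retry loop becomes redundant once the client retries internally and the caller only unwraps the resulting future. It uses `CompletableFuture` from the JDK as a stand-in so that it is self-contained. `ClientRetrySketch`, `callWithRetries`, and this `getUnchecked` are hypothetical analogs written from the call shape in the diff, not copies of `CoordinatorClient` or `FutureUtils`, and their exact exception-wrapping behavior is an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: retries live inside the client call, the caller just unwraps the future. */
final class ClientRetrySketch
{
  /** Stand-in for an async client call that retries internally, up to maxAttempts attempts. */
  static <T> CompletableFuture<T> callWithRetries(Supplier<T> attempt, int maxAttempts)
  {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    return CompletableFuture.supplyAsync(() -> {
      RuntimeException last = null;
      for (int i = 0; i < maxAttempts; i++) {
        try {
          return attempt.get();
        }
        catch (RuntimeException e) {
          last = e; // a real client would back off here before the next attempt
        }
      }
      throw last; // all attempts failed
    });
  }

  /** Blocks for the result and converts checked exceptions into unchecked ones. */
  static <T> T getUnchecked(Future<T> future, boolean cancelIfInterrupted)
  {
    try {
      return future.get();
    }
    catch (InterruptedException e) {
      if (cancelIfInterrupted) {
        future.cancel(true);
      }
      Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
      throw new RuntimeException(e);
    }
    catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }

  public static void main(String[] args)
  {
    AtomicInteger calls = new AtomicInteger();
    // Fails twice, then succeeds. The caller needs no try/catch retry loop of its own.
    CompletableFuture<String> future = callWithRetries(() -> {
      if (calls.incrementAndGet() < 3) {
        throw new IllegalStateException("transient failure");
      }
      return "used segments";
    }, 5);
    System.out.println(getUnchecked(future, true) + " after " + calls.get() + " attempts");
  }
}
```

In this arrangement the caller sees either a value or a single unchecked exception once the client has exhausted its own attempts, so wrapping the call in another retry loop would only multiply the number of attempts.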

kfaraz · 2023-07-27T13:43:19Z

server/src/main/java/org/apache/druid/rpc/StandardRetryPolicy.java

+   * Retry policy that uses up to about an hour of total wait time. Note that this is just the total waiting time
+   * between attempts. It does not include the time that each attempt takes to execute.
+   */
+  public static StandardRetryPolicy aboutAnHour()


Nit: Would StandardRetryPolicy.retryUptoAnHour() or StandardRetryPolicy.retryForAnHour() communicate the intent better?

The "about" part could be left out as the approximate nature of the backoffs is probably a given. Although, I don't feel strongly about it.

Hmm, to me they seem similar, so I'm inclined to merge the patch rather than have CI run again 🙂

Saving the planet by doing fewer CI runs!

Haha, works for me. 🌲
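As a side note on what "about an hour of total wait time" can mean in practice, the sketch below adds up the waits between attempts for a capped exponential backoff. The 100 ms minimum wait, 30 s maximum wait, doubling schedule, and absence of jitter are assumptions chosen for illustration, and `BackoffBudgetSketch` is a hypothetical class. None of this is taken from `StandardRetryPolicy`. As the Javadoc in the diff notes, the total covers only waiting between attempts, not the time each attempt takes to execute.

```java
/** Hypothetical sketch: total wait time of a capped exponential backoff schedule. */
final class BackoffBudgetSketch
{
  /** Wait before retry number `retry` (0-based): minWait * 2^retry, capped at maxWait. */
  static long backoffMillis(int retry, long minWaitMillis, long maxWaitMillis)
  {
    // Cap the exponent so the shift cannot overflow for large retry counts.
    long factor = 1L << Math.min(retry, 30);
    return Math.min(maxWaitMillis, minWaitMillis * factor);
  }

  /** Sum of the waits between attempts when up to maxAttempts attempts are made. */
  static long totalWaitMillis(int maxAttempts, long minWaitMillis, long maxWaitMillis)
  {
    long total = 0;
    // maxAttempts attempts are separated by maxAttempts - 1 waits.
    for (int retry = 0; retry < maxAttempts - 1; retry++) {
      total += backoffMillis(retry, minWaitMillis, maxWaitMillis);
    }
    return total;
  }

  /** Smallest attempt count whose total wait reaches targetMillis. */
  static int attemptsFor(long targetMillis, long minWaitMillis, long maxWaitMillis)
  {
    if (minWaitMillis <= 0 || maxWaitMillis <= 0) {
      throw new IllegalArgumentException("waits must be positive");
    }
    int attempts = 1;
    while (totalWaitMillis(attempts, minWaitMillis, maxWaitMillis) < targetMillis) {
      attempts++;
    }
    return attempts;
  }

  public static void main(String[] args)
  {
    long hourMillis = 60L * 60L * 1000L;
    int attempts = attemptsFor(hourMillis, 100L, 30_000L);
    System.out.println("attempts: " + attempts);
    System.out.println("total wait ms: " + totalWaitMillis(attempts, 100L, 30_000L));
  }
}
```

Under these assumed values the schedule reaches the 30 s cap after nine waits, so nearly all of the hour is spent in capped waits and it takes 129 attempts to accumulate at least one hour of waiting. This supports the point made in the thread that the figure is inherently approximate.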

gianm added the Release Notes label Jul 25, 2023

gianm added 2 commits July 25, 2023 00:46

Fixups.

b873466

Trigger GHA.

ff62ebf

kfaraz approved these changes Jul 25, 2023

kfaraz reviewed Jul 26, 2023

gianm added 3 commits July 26, 2023 13:01

Remove unnecessary retrying in DruidInputSource. Add "about an hour"

25d3df0

retry policy and h

Merge branch 'master' into rpc-coordinator-client

6041dc6

EasyMock

6a0130e

kfaraz reviewed Jul 27, 2023

gianm merged commit 986a271 into apache:master Jul 27, 2023

gianm deleted the rpc-coordinator-client branch July 27, 2023 20:23

LakshSingla added this to the 28.0 milestone Oct 12, 2023

LakshSingla mentioned this pull request Nov 4, 2023

[DRAFT] 28.0.0 release notes #15326

Closed
