
Merge core CoordinatorClient with MSQ CoordinatorServiceClient. #14652

Merged
gianm merged 6 commits into apache:master from rpc-coordinator-client on Jul 27, 2023

Conversation

@gianm gianm (Contributor) commented Jul 25, 2023

Continuing the work from #12696, this patch merges the MSQ CoordinatorServiceClient into the core CoordinatorClient, yielding a single interface that serves both needs and is based on the ServiceClient RPC system rather than DruidLeaderClient.

Release notes: Also removes the backwards-compatibility code for the handoff API in CoordinatorBasedSegmentHandoffNotifier, because the new API was added in 0.14.0. That's long enough ago that we don't need backwards compatibility for rolling updates.
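For context, the calling pattern against the merged client looks roughly like the sketch below. Only fetchUsedSegments and FutureUtils.getUnchecked are taken from this PR's diff; the import paths, the List<DataSegment> return type, and the wrapper method are assumptions added for illustration.

    // Illustrative sketch, not code from this patch: callers use the merged,
    // ServiceClient-based CoordinatorClient and resolve the returned future
    // with FutureUtils, instead of issuing requests through DruidLeaderClient.
    import java.util.Collections;
    import java.util.List;

    import org.apache.druid.client.coordinator.CoordinatorClient;
    import org.apache.druid.common.guava.FutureUtils;
    import org.apache.druid.timeline.DataSegment;
    import org.joda.time.Interval;

    public class UsedSegmentsExample
    {
      // Assumed return type; the exact signature may differ.
      public static List<DataSegment> fetchUsed(
          final CoordinatorClient coordinatorClient,
          final String dataSource,
          final Interval interval
      )
      {
        return FutureUtils.getUnchecked(
            coordinatorClient.fetchUsedSegments(dataSource, Collections.singletonList(interval)),
            true  // cancel the in-flight request if this wait is interrupted
        );
      }
    }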

@kfaraz kfaraz (Contributor) left a comment

Looks good, minor queries.

@@ -123,7 +107,7 @@ void checkForSegmentHandoffs()
       }
     }
     if (!handOffCallbacks.isEmpty()) {
-      log.warn("Still waiting for Handoff for [%d] Segments", handOffCallbacks.size());
+      log.info("Still waiting for handoff for [%d] segments", handOffCallbacks.size());
kfaraz (Contributor) commented:

Nit: "handoff of x segments"?

kfaraz (Contributor) commented:

Should we raise an alert if handoff wait time exceeds a threshold?

@gianm gianm (Contributor, Author) replied on Jul 25, 2023:

Nit: "handoff of x segments"?

Sure, that sounds nicer to me as well. I will change it if there is some other reason to make changes (like if the CI fails).

> Should we raise an alert if handoff wait time exceeds a threshold?

There's already an alert raised if the handoffConditionTimeout is exceeded, which post #14539 would default to 15 mins.
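(For readers following along: the existing behavior referenced here is, schematically, a timeout check of the following shape. This is a sketch, not the actual CoordinatorBasedSegmentHandoffNotifier code; the method, field, and parameter names are made up for illustration.)

    // Schematic sketch only: emit an alert once the handoff wait exceeds the
    // configured handoffConditionTimeout. Names are hypothetical; the real
    // check lives in CoordinatorBasedSegmentHandoffNotifier.
    private void checkHandoffTimeout(final long handoffStartTimeMillis, final long handoffConditionTimeoutMillis)
    {
      final long elapsedMillis = System.currentTimeMillis() - handoffStartTimeMillis;
      if (!handOffCallbacks.isEmpty() && elapsedMillis > handoffConditionTimeoutMillis) {
        // EmittingLogger-style alert, raised when the timeout is exceeded.
        log.makeAlert("Segment handoff wait exceeded handoffConditionTimeout")
           .addData("pendingSegments", handOffCallbacks.size())
           .addData("elapsedMillis", elapsedMillis)
           .emit();
      }
    }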

      usedSegments = FutureUtils.getUnchecked(
          coordinatorClient.fetchUsedSegments(dataSource, Collections.singletonList(interval)),
          true
kfaraz (Contributor) commented:

I wonder if we still need the retry logic in the catch block below. It would be handled by the new CoordinatorClient itself.

gianm (Contributor, Author) replied:

Good point, I'll remove this part.
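(With client-side retries in place, the caller reduces to the plain call from the excerpt above; roughly, as a sketch:)

    // Sketch: no caller-side retry loop around the call, because the
    // ServiceClient-based CoordinatorClient applies its own retry policy
    // to each request.
    usedSegments = FutureUtils.getUnchecked(
        coordinatorClient.fetchUsedSegments(dataSource, Collections.singletonList(interval)),
        true  // cancel the in-flight request if this wait is interrupted
    );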

* Retry policy that uses up to about an hour of total wait time. Note that this is just the total waiting time
* between attempts. It does not include the time that each attempt takes to execute.
*/
public static StandardRetryPolicy aboutAnHour()
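(As a back-of-the-envelope check on the "about an hour" figure: with an exponential backoff that starts at 100 ms, doubles on each attempt, and caps at 30 seconds, the cumulative wait between attempts passes one hour after roughly 128 attempts. The parameters below are illustrative only, not the actual StandardRetryPolicy defaults.)

    // Illustrative arithmetic only: count how many attempts an exponential
    // backoff (100 ms start, doubling, capped at 30 s) allows before the
    // cumulative wait between attempts reaches one hour.
    public class RetryBudgetSketch
    {
      public static void main(String[] args)
      {
        final long capMillis = 30_000;
        final long budgetMillis = 60L * 60 * 1000;  // one hour of total waiting
        long sleepMillis = 100;
        long totalWaitMillis = 0;
        int attempts = 0;

        while (totalWaitMillis < budgetMillis) {
          totalWaitMillis += sleepMillis;
          sleepMillis = Math.min(sleepMillis * 2, capMillis);
          attempts++;
        }

        // Prints roughly 128 attempts for these example parameters.
        System.out.println("attempts=" + attempts + ", totalWaitMillis=" + totalWaitMillis);
      }
    }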
@kfaraz kfaraz (Contributor) commented on Jul 27, 2023:

Nit: Would StandardRetryPolicy.retryUptoAnHour() or StandardRetryPolicy.retryForAnHour() communicate the intent better?

The "about" part could be left out as the approximate nature of the backoffs is probably a given. Although, I don't feel strongly about it.

gianm (Contributor, Author) replied:

Hmm, to me they seem similar, so I'm inclined to merge the patch rather than have CI run again 🙂

Saving the planet by doing fewer CI runs!

kfaraz (Contributor) replied:

Haha, works for me. 🌲

gianm merged commit 986a271 into apache:master on Jul 27, 2023.
gianm deleted the rpc-coordinator-client branch on Jul 27, 2023 at 20:23.
LakshSingla added this to the 28.0 milestone on Oct 12, 2023.