Improve the fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes #16679

Merged
merged 8 commits into from
Jul 12, 2024

@LakshSingla LakshSingla commented Jul 1, 2024

Description

A better fallback strategy for when the broker is unable to materialize the subquery's results as frames for estimating the bytes:
a. We don't touch the subquery sequence until we know that we can materialize the result as frames. Otherwise, aggregators holding resources can get closed, and the fallback doesn't work properly. I have added a test case to verify this.
b. Remove the ad-hoc fallback case, which I haven't seen happen. Most queries fall back because insufficient types are present at runtime to generate the query. But if we are unable to materialize the result as bytes for any unforeseen reason, the current fallback path is incorrect. The alternative is to rerun the whole subquery, but that would degrade subquery performance significantly. We should instead throw an exception in that case so that the user can disable maxSubqueryBytes.
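
The ordering described in (a) and (b) can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: every name here (`SubqueryFallbackSketch`, `OneShotRows`, `chooseStrategy`, `estimateBytes`, `materialize`) is hypothetical, the per-value byte sizes are assumed, and `estimateBytes` is only a stand-in for frame conversion. The idea it tries to show is that the strategy is picked from the signature alone, before the single-use source is read, and that an unexpected failure after the read throws instead of falling back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are Druid classes or APIs.
final class SubqueryFallbackSketch
{
  enum ColType { STRING, LONG, DOUBLE, UNKNOWN }

  enum Strategy { BYTE_LIMIT, ROW_LIMIT }

  // Stand-in for a single-use result sequence: it can be read only once.
  static final class OneShotRows
  {
    private final List<Object[]> rows;
    private boolean consumed = false;

    OneShotRows(List<Object[]> rows)
    {
      this.rows = new ArrayList<>(rows);
    }

    List<Object[]> drain()
    {
      if (consumed) {
        throw new IllegalStateException("rows were already consumed");
      }
      consumed = true;
      return rows;
    }
  }

  // Decide from the signature alone, so the rows stay untouched.
  static Strategy chooseStrategy(List<ColType> signature)
  {
    return signature.contains(ColType.UNKNOWN) ? Strategy.ROW_LIMIT : Strategy.BYTE_LIMIT;
  }

  // Crude size estimate standing in for frame conversion (assumed sizes).
  static long estimateBytes(List<Object[]> rows)
  {
    long bytes = 0;
    for (Object[] row : rows) {
      for (Object value : row) {
        if (value == null) {
          bytes += 1;
        } else if (value instanceof String) {
          bytes += 2L * ((String) value).length();
        } else if (value instanceof Long || value instanceof Double) {
          bytes += 8;
        } else {
          throw new UnsupportedOperationException("cannot size " + value.getClass());
        }
      }
    }
    return bytes;
  }

  static List<Object[]> materialize(OneShotRows source, List<ColType> signature, long maxBytes, int maxRows)
  {
    // Pick the strategy before the first (and only) read of the source.
    Strategy strategy = chooseStrategy(signature);
    List<Object[]> rows = source.drain();

    if (strategy == Strategy.ROW_LIMIT) {
      if (rows.size() > maxRows) {
        throw new IllegalStateException("subquery exceeded the row limit");
      }
      return rows;
    }

    long bytes;
    try {
      bytes = estimateBytes(rows);
    }
    catch (UnsupportedOperationException e) {
      // The source is already spent, so fail loudly instead of falling back.
      throw new IllegalStateException("cannot estimate bytes; consider disabling maxSubqueryBytes", e);
    }
    if (bytes > maxBytes) {
      throw new IllegalStateException("subquery exceeded the byte limit");
    }
    return rows;
  }
}
```

In this sketch a signature containing `UNKNOWN` goes straight to row-based limiting without a failed byte-estimation attempt, so the source is only ever drained once.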

Release note


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

"d0",
ColumnType.STRING
))
.addAggregator(new CardinalityAggregatorFactory(
Member

super nit: doesn't it feel kind of strange to be mixing non-datasketches approx distinct count with datasketches extensions tests?

.putAll(QUERY_CONTEXT_DEFAULT)
.put(QueryContexts.MAX_SUBQUERY_BYTES_KEY, "100000")
// Disallows the fallback to row based limiting
.put(QueryContexts.MAX_SUBQUERY_ROWS_KEY, "10")
Contributor

+1 for the test case. Ideally we would want a similar test in the processing module. Is that possible? Maybe use a test aggregator?

@LakshSingla
Contributor Author

LakshSingla commented Jul 11, 2024

@cryptoe Unfortunately, I couldn't get it to work with the client query segment walker tests - because that portion of the code base heavily mimics broker-historical interaction. So even with modifications to that portion of the code, we are getting a new sequence that will work with and without the patch. I have attached the diff I was trying below.

single-serve.patch

@cryptoe
Contributor

cryptoe commented Jul 12, 2024

Since the test framework returns a repeatable sequence, we would need to change the underlying UT framework, which is not in the scope of this PR.

We already have a test, so I am going forward with the merge.

@LakshSingla
Contributor Author

The failing coverage is on a defensive check we don't expect to hit.

@LakshSingla LakshSingla merged commit 3a1b437 into apache:master Jul 12, 2024
83 of 88 checks passed
@LakshSingla LakshSingla deleted the fallback-not-working-2 branch July 12, 2024 16:19
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request Aug 6, 2024
…e the subquery's results as frames for estimating the bytes (apache#16679)
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024
4 participants