-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes #16679
Improve the fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes #16679
Conversation
"d0", | ||
ColumnType.STRING | ||
)) | ||
.addAggregator(new CardinalityAggregatorFactory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: doesn't it feel kind of strange to be mixing non-datasketches approx distinct count with datasketches extensions tests?
.putAll(QUERY_CONTEXT_DEFAULT) | ||
.put(QueryContexts.MAX_SUBQUERY_BYTES_KEY, "100000") | ||
// Disallows the fallback to row based limiting | ||
.put(QueryContexts.MAX_SUBQUERY_ROWS_KEY, "10") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for the test case. Ideally we would want a similar test in processing module. Is that possible ? Maybe use a test aggregator ?
@cryptoe Unfortunately, I couldn't get it to work with the client query segment walker tests - because that portion of the code base heavily mimics broker-historical interaction. So even with modifications to that portion of the code, we are getting a new sequence that will work with and without the patch. I have attached the diff I was trying below. |
Since we warp the test framework returns a repeatable sequence, we would need to change the underlying UT framework which is not in the scope of this PR. We already have a test hence going forward with merge. |
The failing coverage is on a defensive check we don't expect to hit. |
…e the subquery's results as frames for estimating the bytes (apache#16679) Better fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes: a. We don't touch the subquery sequence till we know that we can materialize the result as frames
Description
Better fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes:
a. We don't touch the subquery sequence till we know that we can materialize the result as frames. Otherwise, aggregators holding some resources can get closed and the fallback doesn't work properly. I have added a test case to verify that.
b. Remove the ad-hoc fallback case, which I haven't seen happen. Most of the queries fallback due to insufficient types present at the runtime to generate the query. But if we are unable to materialize the result as bytes due to any unforeseen reason, the current fallback path is incorrect. The other alternative is to rerun the whole subquery, but that will degrade the subquery performance significantly, and we should rather throw an exception in that case so that the user can disable
maxSubqueryBytes
.Release note
Key changed/added classes in this PR
MyFoo
OurBar
TheirBaz
This PR has: