Implement aggregation pushdown in Pinot #6069

Merged
merged 6 commits into from
Aug 24, 2021


elonazoulay
Member

@elonazoulay elonazoulay commented Nov 24, 2020

Resolves #4140

@cla-bot cla-bot bot added the cla-signed label Nov 24, 2020

@findepi findepi changed the title Implement aggregation pushdown Implement aggregation pushdown in Pinot Nov 24, 2020
@findepi findepi added the enhancement New feature or request label Nov 24, 2020
@martint martint self-requested a review November 24, 2020 21:10
@findepi findepi self-requested a review January 27, 2021 13:01
@findepi
Member

findepi commented Jan 27, 2021

@elonazoulay please rebase. See especially #6667, which changes JdbcTableHandle a bit and changes how agg pushdown is handled now.
Ideally, this would allow removing or simplifying PinotTableHandle and DynamicTable.

@elonazoulay elonazoulay force-pushed the agg_pushdown branch 2 times, most recently from 591c3b2 to 45ba425 Compare March 5, 2021 22:11
@elonazoulay
Member Author

@findepi Sounds good! I will be removing the DynamicTableBuilder in an upcoming pull request.

@elonazoulay elonazoulay force-pushed the agg_pushdown branch 3 times, most recently from 9dc0553 to 754b5ab Compare April 19, 2021 12:28
.setCatalogSessionProperty("pinot", "enable_aggregation_pushdown", "true")
.build();
String noGroupingsQueryTemplate = "SELECT COUNT(*), SUM(%1$s), AVG(%1$s), MIN(%1$s), MAX(%1$s) FROM " + TOPIC_AND_TABLE;
assertQueryFails(format(noGroupingsQueryTemplate, "long_number"), "Segment query returned.*");
Member

Why is this expected to fail? Please add an explanation.

(same below)

Member Author

Without aggregation pushdown enabled, the connector queries the servers directly. This is a way to test the limit for segment queries. The limit is enforced to keep Pinot servers from crashing when too many large selects are run.

@findepi
Member

findepi commented Apr 21, 2021

As I understand it, this currently depends on #7627.
(Thanks for separating the PRs!)

@elonazoulay elonazoulay force-pushed the agg_pushdown branch 2 times, most recently from 7adff3c to 3515f68 Compare June 3, 2021 17:58
@xiangfu0 xiangfu0 self-requested a review June 3, 2021 21:04
@elonazoulay elonazoulay force-pushed the agg_pushdown branch 2 times, most recently from 0d90917 to e9189eb Compare June 7, 2021 23:42
@findepi
Member

findepi commented Jun 8, 2021

@elonazoulay let me know when this is ready for review. Also, it seems that

  • Fix mixed case column handling
  • Fix filter pushdown for segment queries
  • Guarantee limit for broker queries
  • Implement aggregation pushdown in Pinot

-- some commits could be run in separate PRs, which could perhaps reduce review overhead?

@elonazoulay
Member Author

@findepi - this is ready, but to make it easier I will break it up into separate PRs, thanks!

findepi
findepi previously requested changes Jun 9, 2021
Comment on lines 347 to 368
if (inputExpression.isEmpty() || (isCountFunction(aggregate) && !isCountStarFunction(aggregate))) {
return Optional.empty();
}
// Distinct aggregations other than distinctcounthll are not supported
if (aggregate.isDistinct()) {
return Optional.empty();
}
Member

This should be handled for individual functions separately. What would it take to use AggregateFunctionRewriter here?

Also, checking isDistinct is not enough.
See io.trino.plugin.jdbc.expression.AggregateFunctionPatterns#basicAggregation which matches only "basic" aggregation functions, without features that are both: easy to forget about and hard to implement in the remote system.
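
For illustration, the idea of matching only "basic" aggregations can be sketched as below. This is a minimal, self-contained sketch and not the Trino implementation: `AggregateCall` and `isBasicAggregation` are hypothetical names, and the sketch assumes that "basic" means no DISTINCT, no FILTER clause, and no ORDER BY inside the aggregate call.

```java
import java.util.List;
import java.util.Optional;

final class BasicAggregationSketch
{
    private BasicAggregationSketch() {}

    // Hypothetical stand-in for an aggregate call; not a Trino SPI class.
    static final class AggregateCall
    {
        final String functionName;
        final boolean distinct;
        final Optional<String> filter;
        final List<String> sortItems;

        AggregateCall(String functionName, boolean distinct, Optional<String> filter, List<String> sortItems)
        {
            this.functionName = functionName;
            this.distinct = distinct;
            this.filter = filter;
            this.sortItems = sortItems;
        }
    }

    // Assumption: an aggregation is "basic" when it uses none of DISTINCT, FILTER, or ORDER BY.
    static boolean isBasicAggregation(AggregateCall call)
    {
        return !call.distinct
                && !call.filter.isPresent()
                && call.sortItems.isEmpty();
    }
}
```

A per-function rewrite rule would first apply a check like this and decline pushdown for anything it rejects, so that a forgotten feature such as FILTER is never silently dropped.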

.map(PinotColumnHandle.class::cast)
.map(PinotColumnHandle::getColumnName)
.collect(toImmutableList()),
tableHandle.getQuery().flatMap(DynamicTable::getFilter),
Member

In DynamicTable, what are the semantics of filters, aggregations, sorting, etc.?
Which is applied first, which second, and so on?

The current ordering of fields suggests that aggregation comes before filter, so this line looks incorrect: you are transposing a pre-existing filter with an aggregation.

If this is "just" a matter of field ordering in DynamicTable, please update it and add commentary like here:

private final TupleDomain<ColumnHandle> constraint;
// semantically aggregation is applied after constraint
private final Optional<List<List<JdbcColumnHandle>>> groupingSets;
// semantically limit is applied after aggregation
private final OptionalLong limit;
// columns of the relation described by this handle, after projections, aggregations, etc.
private final Optional<List<JdbcColumnHandle>> columns;

When doing that, make sure the ordering of fields matches that of the constructor arguments; they are equally important!

Member Author

@elonazoulay elonazoulay Jul 4, 2021

Removed AggregationExpression; the connector now returns the exact Pinot data type for aggregations. Also implemented approx_distinct pushdown.

Member

@hashhar hashhar left a comment

@elonazoulay I resolved the addressed comments. I think I have an alternative proposal that would work with #8578 and not require the copying. PTAL if what I'm saying makes sense and is feasible to implement.

There's another option to remove count_distinct pushdown (which is the only rule which needs the grouping columns in rewrite context) from this PR so that it works with #8578 and address count_distinct in a follow-up.

projections.add(new Variable(pinotColumnHandle.getColumnName(), pinotColumnHandle.getDataType()));
resultAssignments.add(new Assignment(pinotColumnHandle.getColumnName(), pinotColumnHandle, pinotColumnHandle.getDataType()));
}
List<String> groupingColumns = getOnlyElement(groupingSets).stream()
Member

@elonazoulay IIUC the grouping columns in rewrite context are only needed to verify that the grouping columns are part of the SELECT clause. If so, can we instead do that validation here after the rewrite is done?

Yes, the connector will end up doing work that gets discarded but that seems cleaner because even if we add grouping columns to RewriteContext I don't see an easy way to populate them for other connectors so we won't be able to provide a sane impl. which doesn't return null for that method.

I think this is possible to do because here we have tableHandle available from which we can get the grouping columns in case of DynamicTable.
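
The validation described above could look roughly like the following. This is a minimal sketch under stated assumptions: `GroupingColumnsCheck` and `groupingColumnsSelected` are hypothetical names, and column names are compared as plain case-sensitive strings, which may not match how the connector handles mixed-case columns.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GroupingColumnsCheck
{
    private GroupingColumnsCheck() {}

    // Returns true only if every grouping column also appears among the selected columns.
    // An empty grouping list trivially passes.
    static boolean groupingColumnsSelected(List<String> groupingColumns, List<String> selectedColumns)
    {
        Set<String> selected = new HashSet<>(selectedColumns);
        return selected.containsAll(groupingColumns);
    }
}
```

If the check fails after the rewrite has run, the connector would presumably decline the pushdown and discard the rewritten expressions, which is the wasted work mentioned above.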

Member Author

Yep, in a previous version of this PR it was done before the rewrite rules; I can add it back. Another idea: can RewriteContext.getGroupingColumns() return an Optional<List> as a default method?

Member

Was there some specific reason to change from the old implementation? Maybe I'm missing something. If it's feasible to do I'd like that.

As for the RewriteContext changes I'll defer to @martint (and Piotr once he's back). IMO if we have alternatives which are cleaner then I'd like to avoid the change. IMO the validation that grouping columns are part of select isn't exclusive to just count_distinct and makes sense to do before any rewrite.

Member Author

Sounds good. Would another option be to pass the table handle into the rewrite context? I can get the grouping columns there also.

Member Author

@hashhar - I extracted the check for grouping columns and reverted the changes for AggregateFunctionRewriter and RewriteContext. What do you think about adding TableHandle to the RewriteContext instead of this approach?
Just in case I saved the previous version.

Member

I like the newer version. We can revisit changing RewriteContext once we have other non-JDBC connectors that implement agg pushdown. We'll have better information at that time to decide the correct abstraction.

@elonazoulay elonazoulay force-pushed the agg_pushdown branch 2 times, most recently from e0a146a to fdff656 Compare August 17, 2021 09:16
Member

@hashhar hashhar left a comment

Looks good to go from my perspective % a question about behaviour of applyCountDistinct.

@hashhar
Member

hashhar commented Aug 18, 2021

@elonazoulay I think you can rebase on top of #8578 and remove the copied classes where possible.
If it all builds and passes CI, I'll proceed with merging #8578 following which we can merge this change too.

@elonazoulay
Member Author

> @elonazoulay I think you can rebase on top of #8578 and remove the copied classes where possible.
> If it all builds and passes CI, I'll proceed with merging #8578 following which we can merge this change too.

Will do, that's great news!!! Thanks @hashhar !

@elonazoulay elonazoulay force-pushed the agg_pushdown branch 6 times, most recently from 07baf9d to 53b968d Compare August 21, 2021 19:44
Reorder members and constructor parameters in order
of pushdown application.

Add a config to set broker query limit.
If the limit is set too high, Pinot can OOM:
apache/pinot#7110

This is a preparatory commit for aggregation pushdown
which sets values to null, otherwise Pinot never
returns null values.
Member

@hashhar hashhar left a comment

LGTM % question.

@findepi I think your comments are addressed. Can you PTAL? The major update is that this is now based on #8578 and the last commit (distinct count pushdown) is the one that has changed the most.

@elonazoulay
Member Author

elonazoulay commented Aug 23, 2021

> LGTM % question.
>
> @findepi I think your comments are addressed. Can you PTAL? The major update is that this is now based on #8578 and the last commit (distinct count pushdown) is the one that has changed the most.
@hashhar - I added a comment in the code to explain this. Just so you have context: it is similar to the limit behavior for segment queries. Since you cannot know a priori how many rows a query returns, especially if it is pushed down as an aggregation (and turns into a passthrough query), the limit is set one higher so the connector can check it and fail. This is similar to the behavior in the Elasticsearch connector. It was suggested by @martint and has already been implemented for segment queries.
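
The limit-plus-one check described here can be sketched as follows. This is a minimal illustration, not the connector's code: `LimitGuardSketch`, `fetchWithLimitGuard`, and the `IntFunction` standing in for a broker query are all hypothetical, and the sketch assumes the remote side honors whatever limit it is given.

```java
import java.util.List;
import java.util.function.IntFunction;

final class LimitGuardSketch
{
    private LimitGuardSketch() {}

    // Requests one row more than the configured limit. If that extra row comes back,
    // the result would have been silently truncated at the limit, so fail instead.
    static <T> List<T> fetchWithLimitGuard(IntFunction<List<T>> queryWithLimit, int limit)
    {
        // Math.addExact guards against overflow when limit is Integer.MAX_VALUE.
        List<T> rows = queryWithLimit.apply(Math.addExact(limit, 1));
        if (rows.size() > limit) {
            throw new IllegalStateException("Query returned more than " + limit + " rows");
        }
        return rows;
    }
}
```

Asking for exactly `limit` rows would make a result of `limit` rows ambiguous between "complete" and "truncated"; the extra row removes that ambiguity at the cost of one additional row per query.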

@hashhar hashhar dismissed findepi’s stale review August 24, 2021 08:01

stale, comments addressed

@hashhar hashhar merged commit c2c4f3f into trinodb:master Aug 24, 2021
@hashhar hashhar added this to the 361 milestone Aug 24, 2021
@hashhar
Member

hashhar commented Aug 24, 2021

Thanks a lot for working on this @elonazoulay . ❤️

@hashhar hashhar mentioned this pull request Aug 24, 2021
10 tasks
@mattsfuller
Member

Thank you!

Labels
cla-signed enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Aggregation pushdown in Pinot connector
6 participants