Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Phoenix to replace partial topn with limit. #8171
Allow Phoenix to replace partial topn with limit. #8171
Changes from all commits
c1cf825
6f82f2f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could just do this in DefaultJdbcMetadata#getTableProperties
I don't see anything here unique to phoenix
Any other connector which can do TopN pushdown should automatically benefit in same way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One case where this would not be true in the phoenix (4) connector. The topN is pushed in the plan, but the connection still can decide not honor it. The same is true for some of the other connector (Postgres, MySQL), which also also may decide not to do the topN after the plan has decided to push this down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, doing it in DefaultJdbcMetadata just requires the assumption that the sortOrder in JdbcTableHandle will be guaranteed when the data is read from it, it's not required that the limit part of topN be guaranteed.
If that assumption also does not hold, then we can't move this to DefaultJdbcMetadata.
I'm not sure why we wouldn't drop the sortOrder from JdbcTableHandle if we weren't going to actually use that in forming the JDBC query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in
DefaultJdbcMetadata
it would need to depend onio.trino.plugin.jdbc.JdbcClient#isTopNLimitGuaranteed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findepi is
isTopNLimitGuaranteed
just about whether the limit part of topN is guaranteed by connector or does it impact the sorting part as well ?If it's just about the limit, then I think don't need to worry about that, for this optimisation the main concern is whether the sorting part of TopN is guaranteed or not, we'll apply the limit part ourselves anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now let's leave it in the Phoenix connector.
A special case is the Phoenix (4) connector, which overrides the topN function in some cases. That scenario would not work in general, right? As it would not even push the order by down.
Btw. order by pushdown (without limit) would be a useful feature anyway. Also see #8093, and it seems that I am missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for isTopNGuaranteed... Phoenix will silently not enforce a limit when the limit > 2^31. Does that mean now that topN is guaranteed or not?Wait... Not true, it will throw an exception. In that case we can change both isLimitGuaranteed and isTopNGuaranteed to return true!
Arrgghh. So here's why isLimitGuaranteed is false: Trino will issue mutliple SQL statements to Phoenix, each of which has the limit. So it's guaranteed only per "partition".
Sorry for the back-and-forth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raunaqmorarka that could perhaps be useful in some cases (but currently JdbcMetadata lacks information to disntuigh such cases)
OTOH, it's useful for large data sets only (right?), which limits applicability of this, as JDBC connectors are not superfast anyway.
I think we should first make it work when
isTopNGuaranteed
is true (simpler case, less changes) and log further improvement as a separate ticketThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a special case for Phoenix (5)...? (Where isTopGuaranteed is false, but we still want this optimization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhofhansl good point. If Phoenix 5 is most interesting to you & special case anyway, let's proceed with Phoenix, but let's define what we want to achieve in general, in an issue, so that's easier to follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One doubt here, assuming that phoenix will honour whatever nulls sorting order Trino will pass down in the jdbc query to it, we can expect this to work even without the
NULLS FIRST
part right ?The hive case is a bit different as there the sort order of
NULLS FIRST
is fixed when the sorted files are written.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It will just be slower on the Phoenix side if there happens to an index on the sort column(s), The index can only be used with NULL FIRST. So for correctness this is not needed.