-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Phoenix to replace partial topn with limit. #8171
Conversation
Shame on me for not running the Phoenix tests. |
Turns out for the tests I do need update the client. (I had only tested with real server instances). That means that for the phoenix 4.x code I can't make this change, since I cannot change the client to 4.16.1 as it is not guaranteed to be compatible with older 4.x version. It's currently using the 4.14.1 which forward compatible to later 4.x version. So new proposal. The Phoenix 4.x connector does not get this change. The Phoenix 5 connector get this change once 5.1.2 and I will change the Phoenix client version in the connector to 5.1.2 in a separate PR first. (5.1.2 is backward compatible with 5.1.1 and 5.1.0). I'll wait until 5.1.2 is out, update the client to that in separate, then rebase this one, and remove the change from the 4.x connector. |
73efe65
to
a2e65f1
Compare
@findepi Is there a way to leave the Phoenix dependency at 4.14.1, but for tests use 4.16.1 for the server part? (likely not worth the effort, but thought I'd check) |
Can we run Phoenix in a docker container? |
a2e65f1
to
3e70aef
Compare
Rebased after the change to Phoenix 5.1.2. @findepi Re: Docker. Probably. Lemme see if can make some for that. Until then we do this optimization (only that, not a correctness issue) for Phoenix 5 only. |
@@ -293,16 +292,7 @@ public boolean supportsTopN(ConnectorSession session, JdbcTableHandle handle, Li | |||
@Override | |||
protected Optional<TopNFunction> topNFunction() | |||
{ | |||
return Optional.of((query, sortItems, limit) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it belongs in a different commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed so that topN is always handled by the connector. If the connector sometimes does it, and it is not apparent from the plan, I cannot do the push down.
That said, happy to break this into a separate commit, if that is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ttps://issues.apache.org/jira/browse/PHOENIX-6436 mentioned below is resolved (and shipped), we can address the TODO. It's fine to have this as a prep commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, happy to break this into a separate commit, if that is better.
please do
plugin/trino-phoenix5/src/main/java/io/trino/plugin/phoenix5/PhoenixMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-phoenix5/src/main/java/io/trino/plugin/phoenix5/PhoenixMetadata.java
Show resolved
Hide resolved
3e70aef
to
33ebd28
Compare
Enabled the SUPPORTS_TOPN_PUSHDOWN behavior in the test. Let's wait for a test run. |
Forgoet that SUPPORTS_TOPN_PUSHDOWN is different. |
33ebd28
to
ec91da4
Compare
plugin/trino-phoenix5/src/main/java/io/trino/plugin/phoenix5/PhoenixMetadata.java
Show resolved
Hide resolved
What I don't understand is why this test is now failing:
This test passed when I had the if(...isEmpty()) instead of the Optional.map change, which does not make any sense since it's functionally the same. |
The plan generated above is the right one, it looks like we'll need to override that test for phoenix as the behaviour of (topN pushdown + no topN limit guarantee + using sorting property) is a unique one. |
public ConnectorTableProperties getTableProperties(ConnectorSession session, ConnectorTableHandle table) | ||
{ | ||
JdbcTableHandle tableHandle = (JdbcTableHandle) table; | ||
List<LocalProperty<ColumnHandle>> sortingProperties = tableHandle.getSortOrder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could just do this in DefaultJdbcMetadata#getTableProperties
I don't see anything here unique to phoenix
Any other connector which can do TopN pushdown should automatically benefit in same way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One case where this would not be true in the phoenix (4) connector. The topN is pushed in the plan, but the connection still can decide not honor it. The same is true for some of the other connector (Postgres, MySQL), which also also may decide not to do the topN after the plan has decided to push this down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, doing it in DefaultJdbcMetadata just requires the assumption that the sortOrder in JdbcTableHandle will be guaranteed when the data is read from it, it's not required that the limit part of topN be guaranteed.
If that assumption also does not hold, then we can't move this to DefaultJdbcMetadata.
I'm not sure why we wouldn't drop the sortOrder from JdbcTableHandle if we weren't going to actually use that in forming the JDBC query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in DefaultJdbcMetadata
it would need to depend on io.trino.plugin.jdbc.JdbcClient#isTopNLimitGuaranteed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findepi is isTopNLimitGuaranteed
just about whether the limit part of topN is guaranteed by connector or does it impact the sorting part as well ?
If it's just about the limit, then I think don't need to worry about that, for this optimisation the main concern is whether the sorting part of TopN is guaranteed or not, we'll apply the limit part ourselves anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now let's leave it in the Phoenix connector.
A special case is the Phoenix (4) connector, which overrides the topN function in some cases. That scenario would not work in general, right? As it would not even push the order by down.
Btw. order by pushdown (without limit) would be a useful feature anyway. Also see #8093, and it seems that I am missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for isTopNGuaranteed... Phoenix will silently not enforce a limit when the limit > 2^31. Does that mean now that topN is guaranteed or not?
Wait... Not true, it will throw an exception. In that case we can change both isLimitGuaranteed and isTopNGuaranteed to return true!
Arrgghh. So here's why isLimitGuaranteed is false: Trino will issue mutliple SQL statements to Phoenix, each of which has the limit. So it's guaranteed only per "partition".
Sorry for the back-and-forth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering what it would take to get this optimization into DefaultJdbcMetadata for cases where after TopN pushdown into TS, the result size is not strictly guaranteed (isTopNGuaranteed is false) but the ordering is.
@raunaqmorarka that could perhaps be useful in some cases (but currently JdbcMetadata lacks information to disntuigh such cases)
OTOH, it's useful for large data sets only (right?), which limits applicability of this, as JDBC connectors are not superfast anyway.
I think we should first make it work when isTopNGuaranteed
is true (simpler case, less changes) and log further improvement as a separate ticket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a special case for Phoenix (5)...? (Where isTopGuaranteed is false, but we still want this optimization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhofhansl good point. If Phoenix 5 is most interesting to you & special case anyway, let's proceed with Phoenix, but let's define what we want to achieve in general, in an issue, so that's easier to follow up
ec91da4
to
2cf3a20
Compare
{ | ||
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10")) | ||
.ordered() | ||
.isNotFullyPushedDown(TopNNode.class, ExchangeNode.class, ExchangeNode.class, LimitNode.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passes the test. Not entirely sure whether it is still a useful test, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can include few more tests from BCT with modification here, like order by DESC, multiple different orders etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different from testing that TopN pushdown doesn't happen? (which is already tested in superclass if SUPPORTS_TOPN_PUSHDOWN is false). Or is the difference that the plan looks different (the retained nodes look same to me).
i.e. would it be better to add the new cases to superclass itself?
trino/plugin/trino-base-jdbc/src/test/java/io/trino/plugin/jdbc/BaseJdbcConnectorTest.java
Lines 626 to 631 in 41ff64c
if (!hasBehavior(SUPPORTS_TOPN_PUSHDOWN)) { | |
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10")) | |
.ordered() | |
.isNotFullyPushedDown(TopNNode.class); | |
return; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the original plan:
Output[orderkey]
│ Layout: [orderkey:bigint]
└─ TopN[10 by (orderkey ASC NULLS LAST)]
│ Layout: [orderkey:bigint]
└─ LocalExchange[SINGLE] ()
│ Layout: [orderkey:bigint]
└─ RemoteExchange[GATHER]
│ Layout: [orderkey:bigint]
└─ TopNPartial[10 by (orderkey ASC NULLS LAST)]
│ Layout: [orderkey:bigint]
└─ TableScan[phoenix:tpch.orders TPCH.ORDERS sortOrder=[ORDERKEY:bigint:BIGINT ASC NULLS LAST] limit=10 columns=[ORDERKEY:bigint:BIGINT]]
Layout: [orderkey:bigint]
orderkey := ORDERKEY:bigint:BIGINT
So this indeed verifies that the topNPartial
was replaced by a limit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lhofhansl . Missed that the partial got removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work.
Not sure what we are trying to test. The basic plan rewrite is done in #6634, here I just want to make that Phoenix is triggering that query rewrite. Assuming changes to #6634 would add tests, I would add the most generic version to the connectors, just making sure that the replacement actually happens.
As usually, I do not feel strongly :) Happy to add a more explicit test as you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6634 did add tests with a mock connector, hive and later on tpch connector
Here we want to test that after partial pushdown of topN into table scan takes place, we detect that through the sorting ordering in the JDBC table handle and expose the right ConnectorTableProperties for it. This part is unique to this connector. We can assume that if the right properties are set, then the planner will do the expected query rewrite. The plan assertions are an indirect but convenient way to test that the right sorting properties were set.
PlanMatchPattern would allow us to explicitly verify the preSortedInputs
in LimitNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only way I can get this to work:
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10"))
.ordered()
.matches(
output(
node(TopNNode.class,
anyTree(
node(LimitNode.class,
node(TableScanNode.class))))));
Not sure that is much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make that slightly better by building more specific plan node patterns rather than just using Node.class, e.g. see the plan patterns used in assertions in TestPartialTopNWithPresortedInput#testWithSortedTable
.
However, to me this is nice to have, not a blocker to merging the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I had tried the more specific pattern, but in the time I had I couldn't get the plan to match. TestPartialTopNWithPresortedInput
is a great example, though. I'll try that.
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
2cf3a20
to
95aee98
Compare
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Show resolved
Hide resolved
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Show resolved
Hide resolved
{ | ||
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10")) | ||
.ordered() | ||
.isNotFullyPushedDown(TopNNode.class, ExchangeNode.class, ExchangeNode.class, LimitNode.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can include few more tests from BCT with modification here, like order by DESC, multiple different orders etc.
95aee98
to
3ed73a0
Compare
I think this is good to go (I added more tests and addressed the review comments). Let's file a general issue for JDBC. I do agree with @findepi that for most JDBC connectors this will nor be a substantive improvement. Phoenix is special here as it actually generates splits and executes these again many nodes in the Phoenix/HBase cluster - so the extra, unnecessary topNPartial can have a significant impact. (Next: Push down ORDER BY). |
Hi... Just making sure this is not waiting on anything from me. |
Ping ... :) |
3ed73a0
to
102ada0
Compare
Thinking about how I'd be using BasePlanTest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few comments
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
{ | ||
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10")) | ||
.ordered() | ||
.isNotFullyPushedDown(TopNNode.class, ExchangeNode.class, ExchangeNode.class, LimitNode.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lhofhansl Using isNotFullyPushedDown
for asserting plan shape when it has nothing to do with pushdown seems a bit dirty.
Using something similar to what Piotr shows above is more explicit in the intent of the test.
@hashhar I agree in isNofFullyPushedDown, felt the same way. Thinking more about it... Perhaps I can add more functions with better names to QueryAssertions? What do you think @hashhar ? |
2b5cd76
to
4faff2f
Compare
I added SkipException to the test. As for BasePlanTests... I am still looking into that. Looks like all existing such tests are for Hive...? |
Can we merge with the current tests? (as I mentioned above, they do test what we want: That topN is actually not pushed down, but that instead the plan is modified) |
4faff2f
to
0c7bb96
Compare
Rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % comments.
Sorry to keep you waiting @lhofhansl .
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-phoenix5/src/test/java/io/trino/plugin/phoenix5/TestPhoenixConnectorTest.java
Outdated
Show resolved
Hide resolved
Hey @hashhar , no problem at all. I'll address your comments tomorrow. |
0c7bb96
to
56b05f5
Compare
testing/trino-testing/src/main/java/io/trino/testing/BaseConnectorTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks @lhofhansl.
@raunaqmorarka @sopel39 PTAL if you want to.
Minor comment about looking into https://github.com/trinodb/trino/pull/8171/files#r665088423 |
tableName); | ||
assertUpdate(createTableSql, 1500L); | ||
|
||
String expected = "SELECT custkey FROM customer ORDER BY 1 NULLS FIRST LIMIT 100"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One doubt here, assuming that phoenix will honour whatever nulls sorting order Trino will pass down in the jdbc query to it, we can expect this to work even without the NULLS FIRST
part right ?
The hive case is a bit different as there the sort order of NULLS FIRST
is fixed when the sorted files are written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It will just be slower on the Phoenix side if there happens to an index on the sort column(s), The index can only be used with NULL FIRST. So for correctness this is not needed.
56b05f5
to
6a8b0a6
Compare
This one works with explicit plan matching. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the updated tests lgtm
@@ -320,21 +335,92 @@ public void testTopNPushdown() | |||
@Test | |||
public void testReplacePartialTopNWithLimit() | |||
{ | |||
List<PlanMatchPattern.Ordering> orderBy = ImmutableList.of(sort("orderkey", ASCENDING, LAST)); | |||
|
|||
assertThat(query("SELECT orderkey FROM orders ORDER BY orderkey LIMIT 10")) | |||
.ordered() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need ordered
now
6a8b0a6
to
544ece1
Compare
Restructured commits to add QueryAssertions.matches as separate commit. |
544ece1
to
6f82f2f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thank you. |
Adds Phoenix support for #6634 to Phoenix
This requires Phoenix 4.16.1 and Phoenix 5.1.2 on the server side to avoid memory overestimation.
4.16.1 is out, Phoenix 5.1.2 will be released soon, this PR should wait until that release.
No Phoenix client updates are needed.