Support range predicate pushdown for string columns with collation in PostgreSQL connector #9746
@@ -37,7 +37,6 @@
import io.trino.plugin.jdbc.LongWriteFunction;
Please update io.trino.plugin.postgresql.TestPostgreSqlConnectorTest#hasBehavior to support SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY.
Updated.
The impl looks good. Please adjust the declaration in tests to make sure the tests pass with the new assumption.
69077fa to 968b6cd
@@ -81,13 +76,16 @@ public void setExtensions()
    protected boolean hasBehavior(TestingConnectorBehavior connectorBehavior)
    {
        switch (connectorBehavior) {
            case SUPPORTS_PREDICATE_PUSHDOWN_WITH_VARCHAR_INEQUALITY:
            case SUPPORTS_JOIN_PUSHDOWN_WITH_VARCHAR_INEQUALITY:
                return false;
Additional updates are necessary to support aggregation pushdown for varchar columns. Since I want to focus on supporting range predicate pushdown in this pull request, I have only introduced a new behavior that indicates whether aggregation pushdown is supported for varchar columns.
Let's create a GitHub issue and add a TODO so that we can remove the additional behaviour once it's no longer needed (and other people know that this is something they can work on).
Yeah, that makes sense (as long as this fix can be merged), and I will work on it once this pull request is completed. By the way, supporting type-sensitive aggregation pushdown in JDBC plugins doesn't seem easy. Fundamental interface changes may be required.
Yeah, I ran into some similar issues in #7320.
There were some ideas floated in #7320 (comment) (which we didn't end up doing since it looked like a one-off need at that time).
If you already have some direction in your mind it might be helpful to discuss it on #dev on Slack too (if you think the changes will be large and touch the SPI).
Oh, will take a look at #7320 first. Thanks.
968b6cd to ac742f8
I'm trying to find a way to fix it.
Looks like collations have been supported by
dc45a73 to d9c1e3a
tableHandle -> ((JdbcTableHandle) tableHandle).getConstraint().isAll(),
TupleDomain.all(),
ImmutableMap.of())));
PlanMatchPattern.node(TableScanNode.class)));
Why is this still isNotFullyPushedDown?
With domain compaction, the original predicate seems to remain on the Trino side even though the compacted predicate is pushed down to PostgreSQL:
SQL on PostgreSQL:
SELECT "nationkey", "name", "regionkey" FROM "tpch"."nation" WHERE ("name" >= ? COLLATE "C" AND "name" <= ? COLLATE "C")
Plan on Trino:
Output[regionkey, nationkey, name]
│   Layout: [regionkey:bigint, nationkey:bigint, name:varchar(25)]
└─ RemoteExchange[GATHER]
   │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint]
   └─ ScanFilter[table = postgresql:tpch.nation tpch.nation constraint on [name] columns=[nationkey:bigint:int8, name:varchar(25):varchar, regionkey:bigint:int8], filterPredicate = ("name" IN (CAST('POLAND' AS varchar(25)), CAST('ROMANIA' AS varchar(25)), CAST('VIETNAM' AS varchar(25))))]
          Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint]
          nationkey := nationkey:bigint:int8
          regionkey := regionkey:bigint:int8
          name := name:varchar(25):varchar
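The behaviour described in this reply can be illustrated with a small, self-contained sketch. It is not Trino's implementation: the class and method names are hypothetical, the rows are made up, and the compaction is simplified to collapsing the IN list into its [min, max] range. String.compareTo stands in for COLLATE "C" ordering, which is an assumption that holds for ASCII values like these. Because the range is a superset of the listed values, the original IN predicate still has to be applied to the fetched rows, which would explain why the plan is not fully pushed down.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: these names are illustrative and do not come from Trino.
class DomainCompactionSketch
{
    // Range predicate sent to the remote database in place of the IN list.
    static String compactedPredicate(String column)
    {
        String quoted = "\"" + column + "\"";
        return "(" + quoted + " >= ? COLLATE \"C\" AND " + quoted + " <= ? COLLATE \"C\")";
    }

    // Bind values for the two placeholders: the smallest and largest listed value.
    // String.compareTo stands in for COLLATE "C" ordering (assumption: ASCII data).
    static List<String> rangeBounds(List<String> values)
    {
        return List.of(Collections.min(values), Collections.max(values));
    }

    // What the remote side returns: every row inside the range, a superset of the IN list.
    static List<String> remoteScan(List<String> rows, List<String> bounds)
    {
        return rows.stream()
                .filter(row -> row.compareTo(bounds.get(0)) >= 0 && row.compareTo(bounds.get(1)) <= 0)
                .collect(Collectors.toList());
    }

    // Residual filter kept on the engine side: the original IN predicate.
    static List<String> residualFilter(List<String> rows, List<String> values)
    {
        return rows.stream().filter(values::contains).collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        List<String> inList = List.of("POLAND", "ROMANIA", "VIETNAM");
        List<String> table = List.of("PERU", "POLAND", "ROMANIA", "RUSSIA", "VIETNAM", "ZAMBIA");
        List<String> fetched = remoteScan(table, rangeBounds(inList));
        System.out.println(compactedPredicate("name"));
        System.out.println(fetched);
        System.out.println(residualFilter(fetched, inList));
    }
}
```

In this made-up data set, RUSSIA falls inside the pushed-down range but is not in the IN list, so only the residual filter removes it.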
9908b52 to ee99698
@findepi Finished updating for now. Could you take a look again?
Some minor comments.
Can we detect (or apply some heuristics to determine) at runtime whether a given predicate needs the COLLATION applied or not? For example, for equality predicates adding the collation will lead to a definite performance regression because no indexes can be used anymore.
If it's not possible to do this dynamically at runtime, then let's add a config property prefixed with experimental. to enable this behaviour and keep it opt-in for now?
WDYT @wendigo ?
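One possible heuristic of the kind asked about here, sketched purely as an illustration: attach the explicit collation only to ordering comparisons and leave equality predicates untouched so that existing indexes remain usable. This is not what the pull request does; the class and method names are hypothetical, and the sketch assumes that equality results do not depend on the collation (which is not true for every collation).

```java
import java.util.Set;

// Hypothetical sketch: not part of the pull request or of Trino.
class CollationHeuristicSketch
{
    // Operators whose result depends on sort order and therefore on the collation.
    private static final Set<String> ORDERING_OPERATORS = Set.of("<", "<=", ">", ">=");

    // Build a single-column predicate with a bind placeholder,
    // adding COLLATE "C" only when the operator is an ordering comparison.
    static String toPredicate(String column, String operator)
    {
        String predicate = "\"" + column + "\" " + operator + " ?";
        if (ORDERING_OPERATORS.contains(operator)) {
            return predicate + " COLLATE \"C\"";
        }
        // Assumption: equality does not need the explicit collation.
        return predicate;
    }

    public static void main(String[] args)
    {
        System.out.println(toPredicate("name", "="));
        System.out.println(toPredicate("name", ">="));
    }
}
```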
I think detection is possible in some cases, but not in 100% of them. Either way, it should be better than a full scan?
I'm fine with making this optional as long as we can enable it by configuration. Is adding
@takezoe You can add a session property with matching config property in
cc: @findepi @kokosing @wendigo @ebyhr Any opinions? I'd prefer to have this behind a configuration toggle (at least for now) until we can verify that there isn't a performance concern in actual practical usage.
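The "session property with matching config property" pattern mentioned here can be sketched in plain Java. This is only a schematic illustration, not the connector's code: it does not use Trino's configuration or session property classes, and the class name and the property name below are hypothetical. The idea is that the config property supplies an opt-in default (off), and a per-session value overrides it when present.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: plain Java, not Trino's config or session property API.
class PushdownToggleSketch
{
    // Hypothetical session property name, chosen for illustration only.
    static final String SESSION_PROPERTY = "enable_string_pushdown_with_collate";

    // Default taken from the (hypothetical) catalog config property.
    private final boolean configDefault;

    PushdownToggleSketch(boolean configDefault)
    {
        this.configDefault = configDefault;
    }

    // The session value wins when set; otherwise fall back to the config default.
    boolean isEnabled(Map<String, Boolean> sessionProperties)
    {
        return Optional.ofNullable(sessionProperties.get(SESSION_PROPERTY)).orElse(configDefault);
    }

    public static void main(String[] args)
    {
        PushdownToggleSketch toggle = new PushdownToggleSketch(false);
        System.out.println(toggle.isEnabled(Map.of()));
        System.out.println(toggle.isEnabled(Map.of(SESSION_PROPERTY, true)));
    }
}
```

Keeping the default off means the collation pushdown stays opt-in, while a single session can still enable it to check real-world performance.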
Looks good % confirmation about config from other maintainers.
Looks like trino/plugin/trino-postgresql/src/main/java/io/trino/plugin/postgresql/PostgreSqlConfig.java Line 40 in f7aac0d
LGTM % comments.
Mostly about possible simplifications.
587edbe to dbddd0a
28313bd to c59edbe
@hashhar Finished updating the pull request. Could you take a look again?
Looks good to me % comment. Thanks for working on this.
Also, it looks like the first two commits should be squashed together.
005f947 to c5646cc
Looks like For example, having a flag as a field and changing its value inside Does splitting
c5646cc to 0721fe8
I adopted this way for now and added a case for join on VARCHAR columns.
e5e96e6 to 7e5ae28
Since only PostgreSQL has this behaviour today (and it is not enabled by default), I think it's fine to not add a new behaviour to
In terms of future evolution, I think once we prove it out with PostgreSQL we can lift the config to BaseJdbcConfig and then add a test in BaseJdbcConnectorTest which sets the session property. At that time we can revisit the best way to structure the
@hashhar Sounds good, except I am not convinced we need a config just yet, other than for trying things out (a temporary kill switch).
7e5ae28 to 0a1f394
Ah, I see. That makes sense. I updated the test case.
6fd6890 to 16b3c0c
16b3c0c to 5eb794a
Closing and re-opening to get CI to run (some glitch on GitHub's end).
Unrelated failure. Thanks @takezoe for the feature. Merging it.