Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LIKE predicate pushdown #1580

Open
infearOnTheWay opened this issue Sep 25, 2019 · 12 comments
Open

LIKE predicate pushdown #1580

infearOnTheWay opened this issue Sep 25, 2019 · 12 comments

Comments

@infearOnTheWay
Copy link

infearOnTheWay commented Sep 25, 2019

'Like' predicate has been supported in presto.But it seems that 'Like' can't push down to the data source as far as I know. Since the 'ValueSet' only contains 'EquatableValueSet','SortedRangeSet' and 'AllOrNoneValueSet' implementation now.

In our case, 'Like' predicate would have better performance in our data source query process.So we implement 'LikeValuableSet' and push down to our data source.It realy works well as expected.

In our opinion,it seems to be a general demand and useful supplement in current presto. May be we can do some detailed discussion on this topic.

Thanks.

@findepi
Copy link
Member

findepi commented Sep 25, 2019

@infearOnTheWay I agree that this is an important thing to be able to push down LIKE (or selected function calls in general)
The biggest problem, however, is how to do this

  • provide necessary APIs
  • guarantee backwards compatibility
  • be still able to evolve the code (without breaking compatibility)

This makes the idea much harder to implement properly.
This is generally tracked by #18.
Please join conversation on the #pushdown channel on the Presto Slack.

@findepi findepi changed the title Enhance 'Valuableset' by 'Like' predicate LIKE predicate pushdown Sep 25, 2019
@infearOnTheWay
Copy link
Author

@findepi Thanks for your reply.
I agree with your opinion.
Maybe we can start by a simplified version and do improvement step by step.Since we have done this in our project and it seems work well so far. So shall we try to pull a request about this topic?We are very appreciated about your advice on our implementation.

@Praveen2112
Copy link
Member

Praveen2112 commented Sep 29, 2019

@infearOnTheWay But how do we generate TupleDomain for an expression like this columnA like '%b%' or columnA = 'abcde' in this we can't pushdown those expression. Plus I think we are going to replace (in the near future) TupleDomain with ConnectorExpression using which we can pushdown function call and like predicate

@infearOnTheWay
Copy link
Author

infearOnTheWay commented Sep 30, 2019

@Praveen2112
In my opinion, the expression columnA like '%b%' or columnA = 'abcde' can't be pushdown since here use **or** condition. Even if Like pushdown Is supported,this problem still exists. But the expression like columnA like '%b%' **and** columnA = 'abcde' would benefit since both predicate can be pushdown.
In our implementation, we extend the ValueSet by add LikeValuableSet and consider the intersection between Like and other predicates.
BTW, the ConenctorExpression upgrade you mentioned seems very interesting. Any issue to trace now?
Thanks.

@Praveen2112
Copy link
Member

@infearOnTheWay IIRC **or** on same column can be pushed down to the connectors (only if it is on a different columns it is not being pushed).

So for this LikeValuableSet maintains both the LIke and other predicates when they are intersected or do we have a separate implementation which maintains both type of the ValueSet as its field ?

@infearOnTheWay
Copy link
Author

@Praveen2112 Yes, you are right. I made a mistake about the push down conditions.As you said, it is not being pushed only it is on a different column.
On the other hand, we have a seperate implemention LikeValueableSet which support 'Like' predicate. And we add a extra method intersectConsiderLikePredicate which is similar with intersect in TupleDomain as a special deal with the Like predicate.
Now we are organizing our implementation. May be it would be more helpful for discussion when any code segments are presented~

@kokosing
Copy link
Member

kokosing commented Mar 1, 2021

Another option, that we already are using for varchar predicate pushdown in to pushdown the predicate and keep the predicate on Trino side. So for predicate like 'x LIKE 'prefix%'we could pushdownx >= 'prefixandx < 'prefi[x+1](depending on collation) and keepx LIKE 'prefix%'` on Trino side. So we could limit amount of data that is read, but:

  • it does not work well with other pushdowns
  • it is vulnerable to collation differences between Trino and remote data source

Such change could be implemented as optimizer rule without changing TupleDomain.

@findepi
Copy link
Member

findepi commented Mar 1, 2021

So for predicate like 'x LIKE 'prefix%'we could pushdownx >= 'prefixandx < 'prefi[x+1]

@kokosing this is already implemented.

@kokosing
Copy link
Member

kokosing commented Mar 2, 2021

I don't see this happening. I was testing JDBC connector (PostgreSQL) and I see that io.trino.plugin.jdbc.JdbcMetadata#applyFilter with TupleDomain=ALL.

@findepi Do you refer Hive connector and filtering partitions with io.trino.spi.connector.Constraint#predicate?

@findepi
Copy link
Member

findepi commented Mar 2, 2021

@kokosing See io.trino.sql.planner.DomainTranslator.Visitor#visitLikePredicate.
I don't know why it wouldn't work for PostgreSQL, needs checking.

@findepi
Copy link
Member

findepi commented Mar 5, 2021

I verified this with PostgreSQL and this worked correctly for me. Of course, the end query is not going to have predicates retained as per #7145, unless unsafe pushdown (#7022) is enabled.

@findepi
Copy link
Member

findepi commented Mar 4, 2022

full LIKE pushdown was added for PostgreSQL in #11045
other connectors need to follow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants