Query improvement about the query partition table #1441

Open
Tracked by #1434
ShiKaiWi opened this issue Jan 15, 2024 · 2 comments
Labels
feature New feature or request

Comments

@ShiKaiWi
Member

ShiKaiWi commented Jan 15, 2024

Describe This Problem

Consider a query against a partitioned table whose hash partition key is called partition_col:
select * from partition_table where partition_col in ("a", "b", "c", ...).

All the sub-query plans share the same predicate, so when the in-list is large, the min-max and bloom-filter indexes may perform very poorly. However, most of the values in the in-list do not exist in any given partition, so the predicate in each sub-query plan can be reduced to a much simpler one.
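To make the cost concrete, here is a rough sketch that counts how many in-list values each partition actually needs to check. The partition count, the `partition_of` helper, and the CRC32-based rule are assumptions for illustration only; the real hash partition rule is not described in this issue.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_of(value: str) -> int:
    # Stand-in hash rule (CRC32 modulo partition count); an assumption,
    # not the partition rule used by the real system.
    return zlib.crc32(value.encode("utf-8")) % NUM_PARTITIONS


in_list = [f"key-{i}" for i in range(1000)]

# Number of in-list values that can actually live in each partition.
relevant = [0] * NUM_PARTITIONS
for v in in_list:
    relevant[partition_of(v)] += 1

# Shared predicate: every partition evaluates the whole in-list.
shared_checks = NUM_PARTITIONS * len(in_list)
# Per-partition predicate: each value is evaluated in one partition only.
pruned_checks = sum(relevant)
```

Under these assumptions, the shared predicate does `NUM_PARTITIONS` times more value checks than the pruned one, and the gap grows with the number of partitions.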

Proposal

Introduce an optimization step that removes the unnecessary values from the in-list predicate of each distributed sub-query plan.

More implementation details are necessary before coding.
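One possible shape for this step, as a minimal sketch rather than a design: group the in-list values by target partition, skip partitions that receive no values, and give each remaining sub-query only its own values. The names `partition_of`, `prune_in_list`, and `sub_query_sql`, the CRC32-based rule, and the sub-table names are all hypothetical.

```python
import zlib
from collections import defaultdict


def partition_of(value: str, num_partitions: int) -> int:
    # Stand-in hash rule; the real partition rule is an open detail.
    return zlib.crc32(value.encode("utf-8")) % num_partitions


def prune_in_list(values, num_partitions):
    """Map each partition index to the in-list values it can contain.

    Partitions that receive no values are absent from the result, so no
    sub-query needs to be sent to them.
    """
    per_partition = defaultdict(list)
    for v in values:
        per_partition[partition_of(v, num_partitions)].append(v)
    return dict(per_partition)


def sub_query_sql(table, column, values):
    # Render a sub-query with the pruned in-list; single quotes are escaped.
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"select * from {table} where {column} in ({quoted})"


pruned = prune_in_list(["a", "b", "c"], num_partitions=4)
plans = {
    # Hypothetical sub-table naming scheme, for illustration only.
    p: sub_query_sql(f"partition_table_{p}", "partition_col", vals)
    for p, vals in pruned.items()
}
```

This only covers in-list predicates on the partition key itself; how the step fits into the existing planner and other partition rules is left to the implementation details mentioned above.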

Additional Context

No response

@ShiKaiWi ShiKaiWi added the feature New feature or request label Jan 15, 2024
@jiacai2050
Contributor

jiacai2050 commented Nov 1, 2024

@zealchen Are you interested in this?

@zealchen
Contributor

zealchen commented Nov 5, 2024

> @zealchen Are you interested in this?

Yes. Let me handle it.
