Improve TableScan with filters pushdown unparsing (multiple filters) #13131

Merged

@sgrebnov sgrebnov commented Oct 26, 2024

Which issue does this PR close?

With the filter pushdown optimization enabled, a LogicalPlan can carry filters on both the TableScan node (pushed-down filters) and a separate Filter node.
When such a plan is unparsed, one filter could overwrite the other. To avoid this, the existing filter is combined with the additional filter.

Example query

select
        c_phone as cntrycode,
        c_acctbal
from
        customer
where c_mktsegment = 'BUILDING' and c_acctbal > (
        select
                avg(c_acctbal)
        from
                customer);

Logical Plan

|  Projection: customer.c_phone AS cntrycode, customer.c_acctbal
|   Filter: CAST(customer.c_acctbal AS Decimal128(38, 6)) > (<subquery>)
|     Subquery:
|     ..
|     TableScan: customer, full_filters=[customer.c_mktsegment = Utf8("BUILDING")]

Without this change, the plan is unparsed with the subquery filter on c_acctbal missing:

select
        c_phone as cntrycode,
        c_acctbal
from
        customer
where c_mktsegment = 'BUILDING'

What changes are included in this PR?

Improves QueryBuilder pub fn selection to combine filters when selection is called multiple times (the selection corresponds to the query filter, i.e. the WHERE clause), instead of letting a later call replace the filter set by an earlier one.
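The combining behavior described above can be sketched as follows. This is a minimal, self-contained illustration and not the code from this PR: the Expr enum, the SelectBuilderSketch struct, and the to_sql helper are hypothetical stand-ins for the real SQL AST and builder types, and joining the two filters with AND is an assumption based on how multiple WHERE conditions compose.

```rust
// Hypothetical, simplified stand-in for a SQL AST expression.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // A leaf predicate kept as raw SQL text (illustration only).
    Raw(String),
    // Conjunction of two predicates (assumed combining operator).
    And(Box<Expr>, Box<Expr>),
}

impl Expr {
    // Render the expression as SQL text. No parentheses are added,
    // so this sketch is only safe for simple AND-able predicates.
    fn to_sql(&self) -> String {
        match self {
            Expr::Raw(s) => s.clone(),
            Expr::And(l, r) => format!("{} AND {}", l.to_sql(), r.to_sql()),
        }
    }
}

// Hypothetical builder holding the WHERE clause of a SELECT.
#[derive(Default)]
struct SelectBuilderSketch {
    selection: Option<Expr>,
}

impl SelectBuilderSketch {
    // Combine the new filter with any existing one instead of overwriting it.
    fn selection(&mut self, value: Option<Expr>) -> &mut Self {
        self.selection = match (self.selection.take(), value) {
            // Both present: keep both, joined by AND.
            (Some(existing), Some(new)) => {
                Some(Expr::And(Box::new(existing), Box::new(new)))
            }
            // Nothing new: keep whatever was already set.
            (existing, None) => existing,
            // First filter seen: store it as-is.
            (None, new) => new,
        };
        self
    }
}
```

With this shape, calling selection once for the TableScan filter and once for the Filter node yields a single WHERE expression containing both predicates, and a call with None leaves the existing filter untouched.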

Are these changes tested?

Added a unit test. Also tested as part of TPC-H and TPC-DS query unparsing in https://github.com/spiceai/spiceai (running benchmarks with some filter pushdown optimizations enabled).

Are there any user-facing changes?

Fixes some unparsing issues where WHERE clause conditions were missing when running TPC-H and TPC-DS queries with the filter pushdown optimization enabled.

@github-actions github-actions bot added the sql SQL Planner label Oct 26, 2024
@sgrebnov sgrebnov force-pushed the sgrebnov/impove-where-unparsing-upstream branch from 01e04af to 4263a54 Compare October 27, 2024 04:59

@goldmedal goldmedal left a comment

Thanks @sgrebnov it looks good to me 👍

@goldmedal goldmedal merged commit 5db2740 into apache:main Oct 27, 2024
24 checks passed