[C++] Support SimplifyWithGuarantee for is_in expressions #43187
Comments
larry98 changed the title from "Support SimplifyWithGuarantee for is_in expressions" to "[C++] Support SimplifyWithGuarantee for is_in expressions" on Jul 8, 2024
@larry98 feel free to do this and @ me if you finish. This requires some enhancement on |
bkietz added a commit that referenced this issue on Sep 10, 2024

### Rationale for this change

Prior to #43256, this PR adds a basic implementation that does a linear scan filter over the value set on each guarantee. This isolates the correctness/semantics of `is_in` predicate simplification from the binary search performance optimization.

### What changes are included in this PR?

`SimplifyWithGuarantee` now handles `is_in` expressions.

### Are these changes tested?

A new unit test was added to arrow-compute-expression-test testing this change.

### Are there any user-facing changes?

No.

* GitHub Issue: #43187

Lead-authored-by: Larry Wang <[email protected]>
Co-authored-by: larry98 <[email protected]>
Co-authored-by: Benjamin Kietzman <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
Issue resolved by pull request 43761 |
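
The linear scan filter mentioned in the commit message can be illustrated with a small standalone sketch. This is not the code from the pull request: the names `CompareOp`, `Guarantee`, `Satisfies`, and `FilterValueSet` are hypothetical, the value set is modeled as a plain `std::vector<int64_t>`, and a guarantee is assumed to be a single comparison of the field against a constant.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a guarantee of the form "field <op> bound".
enum class CompareOp { kLess, kLessEqual, kGreater, kGreaterEqual, kEqual };

struct Guarantee {
  CompareOp op;
  int64_t bound;
};

// Returns true if a value is consistent with the guarantee.
bool Satisfies(int64_t value, const Guarantee& g) {
  switch (g.op) {
    case CompareOp::kLess:         return value < g.bound;
    case CompareOp::kLessEqual:    return value <= g.bound;
    case CompareOp::kGreater:      return value > g.bound;
    case CompareOp::kGreaterEqual: return value >= g.bound;
    case CompareOp::kEqual:        return value == g.bound;
  }
  assert(false && "unhandled CompareOp");
  return false;
}

// Linear scan: keep only the values that can still match under the guarantee.
// An empty result means the is_in predicate can never be true under that guarantee.
std::vector<int64_t> FilterValueSet(const std::vector<int64_t>& value_set,
                                    const Guarantee& g) {
  std::vector<int64_t> kept;
  for (int64_t v : value_set) {
    if (Satisfies(v, g)) kept.push_back(v);
  }
  return kept;
}
```

In this sketch each guarantee costs one pass over the value set, which is the cost the binary search optimization is meant to avoid.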
khwilson pushed a commit to khwilson/arrow that referenced this issue on Sep 14, 2024
zeroshade pushed a commit to zeroshade/arrow that referenced this issue on Sep 30, 2024
Describe the enhancement requested
We'd like to use Parquet predicate pushdown on `is_in` expressions, but this currently isn't supported in `SimplifyWithGuarantee`. We implemented a proof of concept where we sort and deduplicate the `is_in` expression's value set, then have `SimplifyWithGuarantee` binary search on the inequality bound and slice the value set accordingly. This works well, but I'm not sure what the correct interface for enabling this code path should be. Our current approach adds a new field to `SetLookupOptions` which allows the user to declare whether the value set is pre-sorted and deduplicated.

Any thoughts? I'd be happy to put up a PR if we agree on an interface.
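
To make the proof-of-concept idea concrete, here is a minimal standalone sketch of sorting and deduplicating a value set once, then binary searching on the bounds and slicing. It is an illustration only and does not use Arrow's types: `RangeGuarantee`, `Canonicalize`, and `SliceValueSet` are hypothetical names, values are assumed to be `int64_t`, and the guarantee is assumed to be an inclusive range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical guarantee of the form lower <= field <= upper (both inclusive).
struct RangeGuarantee {
  int64_t lower;
  int64_t upper;
};

// Sort and deduplicate the value set once, up front.
std::vector<int64_t> Canonicalize(std::vector<int64_t> values) {
  std::sort(values.begin(), values.end());
  values.erase(std::unique(values.begin(), values.end()), values.end());
  return values;
}

// Binary search for both bounds and return the slice of the value set
// that is still reachable under the guarantee.
std::vector<int64_t> SliceValueSet(const std::vector<int64_t>& sorted_set,
                                   const RangeGuarantee& g) {
  // Precondition: the caller has declared the set sorted (and deduplicated).
  assert(std::is_sorted(sorted_set.begin(), sorted_set.end()));
  auto first = std::lower_bound(sorted_set.begin(), sorted_set.end(), g.lower);
  auto last = std::upper_bound(first, sorted_set.end(), g.upper);
  return std::vector<int64_t>(first, last);
}
```

The precondition check mirrors the interface question raised above: the slicing step is only valid when the value set is known to be sorted, which is what a user-supplied flag would assert.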
Component(s)
C++