Add DataFusion to h2oai/db-benchmark #147

Closed
alamb opened this issue Apr 26, 2021 · 34 comments
Labels: datafusion (Changes in the datafusion crate), help wanted (Extra attention is needed)

Comments

@alamb
Contributor

alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11252

I would like to see DataFusion added to h2oai/db-benchmark so that we can see how we compare to other solutions (including Pandas, Spark, cuDF, and Polars).

Since Polars (another Rust DataFrame library that uses Arrow) has already been added, I am hoping that we can learn from their scripts.

There is an issue filed against db-benchmark for adding DataFusion:

h2oai/db-benchmark#107

@alamb alamb added the datafusion Changes in the datafusion crate label Apr 26, 2021
@houqp houqp added the help wanted Extra attention is needed label Oct 18, 2021
@matthewmturner
Contributor

@alamb @houqp FYI i am picking up on the work @Dandandan started on this.

below are the current results i have after adding join queries:

group by
q1 took 56 ms
q2 took 289 ms
q3 took 1305 ms
q4 took 69 ms
q5 took 1158 ms
q7 took 1198 ms
q10 took 24691 ms

join
q1 took 261 ms
q2 took 367 ms
q3 took 334 ms
q4 took 507 ms
q5 took 1936 ms

@Dandandan
Contributor

Thank you very much @matthewmturner. From looking at your results and extrapolating a bit from my earlier benchmarking and published results it seems like DF does very well on the join queries. Hopefully we can do some real comparisons later.

@jon-chuang

jon-chuang commented Dec 25, 2021

@matthewmturner I'm assuming that's on the 0.5 GB bench?

It does seem to be a little lacking for group by in comparison to polars and others.

@matthewmturner
Contributor

@matthewmturner I'm assuming that's on the 0.5 GB bench?

It does seem to be a little lacking for group by in comparison to polars and others.

Yes. I still need to double check table setup and queries to make sure it's apples to apples though - so take it with a grain of salt for the moment.

@Dandandan
Contributor

I also see that the code which used to load data into partitions using MemTable::load now loads it into 1 partition (with try_new). Not sure if the difference is still big, as we now do round-robin repartitioning of the data, but it would maybe still save a bit.

@matthewmturner
Contributor

@Dandandan my understanding of what i did was set the expected partitions at the ExecutionContext / ExecutionConfig level and then collect that into record batches from a DataFrame with df.collect_partitioned which would take into account the target partitions. That being said i had to play around with that a lot to get a MemTable so im not confident its doing what i expected. I'll look into it a little more though.

If what i did was incorrect is there a more idiomatic way to create a MemTable with the desired number of partitions?
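To make the partitioning question concrete, here is a minimal plain-Python sketch of what round-robin repartitioning does to a list of record batches. It does not use the DataFusion API; `round_robin_partition` is a hypothetical helper and the batches are stand-in strings, so treat it as an illustration of the idea only.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def round_robin_partition(batches: Sequence[T], num_partitions: int) -> List[List[T]]:
    """Deal batches out to partitions in turn (hypothetical helper, not a DataFusion call)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    partitions: List[List[T]] = [[] for _ in range(num_partitions)]
    for index, batch in enumerate(batches):
        # Batch i goes to partition i mod n, so sizes differ by at most one batch.
        partitions[index % num_partitions].append(batch)
    return partitions


# Stand-in "record batches"; real ones would be Arrow RecordBatch objects.
batches = [f"batch{i}" for i in range(7)]
for number, partition in enumerate(round_robin_partition(batches, 3)):
    print(number, partition)
```

Loading into one partition and repartitioning at query time costs an extra pass over the batches, which is the small saving mentioned above.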

Separately, i havent been able to find docs on the supported math functions. Any info you could provide on that?

@matthewmturner
Contributor

Went through the code and found the math functions that are supported. I think it would be nice to add documentation to the datafusion site on what's supported. But of course that is a separate topic - i've created an issue for it.

I think for the more advanced group by queries we'll need to add median, standard dev, and correlation functions. ive created an issue for adding those as well - but hopefully we can submit benchmark without those and add them when the functionality is added.

i think we'll be able to add query 8, going to work on that.

@alamb
Contributor Author

alamb commented Dec 26, 2021

I think for the more advanced group by queries we'll need to add median, standard dev, and correlation functions. ive created an issue for adding those as well - but hopefully we can submit benchmark without those and add them when the functionality is added.

FWIW standard deviation and correlation can be calculated using the existing aggregation functions (aka AVG(X) and AVG(X^2)), numerical precision issues notwithstanding
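As a rough illustration of that idea, the sketch below computes a population standard deviation and a Pearson correlation from averages only, in plain Python rather than SQL. The function names are hypothetical, and the formulas used are the textbook identities Var(X) = AVG(X^2) - AVG(X)^2 and Cov(X, Y) = AVG(X*Y) - AVG(X)*AVG(Y); this is the numerically fragile form the precision caveat refers to.

```python
import math
import statistics


def mean(values):
    return sum(values) / len(values)


def stddev_pop_from_avgs(xs):
    """Population standard deviation from AVG(X) and AVG(X^2)."""
    avg_x = mean(xs)
    avg_x2 = mean([x * x for x in xs])
    # Clamp at zero: rounding can push the difference slightly negative.
    return math.sqrt(max(avg_x2 - avg_x * avg_x, 0.0))


def corr_from_avgs(xs, ys):
    """Pearson correlation from AVG(X), AVG(Y), AVG(X^2), AVG(Y^2), AVG(X*Y)."""
    avg_x, avg_y = mean(xs), mean(ys)
    var_x = mean([x * x for x in xs]) - avg_x * avg_x
    var_y = mean([y * y for y in ys]) - avg_y * avg_y
    cov_xy = mean([x * y for x, y in zip(xs, ys)]) - avg_x * avg_y
    if var_x <= 0 or var_y <= 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov_xy / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(stddev_pop_from_avgs(xs), statistics.pstdev(xs))  # the two values should agree closely
print(corr_from_avgs(xs, ys))  # perfectly linear data, so close to 1
```

In SQL terms each `mean(...)` corresponds to one AVG aggregate over the group, so the whole thing fits into a single GROUP BY pass.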

Median is harder -- I think it will need special casing as it can't be calculated using partial aggregates
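A small plain-Python sketch of why that is: per-partition (sum, count) pairs merge into the exact global average, but per-partition medians do not merge into the global median. The two partitions below are made-up illustrative data.

```python
import statistics

# Two hypothetical partitions of the same column.
partitions = [[1, 2, 3], [4, 5, 6, 7, 100]]
all_values = [value for partition in partitions for value in partition]

# AVG works with partial aggregates: keep (sum, count) per partition, then merge.
partials = [(sum(partition), len(partition)) for partition in partitions]
merged_avg = sum(s for s, _ in partials) / sum(c for _, c in partials)
global_avg = statistics.mean(all_values)

# MEDIAN does not: combining per-partition medians gives a different answer.
median_of_medians = statistics.median([statistics.median(p) for p in partitions])
true_median = statistics.median(all_values)

print("avg   :", merged_avg, "vs", global_avg)
print("median:", median_of_medians, "vs", true_median)
```

An exact median needs all values of a group in one place (or an approximate sketch structure), which is why it needs special handling.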

@matthewmturner
Contributor

@alamb yes agree. The median is needed in same query with std deviation but im going to try adding the one with correlation

@matthewmturner
Contributor

matthewmturner commented Dec 26, 2021

I feel a bit silly for asking this, but is the ability to raise a value / column to a power implemented in datafusion? I've tried the below ways with no luck. Am I missing something obvious?

DataFusion CLI v5.1.0

❯ select 2**2;
🤔 Invalid statement: sql parser error: Expected end of statement, found: 2
❯ select 2^2;
NotImplemented("Unsupported SQL binary operator BitwiseXor")
❯ select power(2,2);
Plan("Invalid function 'power'")

I get the same errors when running on table columns

@matthewmturner
Contributor

matthewmturner commented Dec 26, 2021

im also curious what people think about using the python bindings instead of rust? there are existing utilities we could leverage if so. however, can we generate an optimized build for python like we are in rust?

@houqp @alamb

@alamb
Contributor Author

alamb commented Dec 27, 2021

I feel a bit silly for asking this, but is the ability to raise a value / column to a power implemented in datafusion? I've tried the below ways with no luck. Am I missing something obvious?

I thought it was pow or power but when I tried to dig around I didn't find it implemented. I filed #1493 to track

@alamb
Contributor Author

alamb commented Dec 27, 2021

im also curious what people think about using the python bindings instead of rust? there are existing utilities we could leverage if so. however, can we generate an optimized build for python like we are in rust?

In general, I think using the python bindings is a great idea for integration of datafusion with other systems. I don't know very much about how to build / use them, but I would love to see more documentation on the process :)

@matthewmturner
Contributor

I finished first draft at python bindings for the group by suite (https://github.com/matthewmturner/db-benchmark/blob/datafusion/datafusion/groupby-datafusion.py)

Feedback welcome :)

@Dandandan
Contributor

im also curious what people think about using the python bindings instead of rust? there are existing utilities we could leverage if so. however, can we generate an optimized build for python like we are in rust?

@houqp @alamb

Python bindings are also a good idea.

There are a bit of optimizations not possible with the python bindings, so I would expect it to be a bit slower.

@matthewmturner
Contributor

@Dandandan for my information - which optimizations arent possible?

@Dandandan
Contributor

@Dandandan for my information - which optimizations arent possible?

I meant that the published datafusion bindings do not get the following optimizations that the native rust version does; they are not possible, or at least require some more work:

  • Compiling for the specific target CPU with simd feature enabled
  • Using a custom allocator
  • Full LTO

@alippai
Contributor

alippai commented Dec 27, 2021

I believe @ritchie46 has all these enabled in his python bindings

@matthewmturner
Contributor

For the avoidance of doubt, do we all agree that the Python solution will be the only solution submitted (at least for now)?

@ritchie46
Contributor

I believe @ritchie46 has all these enabled in his python bindings

Yes, only not a specific target cpu of course. But LTO and SIMD work fine.

@alamb
Contributor Author

alamb commented Dec 28, 2021

For the avoidance of doubt, do we all agree that the Python solution will be the only solution submitted (at least for now)?

I think so -- and if we improve the DataFusion python bindings in the process, so much the better 👍

@Dandandan
Contributor

For the avoidance of doubt, do we all agree that the Python solution will be the only solution submitted (at least for now)?

I think so -- and if we improve the DataFusion python bindings in the process, so much the better 👍

Sounds good to me as well. We can always add the native Rust version later.

@bkmgit
Contributor

bkmgit commented Dec 28, 2021

It would be great to compare the different language bindings. What obstacles are there to submitting Rust?

@matthewmturner
Contributor

It would be great to compare the different language bindings. What obstacles are there to submitting Rust?

nothing too big.

  1. creating a utility that writes the required logging output
  2. coordination with h2oai team to get rust / cargo setup. its not clear to me how much has been done based on @Dandandan initial request to get datafusion added.

personally, i agree it would be great to add rust bindings. but i think starting with python and adding rust as a second step would still be a good improvement.

@realno
Contributor

realno commented Jan 4, 2022

I am really interested in the result and thank you @matthewmturner for the work! For some context, I am in the process of proposing and planning for the next-gen analytics platform for my org to improve the performance and scalability - DataFusion caught my eyes in early research. This benchmark will provide important information for decision making and I am happy to help.

One small suggestion - it would be really nice if you can make the test script available in the repo so we can analyze the queries and run/debug them locally. I am new to the project so if it is already available somewhere I apologize and please give some pointers. Thanks!

@matthewmturner
Contributor

@realno thanks for the context.

you can find the PR that i am working on here (h2oai/db-benchmark#240)

you will need to download the data (directions here https://github.com/h2oai/db-benchmark#single-solution-benchmark) and add them to a data directory in the repo.

within the PR i have you will see the scripts datafusion/groupby-datafusion.py and join-datafusion.py.

to run the benchmarks you can do the following (of course youll have to install datafusion with pip):

groupby

SRC_DATANAME=G1_1e7_1e2_0_0 python datafusion/groupby-datafusion.py

join

SRC_DATANAME=J1_1e7_NA_0_0 python datafusion/join-datafusion.py
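For readers who want to see the shape of such a run without opening the PR, here is a minimal sketch of a timing harness driven by the SRC_DATANAME environment variable. It is not the script from the PR: `time_query` is a hypothetical helper, the `data/<name>.csv` path layout is an assumption, and the query is a placeholder callable where a real script would execute a DataFusion query.

```python
import os
import time
from typing import Callable, Dict


def time_query(name: str, run: Callable[[], object], timings: Dict[str, float]) -> None:
    """Run one query callable and record its wall-clock time in seconds."""
    start = time.perf_counter()
    run()
    timings[name] = time.perf_counter() - start
    print(f"{name}: {timings[name]:.6f}")


# Dataset name comes from the environment, as in the commands above.
data_name = os.environ.get("SRC_DATANAME", "G1_1e7_1e2_0_0")
# Assumed layout: the downloaded file sits in a local "data" directory.
data_path = os.path.join("data", f"{data_name}.csv")
print("would read", data_path)

timings: Dict[str, float] = {}
# Placeholder workload; a real script would run the SQL for q1 here.
time_query("q1", lambda: sum(range(100_000)), timings)
```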

hope this helps! let me know if any other questions.

@matthewmturner
Contributor

As an update here the maintainer of the DB-Benchmark repo has left H2O-AI so directed me to reach out to their support team for assistance. I have raised a ticket and will provide update here as it comes.

@matthewmturner
Contributor

Cross post from slack:

I’m working on updating datafusions db-benchmark results based on datafusion v7. i just got a first cut of the results compared to what i produced a couple months ago. i was planning on finalizing the analysis before sharing but i wanted to provide a preview as i may not have time to finish for a day or two. this was produced using datafusion-python on an M1 Macbook.

on December 27th we were at the below for group by:

0.11225258399999993 # q1
0.695109333 # q2
2.932470125 # q3
0.07341450000000016 # q4
3.3075385419999996 # q5
2.9051008750000005 # q7
4.573697916 # q8
68.875322208 # q10

based on datafusion version 7:

q1: 0.03743266599999995
q2: 0.4997687500000001
q3: 2.119365208
q4: 0.034825500000000176
q5: 2.144292417
q7: 2.0165450419999997
q8: 2.9783209999999993
q10: 47.229685542

We’ve seen pretty good performance increases across the board based on the latest release. Compared to currently published db-benchmark that would put datafusion as the fastest / tied for fastest on groupby queries Q1 and Q4. In general, we had similar results to spark.
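To put a number on "across the board", the sketch below divides the December 27th timings by the v7 timings quoted above. It assumes both sets are in seconds and were measured on the same dataset and machine, which the post implies but does not state outright.

```python
# Group-by timings in seconds, copied from the two lists above.
december = {
    "q1": 0.11225258399999993, "q2": 0.695109333, "q3": 2.932470125,
    "q4": 0.07341450000000016, "q5": 3.3075385419999996,
    "q7": 2.9051008750000005, "q8": 4.573697916, "q10": 68.875322208,
}
v7 = {
    "q1": 0.03743266599999995, "q2": 0.4997687500000001, "q3": 2.119365208,
    "q4": 0.034825500000000176, "q5": 2.144292417,
    "q7": 2.0165450419999997, "q8": 2.9783209999999993, "q10": 47.229685542,
}

# Speedup > 1 means v7 is faster than the December run.
speedups = {query: december[query] / v7[query] for query in december}
for query, speedup in speedups.items():
    print(f"{query}: {speedup:.2f}x")
```

Every ratio comes out above 1, with the largest gains on the two shortest queries (q1 and q4).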

For join in december we had:

q1 took 261 ms
q2 took 367 ms
q3 took 334 ms
q4 took 507 ms
q5 took 1936 ms

and now we are at:

q1: 0.5796001249999999
q2: 0.4178434580000001
q3: 0.4701954159999999
q4: 0.4357888750000001
q5: 1.8161980410000003

we have lost some performance on the join side, im not sure why, but compared to other engines we are still doing very well, with basically the best performance across the board.
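The two join lists use different units (milliseconds in December, seconds now), so a per-query comparison needs a conversion first. The sketch below does that; it assumes the newer figures are seconds and that both runs used the same join dataset, neither of which is spelled out in the post.

```python
# December join timings in milliseconds, newer timings in seconds (from the lists above).
december_ms = {"q1": 261, "q2": 367, "q3": 334, "q4": 507, "q5": 1936}
v7_s = {
    "q1": 0.5796001249999999, "q2": 0.4178434580000001, "q3": 0.4701954159999999,
    "q4": 0.4357888750000001, "q5": 1.8161980410000003,
}

slower, faster = [], []
for query, before_ms in december_ms.items():
    before_s = before_ms / 1000.0  # convert ms to s before comparing
    change_pct = (v7_s[query] - before_s) / before_s * 100.0
    (slower if change_pct > 0 else faster).append(query)
    print(f"{query}: {before_s:.3f}s -> {v7_s[query]:.3f}s ({change_pct:+.0f}%)")

print("slower:", slower)
print("faster:", faster)
```

Under those assumptions the regression is confined to q1 through q3, while q4 and q5 are slightly faster than in December.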

Please take these results as preliminary…im still working through things.

Im going to work on adding the missing group by queries now with the latest v7 functionality. i also was thinking of contributing a script that would run the whole db-benchmark process so that anyone could run db-benchmark as needed.

@matthewmturner
Contributor

Another cross post from slack

This is the latest with the new groupby queries (6 and 9) included:

q1: 0.038001750000000056
q2: 0.46213770899999995
q3: 2.206588334
q4: 0.03716179199999958
q5: 2.2481447910000005
q6: 2.099691 NEW
q7: 1.9977297499999995
q8: 3.0949106670000006
q9: 2.20049575 NEW
q10: 49.882744625

Only about half (5) of the engines benchmarked even complete these, so that already puts us in a pretty good spot. However, of those that do complete them we are on the slower side - about tied with the current slowest.
however as @Dandandan noted this is without some optimizations in place. im going to work on adding those next.

also im going to work on building an automation script so anyone can run this benchmark themselves and play with optimizations they have in mind.

@matthewmturner
Contributor

@realno FYI see above for latest - as i know you expressed a specific interest in this.

@realno
Contributor

realno commented Feb 23, 2022

@realno FYI see above for latest - as i know you expressed a specific interest in this.

I am actually reading this and the slack channel, lol :D Thanks @matthewmturner for the update and progress! I will definitely participate in optimization as much as I can. This is a great start!

@matthewmturner
Contributor

@realno oh and I had an issue with the approx_median function. I ended up having to use the quantile function instead. I didn't really get the chance to look into it though.

@realno
Contributor

realno commented Feb 23, 2022

@realno oh and I had an issue with the approx_median function. I ended up having to use the quantile function instead. I didn't really get the chance to look into it though.

I will look into it. If you can gather some info and create an issue please tag me on it. Thanks!

@alamb
Contributor Author

alamb commented Mar 7, 2023

I think this issue is now complete

@alamb alamb closed this as completed Mar 7, 2023
mustafasrepo added a commit that referenced this issue Dec 25, 2023
* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplfications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplfications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunibility issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implemantation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a seperate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visisted

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* Initial comments on working

* License update (#157)

This extension adds Synnada license information to the existing one.

* Adding comments

* Update sort_hash_join.rs

* After merge silent error

* Change the query in HashJoin

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Update rat_exclude_files.txt

* Clippy solving.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunibility issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implemantation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a seperate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visisted

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* After merge

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Before clippy fmt etc.

* lazy loading tables

* mini test

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes

* Before rebase merge

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Update utils.rs

* Print deletion

* Update Cargo.lock

* Refactor for review

* Working without slt

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Change in test folders

* Update join_pipeline_selection.rs

* Update utils.rs

* Before clippy

* Before SLT

* Tests are passing and clippy OK.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* After merge

* Update rat_exclude_files.txt

* Comments visited

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* For mustafa

* Final

* Update rat_exclude_files.txt

* More commenting

* Fix linter errors, compile errors after rebase, Update commit hashes

* After merge refactors

* Dir

* Additional test for coverage

* Update join_disable_repartition_joins.slt

* Review changes, remove code duplicates

* Update subdirectory hashes

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
mustafasrepo added a commit that referenced this issue Dec 25, 2023
* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* Protobuf implementations with roundrobin tests (#193)

* Protobuf implementations with roundrobin

* Proto

* Update mod.rs

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* Protobuf implementations with roundrobin tests (#193)

* Protobuf implementations with roundrobin

* Proto

* Update mod.rs

* Fix linter errors, compile errors after rebase, Update commit hashes, regenerate proto

* Rewrite Filter Predicate (#192)

* Global join selection (#183)

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* Initial comments on working

* License update (#157)

This extension adds Synnada license information to the existing one.

* Adding comments

* Update sort_hash_join.rs

* After merge silent error

* Change the query in HashJoin

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Update rat_exclude_files.txt

* Clippy solving.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* After merge

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Before clippy fmt etc.

* lazy loading tables

* mini test

* Add license, add contribution hash commits, minor changes

* Before rebase merge

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation,Delete docs yaml file

* Update utils.rs

* Print deletion

* Update Cargo.lock

* Refactor for review

* Working without slt

* Change in test folders

* Update join_pipeline_selection.rs

* Update utils.rs

* Before clippy

* Before SLT

* Tests are passing and clippy OK.

* Resolve errors introduced during rebase

* After merge

* Update rat_exclude_files.txt

* Comments visited

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* For mustafa

* Final

* Update rat_exclude_files.txt

* More commenting

* Fix linter errors, compile errors after rebase, Update commit hashes

* After merge refactors

* Dir

* Additional test for coverage

* Update join_disable_repartition_joins.slt

* Review changes, remove code duplicates

* Update subdirectory hashes

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>

* work in progress

* fix after merge

* joins are ok

* code cleaning

* CaseExpr handling

* tests are updated

* Handling sequential projections

* Simplifications

* partition expr update

* ProjectionPushdown becomes a rule

* fix after merge

* remove the subrule

* tpch update

* Minor comment changes

* remove unnecessary state struct

* coalesce batches does not allow pushing the projection down

* tpch changes removed

* minor changes

* addresses reviews

* Update projection_pushdown.rs

* minor

* Review Part 1

* Simplify the API of plan handlers

* Review Part 2

* Review Part 3

* Review Part 4

* Review Part 5

* Review Part 6

* Review Part 7

* Remove duplication of physical_expr matching

* fix documentation

* Minor changes

* Take upstream changes

---------

Co-authored-by: Metehan Yıldırım <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
mustafasrepo added a commit that referenced this issue Dec 25, 2023
* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* Initial comments on working

* License update (#157)

This extension adds Synnada license information to the existing one.

* Adding comments

* Update sort_hash_join.rs

* After merge silent error

* Change the query in HashJoin

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Update rat_exclude_files.txt

* Clippy solving.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* After merge

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Before clippy fmt etc.

* lazy loading tables

* mini test

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes

* Before rebase merge

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Update utils.rs

* Print deletion

* Update Cargo.lock

* Refactor for review

* Working without slt

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Change in test folders

* Update join_pipeline_selection.rs

* Update utils.rs

* Before clippy

* Before SLT

* Tests are passing and clippy OK.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* After merge

* Update rat_exclude_files.txt

* Comments visited

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* For mustafa

* Final

* Update rat_exclude_files.txt

* More commenting

* Fix linter errors, compile errors after rebase, Update commit hashes

* After merge refactors

* Dir

* Additional test for coverage

* Update join_disable_repartition_joins.slt

* Review changes, remove code duplicates

* Update subdirectory hashes

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
mustafasrepo added a commit that referenced this issue Dec 25, 2023
* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* Protobuf implementations with roundrobin tests (#193)

* Protobuf implementations with roundrobin

* Proto

* Update mod.rs

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* Protobuf implementations with roundrobin tests (#193)

* Protobuf implementations with roundrobin

* Proto

* Update mod.rs

* Fix linter errors, compile errors after rebase, Update commit hashes, regenerate proto

* Rewrite Filter Predicate (#192)

* Global join selection (#183)

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* Initial comments on working

* License update (#157)

This extension adds Synnada license information to the existing one.

* Adding comments

* Update sort_hash_join.rs

* After merge silent error

* Change the query in HashJoin

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Update rat_exclude_files.txt

* Clippy solving.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding licence information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* After merge

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Before clippy fmt etc.

* lazy loading tables

* mini test

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding license information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes

* Before rebase merge

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Update utils.rs

* Print deletion

* Update Cargo.lock

* Refactor for review

* Working without slt

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding license information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Change in test folders

* Update join_pipeline_selection.rs

* Update utils.rs

* Before clippy

* Before SLT

* Tests are passing and clippy OK.

* [GITHUB ACTION] Refactor for license and actions (#148)

* Delete datafusion main publication

* Adding license information, refactoring prunability issues

* Update SYNNADA-CONTRIBUTIONS.txt

* Update rat_exclude_files.txt

* Enhanced Pipeline Execution: Now Supporting Complex Query Plans for Improved Performance (#132)

* Very initial test passing algorithm

* Working except a minor bug in interval calculations

* After clippy

* Plan

* initial implementation

* Before prune check ability is added.

Order equivalence implementations will vanish after we send a separate PR

* minor changes

* Fix bug, ordering equivalence random head

* minor changes

* Add ordering equivalence for sort merge join

* Improvement on tests

* Upstream changes

* Add ordering equivalence for sort merge join

* Fmt issues

* Update comment

* Add ordering equivalence support for hash join

* Make 1 file

* Code enhancements/comment improvements

* Add projection cast handling

* Fix output ordering for sort merge join

* projection bug fix

* Minor changes

* minor changes

* simplify sort_merge_join

* Update equivalence implementation

* Update test_utils.rs

* Update cast implementation

* More idiomatic code

* After merge

* Comments visited

* Add key swap according to the children orders

* Refactoring

* After merge refactor

* Update sort_enforcement.rs

* Update datafusion/core/src/physical_optimizer/join_selection.rs

Co-authored-by: Mustafa Akur <[email protected]>

* Comments are applied

* Feature/determine prunability (#139)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

* Determine prunability of tables for join operations (#90)

* ready to review

* license added

* simplifications

* simplifications

* sort expr's are taken separately for each table

* we can return the sort info of the expression now

* check filter conditions

* simplifications

* simplifications

* functions are implemented for SortInfo calculations

* node specialized tableSide functions

* NotImplemented errors are added, test comments are added

* Comment change

* Simplify comparison node calculations

* Simplifications and better commenting

* is_prunable function is updated with new Prunability function

* Indices of sort expressions are updated with intermediate schema columns of the filter

* Unused function is removed

* Future-proof index updating

* An if let check is removed

* simplifications

* Simplifications

* simplifications

* Change if condition

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* fix the tables' unboundedness

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Comment improvements and minor code improvements

* Splitting the order based join selection

* Update rat_exclude_files.txt

* Revert "Feature/determine prunability (#139)"

This reverts commit cf56105.

* Commented

---------

Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

* Bug fix: Fix lexicographical column search among provided ordering (#156)

* License update (#157)

This extension adds Synnada license information to the existing one.

* Sliding Nested Join Algorithm (#142)

* Sliding Hash Join Algorithm (SWHJ) (#147)

* Fix errors introduced during rebase

* Keep Track of Global Ordering Requirement (#165)

* Prunability of Join Filter Physical Expressions (#161)

* BinaryExpr Equivalence (#116)

* Fix errors introduced during rebase

* Support multiple ordered columns on joins and expression graph (#163)

* SlidingHashJoin and SlidingNestedLoopJoin planner integration (#171)

* Add license, add contribution hash commits, minor changes, add scripts to automate hash generation, delete docs yaml file

* Resolve errors introduced during rebase

* After merge

* Update rat_exclude_files.txt

* Comments visited

* Synnada Streaming SQL Tests (#190)

* Adds a new method to construct window function for the given input

* For mustafa

* Final

* Update rat_exclude_files.txt

* More commenting

* Fix linter errors, compile errors after rebase, Update commit hashes

* After merge refactors

* Dir

* Additional test for coverage

* Update join_disable_repartition_joins.slt

* Review changes, remove code duplicates

* Update subdirectory hashes

---------

Co-authored-by: Berkay Şahin <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>

* work in progress

* fix after merge

* joins are ok

* code cleaning

* CaseExpr handling

* tests are updated

* Handling sequential projections

* Simplifications

* partition expr update

* ProjectionPushdown becomes a rule

* fix after merge

* remove the subrule

* tpch update

* Minor comment changes

* remove unnecessary state struct

* coalesce batches does not allow pushing the projection down

* tpch changes removed

* minor changes

* addresses reviews

* Update projection_pushdown.rs

* minor

* Review Part 1

* Simplify the API of plan handlers

* Review Part 2

* Review Part 3

* Review Part 4

* Review Part 5

* Review Part 6

* Review Part 7

* Remove duplication of physical_expr matching

* fix documentation

* Minor changes

* Take upstream changes

---------

Co-authored-by: Metehan Yıldırım <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Mustafa Akur <[email protected]>