feat: add ProofPlan::get_table_references
#323
Merged
JayWhite2357
changed the title
feat: add ProofPlan::get_table_references
feat: add ProofPlan::get_table_references
Oct 27, 2024
iajoiner
approved these changes
Oct 27, 2024
JayWhite2357
force-pushed
the
feat/get-table-references
branch
from
October 27, 2024 03:16
b6ef1df
to
464f3fd
Compare
🎉 This PR is included in version 0.34.1 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀
yarbroughrh
added a commit
that referenced
this pull request
Oct 29, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
<!--
Why are you proposing this change? If this is already explained clearly in the linked issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.
Example: Add `NestedLoopJoinExec`. Closes #345. Since we added `HashJoinExec` in #323 it has been possible to do provable inner joins. However performance is not satisfactory in some cases. Hence we need to fix the problem by implement `NestedLoopJoinExec` and speed up the code for `HashJoinExec`.
-->

# What changes are included in this PR?
<!--
There is no need to duplicate the description in the ticket here but it is sometimes worth providing a summary of the individual changes in this PR.
Example:
- Add `NestedLoopJoinExec`.
- Speed up `HashJoinExec`.
- Route joins to `NestedLoopJoinExec` if the outer input is sufficiently small.
-->

# Are these changes tested?
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code
If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
Example: Yes.
-->
Yes:

Parsing the query: SELECT COUNT(*) AS total_books FROM books...
Done in 5.759542000000001 ms.
Generating proof...Done in 258.379792 ms.
Verifying proof...Verified in 82.497291 ms.
Query Result: Ok(OwnedTable { table: {Identifier { name: "total_books" }: BigInt([10])} })

Parsing the query: SELECT title, author FROM books WHERE rating > 4.5...
Done in 8.903875000000001 ms.
Generating proof...Done in 350.96833399999997 ms.
Verifying proof...Verified in 103.658958 ms.
Query Result: Ok(OwnedTable { table: {Identifier { name: "title" }: VarChar(["Clean Code", "The Clean Coder", "Design Patterns", "Effective Java", "Introduction to Algorithms", "Code Complete"]), Identifier { name: "author" }: VarChar(["Robert C. Martin", "Robert C. Martin", "Erich Gamma", "Joshua Bloch", "Thomas H. Cormen", "Steve McConnell"])} })

Parsing the query: SELECT title, publication_year FROM books WHERE genre = 'Programming' AND publication_year > 2000...
Done in 5.648333 ms.
Generating proof...Done in 257.21125 ms.
Verifying proof...Verified in 111.860208 ms.
Query Result: Ok(OwnedTable { table: {Identifier { name: "title" }: VarChar(["Clean Code", "The Clean Coder", "Effective Java", "Code Complete"]), Identifier { name: "publication_year" }: BigInt([2008, 2011, 2008, 2004])} })

Parsing the query: SELECT author, COUNT(*) AS book_count FROM books GROUP BY author ORDER BY book_count DESC LIMIT 5...
Done in 5.722208999999999 ms.
Generating proof...Done in 181.42775 ms.
Verifying proof...Verified in 79.493166 ms.
Query Result: Ok(OwnedTable { table: {Identifier { name: "author" }: VarChar(["Robert C. Martin", "Andrew Hunt", "Erich Gamma", "Fred Brooks", "Joshua Bloch"]), Identifier { name: "book_count" }: BigInt([2, 1, 1, 1, 1])} })
Dustin-Ray
added a commit
that referenced
this pull request
Oct 29, 2024
# Rationale for this change
- More test examples in the repo provide better documentation.

# What changes are included in this PR?
- stock market dataset
- script to run the test

# Are these changes tested?
Yes
Dustin-Ray
added a commit
that referenced
this pull request
Oct 29, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change

# What changes are included in this PR?

# Are these changes tested?
varshith257
pushed a commit
to varshith257/sxt-proof-of-sql
that referenced
this pull request
Oct 30, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change

# What changes are included in this PR?

# Are these changes tested?
iajoiner
added a commit
that referenced
this pull request
Oct 31, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We need to add `ColumnarValue` so that we can remove `table_length` from `ProofExpr::result_evaluate`, which helps with `ProofPlan` composition.

# What changes are included in this PR?
- add `base::database::ColumnarValue`

# Are these changes tested?
Yes
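To make the rationale concrete, here is a minimal sketch of why a columnar value type lets an expression be evaluated without knowing the table length. The enum below is a hypothetical, simplified stand-in (only `i64` data, invented method names); it is not the definition added to `base::database` in this commit.

```rust
/// Hypothetical, simplified stand-in for `ColumnarValue`.
/// The real type in proof-of-sql may have different variants and element types.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnarValue {
    /// A full column with one value per row.
    Column(Vec<i64>),
    /// A single literal that stands for the same value in every row.
    Literal(i64),
}

impl ColumnarValue {
    /// Expand into a concrete column of `num_rows` rows.
    /// Returns `None` when an existing column's length disagrees with `num_rows`.
    pub fn into_column(self, num_rows: usize) -> Option<Vec<i64>> {
        match self {
            ColumnarValue::Column(values) if values.len() == num_rows => Some(values),
            ColumnarValue::Column(_) => None,
            ColumnarValue::Literal(value) => Some(vec![value; num_rows]),
        }
    }
}

/// Evaluating a literal no longer needs a `table_length` argument:
/// the length is only supplied later, if a concrete column is required.
pub fn evaluate_literal(value: i64) -> ColumnarValue {
    ColumnarValue::Literal(value)
}
```

In this sketch the caller decides when (and whether) to materialise a literal as a column, which is the property the rationale attributes to `ColumnarValue`.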
iajoiner
added a commit
that referenced
this pull request
Nov 4, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
To prepare to remove `table_length` from `ProofExpr::result_evaluate` and return `ColumnarValue`, we need to add operations on `ColumnarValue`.

# What changes are included in this PR?
- add comparison operations on `ColumnarValue`.
- add numerical operations on `ColumnarValue`.

# Are these changes tested?
Will be tested in a future PR.
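As an illustration of what comparison and numerical operations on such a value could look like, the sketch below broadcasts a literal against a column. Everything here is an assumption made for illustration: the generic `Value<T>` enum, the `combine` helper, and the `add` / `less_than` functions are hypothetical names, not the API introduced by this commit.

```rust
/// Hypothetical, simplified columnar value: either a whole column or one literal.
#[derive(Debug, Clone, PartialEq)]
pub enum Value<T> {
    Column(Vec<T>),
    Literal(T),
}

/// Apply `op` element-wise, broadcasting a literal against a column.
/// Returns `None` if `op` fails for any pair or if two columns differ in length.
fn combine<T: Copy, U>(
    lhs: &Value<T>,
    rhs: &Value<T>,
    op: impl Fn(T, T) -> Option<U>,
) -> Option<Value<U>> {
    match (lhs, rhs) {
        (Value::Literal(a), Value::Literal(b)) => op(*a, *b).map(Value::Literal),
        (Value::Column(a), Value::Literal(b)) => a
            .iter()
            .map(|x| op(*x, *b))
            .collect::<Option<Vec<U>>>()
            .map(Value::Column),
        (Value::Literal(a), Value::Column(b)) => b
            .iter()
            .map(|y| op(*a, *y))
            .collect::<Option<Vec<U>>>()
            .map(Value::Column),
        (Value::Column(a), Value::Column(b)) => {
            if a.len() != b.len() {
                return None;
            }
            a.iter()
                .zip(b)
                .map(|(x, y)| op(*x, *y))
                .collect::<Option<Vec<U>>>()
                .map(Value::Column)
        }
    }
}

/// Numerical operation: checked addition, `None` on overflow or length mismatch.
pub fn add(lhs: &Value<i64>, rhs: &Value<i64>) -> Option<Value<i64>> {
    combine(lhs, rhs, |a, b| a.checked_add(b))
}

/// Comparison operation: element-wise `<`, `None` on length mismatch.
pub fn less_than(lhs: &Value<i64>, rhs: &Value<i64>) -> Option<Value<bool>> {
    combine(lhs, rhs, |a, b| Some(a < b))
}
```

Note that a literal combined with a literal stays a literal, so no table length is needed at any point in this sketch.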
tareknaser
pushed a commit
to tareknaser/sxt-proof-of-sql
that referenced
this pull request
Nov 4, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
`Indexes` is not useful since we only use dense indexes from 0 to output table length anyway. Now it is time to remove it.

# What changes are included in this PR?
- remove `Indexes`

# Are these changes tested?
Yes
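The argument for removing `Indexes` can be illustrated with a small sketch. The enum and helper below are hypothetical reconstructions for illustration only (the variant names, `usize` indices, and `select` method are assumptions, not the removed code): if every caller passes a dense range starting at 0, the selector carries no more information than the output length.

```rust
use std::ops::Range;

/// Hypothetical index selector with a sparse and a dense form.
pub enum Indexes {
    /// Arbitrary row positions, in the order they should be emitted.
    Sparse(Vec<usize>),
    /// A contiguous range of row positions.
    Dense(Range<usize>),
}

impl Indexes {
    /// Pick the selected rows out of `column`; `None` if any index is out of bounds.
    pub fn select<T: Copy>(&self, column: &[T]) -> Option<Vec<T>> {
        match self {
            Indexes::Sparse(rows) => rows.iter().map(|&row| column.get(row).copied()).collect(),
            Indexes::Dense(range) => column.get(range.clone()).map(|rows| rows.to_vec()),
        }
    }
}

/// What remains when only `Dense(0..output_length)` is ever used:
/// a plain length is enough to describe the selection.
pub fn select_dense_prefix<T: Copy>(column: &[T], output_length: usize) -> Option<Vec<T>> {
    column.get(..output_length).map(|rows| rows.to_vec())
}
```

Under that assumption, `select_dense_prefix(column, n)` and `Indexes::Dense(0..n).select(column)` agree for every `n`, so the extra type can be dropped.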
tareknaser
pushed a commit
to tareknaser/sxt-proof-of-sql
that referenced
this pull request
Nov 4, 2024
…er` to `FinalRoundBuilder` && split first round proof out of `ProofPlan::result_evaluate` (spaceandtimelabs#257)

Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
`ResultBuilder` isn't really needed for `result_evaluate` when all we need are input table lengths. However, it is crucial in producing the query proof. Hence we need to remove it from the actual query result computation process (`expr.result_evaluate`). At the same time, it should be correctly named as the builder of the first round of the proof, because that's what it is.

# What changes are included in this PR?
The 4 commits are supposed to be reviewed individually.
- rename `ResultBuilder` to `FirstRoundBuilder`, `ProofBuilder` to `FinalRoundBuilder`.
- rename `proof-of-sql/src/sql/proof/result_builder.rs` to `first_round_builder.rs`, `proof-of-sql/src/sql/proof/proof_builder.rs` to `final_round_builder.rs`.
- split out post result challenges into `ProofPlan::first_round_evaluate`
- replace `FirstRoundBuilder` in `result_evaluate` with `input_length`
- rename `ProverEvaluate::prover_evaluate` to `final_round_evaluate`

# Are these changes tested?
Yes
iajoiner
added a commit
that referenced
this pull request
Nov 6, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
The rustfmt config used in CI is tedious to type, hence it is helpful to add it as an alias to `config.toml`.

# What changes are included in this PR?
- add `cargo f` and `cargo cl` aliases to `config.toml`
- remove fmt customization from CI to unify it

# Are these changes tested?
Yes
iajoiner
added a commit
that referenced
this pull request
Nov 7, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
`column_operation.rs` is incredibly long. It is time to split out `ColumnType` manipulations as a separate file. Moreover, since we will most likely have a separate file for `Column` operations and we don't even use either `OwnedColumn` or `Column` in `column_operation.rs`, the name is a misnomer and the file should be renamed `slice_operation.rs`. Decimal-related code is long, has unusual dependencies and can't easily be simplified, hence it should be split into a separate file too.

# What changes are included in this PR?
- Move `ColumnType`-related code to `column_type_operation.rs`
- Move decimal-related slice manipulation code to `slice_decimal_operation.rs`
- Rename the file to `slice_operation.rs`

# Are these changes tested?
Existing tests should pass
iajoiner
added a commit
that referenced
this pull request
Nov 7, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

`column_operation.rs` is incredibly long. It is time to split out `ColumnType` manipulations as a separate file. Moreover, since we will most likely have a separate file for `Column` operations and we don't even use either `OwnedColumn` or `Column` in `column_operation.rs`, the name is a misnomer and the file should be renamed `slice_operation.rs`. Decimal-related code is long, has unusual dependencies and can't easily be simplified, hence it should be split into a separate file too.

- Move `ColumnType`-related code to `column_type_operation.rs`
- Move decimal-related slice manipulation code to `slice_decimal_operation.rs`
- Rename the file to `slice_operation.rs`

Existing tests should pass
iajoiner
added a commit
that referenced
this pull request
Nov 7, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
In order to allow #235 to be done in time for JOIN-related integrations, we need to get the `proof-of-sql-parser` -> `sqlparser` adaptations done. Large parts of the work going forward can then become more manageable.

# What changes are included in this PR?
- add `sqlparser.rs` with adaptations

# Are these changes tested?
Yes
iajoiner
added a commit
that referenced
this pull request
Nov 8, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We want to remove the arg since it has never been used and because it benefits `ProofPlan` composition.

# What changes are included in this PR?
See title.

# Are these changes tested?
Existing tests should pass
iajoiner
added a commit
that referenced
this pull request
Nov 11, 2024
…ve `get_offset` and `get_length` from `ProofPlan` (#361)

Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We need to allow composable `ProofPlan`s and multi-table queries, so it no longer makes sense for every `ProofPlan` to have its own `get_offset` and `get_length` for input length. A `ProofPlan` can have multiple input lengths, and offsets only make sense for source tables (that is, `TableExec`), not for intermediate `ProofPlan`s. Hence we have to refactor `query_proof.rs` to remove any reference to a single input length and a single offset.

For proof purposes, what we actually care about is:
- the smallest row index (that is, the smallest offset) among all input tables,
- the largest row index among all input tables, and
- the length of the longest intermediate table (that is, the largest row index among all intermediate tables if we start from 0).

The last value will remain less than or equal to the second one until we add unions, joins and other `ProofPlan`s that actually make output tables longer than input ones.
<!--
Why are you proposing this change? If this is already explained clearly in the linked issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.

Example:
Add `NestedLoopJoinExec`.
Closes #345.

Since we added `HashJoinExec` in #323 it has been possible to do provable inner joins. However performance is not satisfactory in some cases. Hence we need to fix the problem by implementing `NestedLoopJoinExec` and speed up the code for `HashJoinExec`.
-->

# What changes are included in this PR?
- add `indexset` macro (thanks @JayWhite2357 for putting it in a review comment in #346)
- add `get_index_range` to compute the index range
- refactor `QueryProof` to use the index range as opposed to a single `input_length` / `offset_generators`
- remove `get_offset` and `get_length` from `ProofPlan`, which are no longer used
<!--
There is no need to duplicate the description in the ticket here but it is sometimes worth providing a summary of the individual changes in this PR.

Example:
- Add `NestedLoopJoinExec`.
- Speed up `HashJoinExec`.
- Route joins to `NestedLoopJoinExec` if the outer input is sufficiently small.
-->

# Are these changes tested?
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?

Example:
Yes.
-->
Existing tests do pass
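To make the index-range idea in the commit message above concrete, here is a minimal sketch of how a range covering all input tables could be computed. It is not the crate's `get_index_range`: the `TableSpan` type, the `index_range` function and the choice of a half-open `Range<usize>` are assumptions made for illustration only.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a source table's position: `offset` is the index
/// of its first row and `length` is its number of rows.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableSpan {
    pub offset: usize,
    pub length: usize,
}

/// Returns the half-open row range `min_offset..max_end` spanned by all input
/// tables, or `0..0` when there are no input tables.
pub fn index_range(inputs: &[TableSpan]) -> Range<usize> {
    // Smallest offset among all input tables.
    let min_row = inputs.iter().map(|t| t.offset).min().unwrap_or(0);
    // One past the largest row index among all input tables.
    let max_row = inputs
        .iter()
        .map(|t| t.offset + t.length)
        .max()
        .unwrap_or(0);
    min_row..max_row
}
```

With tables at offsets 5 and 2 and lengths 10 and 4, the range is `2..15`. The length of the longest intermediate table would be tracked separately, since it only matters once a plan can produce more rows than its inputs.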
iajoiner
added a commit
that referenced
this pull request
Nov 11, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This change simplifies column operations, partly to simplify `owned_column_operation.rs` and partly in preparation for the new version of provable arithmetic expressions.

# What changes are included in this PR?
- replace `BinaryOp` and `UnaryOp` in `ColumnOperationError` with strings

# Are these changes tested?
Existing tests should pass
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
…nused functions from `slice_operation.rs` (#359)

Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
`owned_column_operation.rs` is extremely tedious, hence we need to simplify it.

# What changes are included in this PR?
- split out arithmetic operations into `column_arithmetic_operation.rs` and unify them.
- split out comparison operations into `column_comparison_operation.rs` and unify them.
- remove unused functions.

# Are these changes tested?
Yes.
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
In order to make `ProofPlan`s composable it is helpful to add `Table`, which will replace `DataAccessor` in `ProofExpr`. In order to add tests that aren't too tedious it is necessary to add table utilities analogous to the owned table ones.

# What changes are included in this PR?
- add `Table`
- add table utilities in `table_utility.rs`

# Are these changes tested?
Yes.
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
…parser::ast::UnaryOp` in the proof-of-sql crate (#363)

Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This PR replaces `proof_of_sql_parser::intermediate_ast::UnaryOp` with `sqlparser::ast::UnaryOp` in the `proof-of-sql` crate as part of a larger transition toward integrating the `sqlparser` crate. This change is a subtask of issue #235, whose main goal is to streamline the repository by switching to the `sqlparser` crate and gradually replacing intermediary constructs like `proof_of_sql_parser::intermediate_ast` with `sqlparser::ast`.

# What changes are included in this PR?
- All instances of `proof_of_sql_parser::intermediate_ast::UnaryOp` have been replaced with `sqlparser::ast::UnaryOp`.
- Every usage of `UnaryOp` has been updated to maintain the original functionality, ensuring no changes to the logic or behavior.
- Any unsupported `UnaryOp` variants from `sqlparser` are handled using existing error handling mechanisms (i.e., the `Unsupported` variant in `ExpressionEvaluationError`).

# Are these changes tested?
Yes

Part of #235
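The handling of unsupported operator variants described above can be sketched as follows. To keep the example self-contained, the enums below are local, hypothetical stand-ins rather than the real `sqlparser::ast::UnaryOp` or the crate's `ExpressionEvaluationError`; the variant sets, field name and message format are assumptions.

```rust
/// Hypothetical stand-in for a parser-level operator enum that has more
/// variants than the prover supports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParsedUnaryOp {
    Not,
    Minus,
    Plus,
}

/// Hypothetical set of unary operators the prover can actually evaluate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SupportedUnaryOp {
    Not,
}

/// Hypothetical error type with an `Unsupported` variant.
#[derive(Debug, PartialEq, Eq)]
pub enum EvalError {
    Unsupported { expression: String },
}

/// Maps a parsed operator to a supported one, reporting every other variant
/// through the existing `Unsupported` error instead of panicking.
pub fn convert_unary_op(op: ParsedUnaryOp) -> Result<SupportedUnaryOp, EvalError> {
    match op {
        ParsedUnaryOp::Not => Ok(SupportedUnaryOp::Not),
        other => Err(EvalError::Unsupported {
            expression: format!("unary operator {other:?}"),
        }),
    }
}
```

A catch-all arm like this keeps behavior unchanged for the operators that were already supported while remaining total over a larger upstream enum.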
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Since I had auto-merge on, NITs didn't get addressed. I promised to file a PR to resolve them today, hence this PR.

# What changes are included in this PR?
- remove `UnimplementedTestAccessor`, which is no longer used
- remove `ProofPlan::is_empty`, which is only used in `VerifiableQueryResult`, by inlining it
- make code more compact in `query_proof_test.rs` and `verifiable_query_result_test.rs`

# Are these changes tested?
Existing tests should pass.
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
In order to have generic operations on `ColumnarValue` it is necessary to have slice-literal operations.

# What changes are included in this PR?
- add slice-literal operations.

# Are these changes tested?
N/A
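As a rough illustration of what a slice-literal operation means here (one operand is a column slice, the other a single literal value), consider the sketch below. The function names, the `i64` element type and the overflow policy are assumptions for this example, not the functions added by the commit.

```rust
/// Adds `literal` to every element of `values`.
/// Returns `None` if any addition overflows (an assumed policy for this sketch).
pub fn slice_lit_checked_add(values: &[i64], literal: i64) -> Option<Vec<i64>> {
    values.iter().map(|v| v.checked_add(literal)).collect()
}

/// Compares every element of `values` with `literal` for equality.
pub fn slice_lit_eq(values: &[i64], literal: i64) -> Vec<bool> {
    values.iter().map(|v| *v == literal).collect()
}
```

Operations of this shape avoid materializing a full column for the literal side when one operand is a constant.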
iajoiner
added a commit
that referenced
this pull request
Nov 12, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
It is nice for people to be able to inspect serialized commitments and table metadata from a commitments file.

# What changes are included in this PR?
- add `commitment-utility`

# Are these changes tested?
It runs.
iajoiner
added a commit
that referenced
this pull request
Nov 13, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
`TableExec` is to be the most common source `ProofPlan` once all `ProofPlan`s become composable. Basically, all it does is read from a table. I plan to remove table reading functionality from all other `ProofPlan`s soon.

# What changes are included in this PR?
- add `TableExec`

# Are these changes tested?
N/A
iajoiner
added a commit
that referenced
this pull request
Nov 17, 2024
…lparser::ast::BinaryOp` in the proof-of-sql crate (#362)

Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This PR replaces `proof_of_sql_parser::intermediate_ast::BinaryOp` with `sqlparser::ast::BinaryOp` in the `proof-of-sql` crate as part of a larger transition toward integrating the `sqlparser` crate. This change is a subtask of issue #235, whose main goal is to streamline the repository by switching to the `sqlparser` crate and gradually replacing intermediary constructs like `proof_of_sql_parser::intermediate_ast` with `sqlparser::ast`.

# What changes are included in this PR?
- All instances of `proof_of_sql_parser::intermediate_ast::BinaryOp` have been replaced with `sqlparser::ast::BinaryOp`.
- Every usage of `BinaryOp` has been updated to maintain the original functionality, ensuring no changes to the logic or behavior.
- Any unsupported `BinaryOp` variants from `sqlparser` are handled using existing error handling mechanisms (i.e., the `Unsupported` variant in `ExpressionEvaluationError`).

# Are these changes tested?
Yes

Closes #349
Part of #235
iajoiner
added a commit
that referenced
this pull request
Nov 18, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This is the next step in making `ProofPlan` composable. Let's centralize the fetching of table maps via `DataAccessor` in `query_proof.rs`.

# What changes are included in this PR?
See above.

# Are these changes tested?
Existing tests do pass.
iajoiner
added a commit
that referenced
this pull request
Nov 19, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We copied the code from `ProjectionExec` to `TableExec` and made the latter a special case of the former. This is wrong. There is no need to commit any column in `TableExec` since we already grab all the columns in `QueryProof`.

# What changes are included in this PR?
- Do not commit columns in `TableExec`.
- Make it clear that we don't have to select all columns. Following the conventions in `DataFusion` ([TableScan](https://docs.rs/datafusion/latest/datafusion/logical_expr/struct.TableScan.html) in logical planning and [FileScanConfig] in physical planning), we can refuse to read columns we don't care about.

# Are these changes tested?
Existing tests should pass.
iajoiner
added a commit
that referenced
this pull request
Nov 22, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Currently there are still places where `FinalRoundBuilder::table_length` is used in `ProofExpr`, which is wrong once we start composing `ProofPlan`s. Moreover, in cases where a `ProofPlan` causes its output table to be longer than any of the input tables (e.g. joins and unions), we need to bump the `max_row_num`.

# What changes are included in this PR?
- remove `table_length` from `FinalRoundBuilder`
- add `range_length` to `FirstRoundBuilder` to track the max table length in a query (updates are NOT to be applied in any `ProofPlan` that cannot cause its output table to have more rows than any of its inputs, such as filter and projection)
- update `range_length` after `FirstRoundBuilder` goes through the AST
- store `min_row_num` and `max_row_num` in `QueryProof` for the verifier
- remove `FirstRoundBuilder` from tests that only exercise `result_evaluate`

# Are these changes tested?
Existing tests should pass.
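The `range_length` bookkeeping listed above amounts to keeping a running maximum that only row-expanding plans raise. The sketch below illustrates that idea with a hypothetical `RangeLengthTracker` type; it is not the crate's `FirstRoundBuilder`, and the method names are assumptions.

```rust
/// Hypothetical tracker for the longest table seen while walking a query plan.
#[derive(Debug, Default)]
pub struct RangeLengthTracker {
    range_length: usize,
}

impl RangeLengthTracker {
    /// Starts from the length implied by the input tables.
    pub fn new(initial_range_length: usize) -> Self {
        Self {
            range_length: initial_range_length,
        }
    }

    /// Current maximum table length.
    pub fn range_length(&self) -> usize {
        self.range_length
    }

    /// Raises the tracked length to at least `table_length`; it never shrinks.
    /// Only plans that can produce more rows than their inputs (such as joins
    /// and unions) would need to call this.
    pub fn update_range_length(&mut self, table_length: usize) {
        self.range_length = self.range_length.max(table_length);
    }
}
```

Because the update is a maximum, calling it from a plan whose output is no longer than its inputs would be harmless but redundant, which matches the note that filter and projection should skip it.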
iajoiner
added a commit
that referenced
this pull request
Dec 2, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
When `ProofPlan`s compose, it makes no sense to have `table_length` in `VerificationBuilder`. Instead, it becomes necessary to record every length used for a one evaluation and pass that information to the verifier.

# What changes are included in this PR?
- remove `VerificationBuilder::table_length`
- generalize one evaluations from `input_length` and `output_length` to arbitrary lengths
- add `one_evaluation_lengths` in `FinalRoundBuilder` to collect the lengths
- add consumption of the lengths in `VerificationBuilder`

# Are these changes tested?
Yes.
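The produce/consume pattern described above, where the prover records each length in order and the verifier replays them in the same order, can be sketched like this. The types `ProverLengths` and `VerifierLengths` and their methods are hypothetical and only illustrate the ordering contract; they are not the crate's `FinalRoundBuilder` or `VerificationBuilder`.

```rust
use std::collections::VecDeque;

/// Hypothetical prover-side collector of one-evaluation lengths.
#[derive(Debug, Default)]
pub struct ProverLengths {
    lengths: Vec<usize>,
}

impl ProverLengths {
    /// Records a length in the order it is used during proving.
    pub fn produce_length(&mut self, length: usize) {
        self.lengths.push(length);
    }

    /// Hands the recorded lengths over, e.g. to be shipped with the proof.
    pub fn into_lengths(self) -> Vec<usize> {
        self.lengths
    }
}

/// Hypothetical verifier-side consumer that replays the lengths in order.
#[derive(Debug)]
pub struct VerifierLengths {
    remaining: VecDeque<usize>,
}

impl VerifierLengths {
    pub fn new(lengths: Vec<usize>) -> Self {
        Self {
            remaining: lengths.into(),
        }
    }

    /// Returns the next length, or `None` if the proof supplied too few.
    pub fn consume_length(&mut self) -> Option<usize> {
        self.remaining.pop_front()
    }

    /// True once every recorded length has been consumed.
    pub fn is_exhausted(&self) -> bool {
        self.remaining.is_empty()
    }
}
```

A verifier built this way would likely want to reject a proof both when `consume_length` returns `None` and when lengths are left over at the end, though whether the crate checks the latter is not stated in the commit message.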
iajoiner
added a commit
that referenced
this pull request
Dec 3, 2024
Please be sure to look over the pull request guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We need to add this function in order to perform sort-merge joins.

# What changes are included in this PR?
See title.

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 3, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We need to make sure we use the correct table lengths in filter.

# What changes are included in this PR?
- In filter, use a column with 1s up to the input length, as opposed to an all-1 column, for the `c_fold * c_star - input_ones = 0` constraint.

# Are these changes tested?
Existing tests do pass. Moreover slicing tests pass.
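The "1s up to input length" column mentioned above can be pictured with a minimal sketch. The helper name `ones_up_to` and the `u8` element type are assumptions for illustration only; the crate itself works over field elements and its actual helper is not shown on this page.

```rust
/// Hypothetical helper: a column of length `range_len` holding 1 in the
/// first `input_len` rows and 0 in the remaining (padding) rows.
/// If `input_len` exceeds `range_len`, every row is 1.
pub fn ones_up_to(input_len: usize, range_len: usize) -> Vec<u8> {
    (0..range_len).map(|i| u8::from(i < input_len)).collect()
}

/// For contrast, the "all 1" column that ignores the input length.
pub fn all_ones(range_len: usize) -> Vec<u8> {
    vec![1; range_len]
}
```

The two columns differ only in the rows past the input length, which is exactly where a constraint of the form `c_fold * c_star - input_ones = 0` would otherwise be checked against rows that do not belong to the input table.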
iajoiner added a commit that referenced this pull request Dec 4, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Since we are adding range check which will require `FirstRoundBuilder` to read query results we partly undo #257 and merge `ProofPlan::result_evaluate` back into `ProofPlan::first_round_evaluate`.

# What changes are included in this PR?
See above.

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 4, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.
- The following upstream PRs have been merged:
  - [x] #381
  - [x] #401
  - [x] #404

# Rationale for this change
This PR replaces #121 and is designed to test whether our `ProofPlan`s are truly composable now.

# What changes are included in this PR?
- add `SliceExec`.

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 4, 2024

…e` output to `FirstRoundBuilder` (#405)

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
We should have obvious correspondence between how one eval lengths are produced and consumed just like how we handle intermediate MLEs.

# What changes are included in this PR?
- move one eval lengths from `ProofPlan::first_round_builder` output to `FirstRoundBuilder`

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 4, 2024

…op level restriction to `GroupByExec` itself (#407)

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This PR is the equivalent of #401 for `GroupByExec`.

# What changes are included in this PR?
- Replace blanket 1 columns with columns with 1 up to input length in `GroupByExec`.
- Move `GroupByExec` composition restriction to `GroupByExec` itself.

# Are these changes tested?
Existing tests should pass.
iajoiner added a commit that referenced this pull request Dec 10, 2024

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
In `Bumpalo`, if we allocate a non-empty slice and then an empty one, they have the same position. Hence it is necessary to differentiate between them by adding the slice length to the identifier.

# What changes are included in this PR?
See above.

# Are these changes tested?
Yes.
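The collision described above can be reproduced without an arena, as a rough analogy: an empty sub-slice and a non-empty slice can start at the same address. The function names below are hypothetical and the sketch uses plain `std` slices rather than `Bumpalo`; the crate's actual identifier type is not shown on this page.

```rust
/// Hypothetical identifier that only records where a slice starts.
/// Two different slices starting at the same address collide.
pub fn address_only_id<T>(slice: &[T]) -> usize {
    slice.as_ptr() as usize
}

/// Hypothetical identifier that records the start address and the length,
/// so an empty slice and a non-empty slice at the same address differ.
pub fn address_and_len_id<T>(slice: &[T]) -> (usize, usize) {
    (slice.as_ptr() as usize, slice.len())
}
```

With `let buf = [1u64, 2, 3];`, the slices `&buf[..]` and `&buf[..0]` share a start address, so `address_only_id` cannot tell them apart while `address_and_len_id` can.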
iajoiner added a commit that referenced this pull request Dec 10, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Recently @JayWhite2357 found a vulnerability in our `ProofPlan`s with nontrivial proofs related to range length. Now we are implementing a fix.

# What changes are included in this PR?
- add `range_length_one_evaluation` to `SumcheckMleEvaluations`
- add `range_length` to `FinalRoundBuilder`
- fix the bug in the filter proof.

# Are these changes tested?
Existing tests should pass.
iajoiner added a commit that referenced this pull request Dec 11, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.
- The following upstream PRs have been approved and merged:
  - [x] #405

# Rationale for this change
We need to add support for `UNION ALL`.

# What changes are included in this PR?
- add `UnionExec`
- add some code to simplify debugging of sumchecks

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 16, 2024

…sqlparser::ast::Ident` in the proof-of-sql crate (#382)

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This PR replaces `proof_of_sql_parser::Identifier` with `sqlparser::ast::Ident` in the `proof-of-sql` crate as part of a larger transition toward integrating the `sqlparser` crate. This change is a subtask of issue #235, whose main goal is to streamline the repository by switching to the `sqlparser` crate and gradually replacing intermediary constructs like `proof_of_sql_parser::intermediate_ast` with `sqlparser::ast`.

# What changes are included in this PR?
- All instances of `proof_of_sql_parser::Identifier` have been replaced with `sqlparser::ast::Ident`.
- A few of them required an identifier (e.g. `Expression::Column`), which depends on `Identifier` and will be migrated when the Exprs are refactored.
- Every usage of `Identifier` has been updated to maintain the original functionality, ensuring no changes to the logic or behavior.
- The breaking change here is that `Ident` doesn't implement the `Copy` trait, so clones were needed in the places where values are moved.
- Deleted the test `we_cannot_convert_a_record_batch_if_it_has_repeated_column_names` because `sqlparser` now differentiates between uppercase and lowercase identifiers. Case normalization is no longer applied and `sqlparser` treats `a` and `A` as distinct identifiers.
- Examples are updated to align with `sqlparser`'s case-sensitive behavior.

# Are these changes tested?
Yes

Part of #235
iajoiner added a commit that referenced this pull request Dec 16, 2024

…437)

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
This change simplifies the codebase, since `proof_of_sql::base::sqlparser::ident` only does one thing, which is to call `From<&str>` on `Ident`.

# What changes are included in this PR?
See title.

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 16, 2024

…`proof_gadgets` (#433)

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Bitwise verification, sign checks, range checks, subset checks, uniqueness checks, etc. are building blocks of `ProofExpr`s and `ProofPlan`s but are not themselves either. Hence we would like to put them in a separate `proof_gadgets` directory.

# What changes are included in this PR?
Move bitwise verification, sign checks & range checks to `proof_gadgets`.

# Are these changes tested?
Existing tests do pass.
iajoiner added a commit that referenced this pull request Dec 18, 2024

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [ ] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
Update the docs to reflect the migration from `Identifier` to `Ident`.

# What changes are included in this PR?
- Updated docs in the proof-of-sql crate with `Ident`

# Are these changes tested?
Part of #235
iajoiner added a commit that referenced this pull request Dec 18, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.
- The following upstream PRs have been approved and merged:
  - [x] #391
  - [x] #396

# Rationale for this change
This PR adds the actual sort-merge join process which completes a part of #394.

# What changes are included in this PR?
- add `sort_merge_join`.

# Are these changes tested?
Yes.
iajoiner added a commit that referenced this pull request Dec 20, 2024

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md. In particular `!` is used if and only if at least one breaking change has been introduced.
- [x] I have run the ci check script with `source scripts/run_ci_checks.sh`.

# Rationale for this change
For range checks and joins we need pre-challenge committing of columns. As a result we need to add this functionality to `FirstRoundBuilder`.

# What changes are included in this PR?
See above.

# Are these changes tested?
Will be.
# Rationale for this change
An upstream crate needs to know which tables a proof plan queries.

# What changes are included in this PR?
`ProofPlan::get_table_references` is added.

# Are these changes tested?
Yes
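
As a rough illustration of what such a method provides, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Plan` trait, the `TableRef`, `TableScan` and `Union` types, and the `BTreeSet` return type are simplified stand-ins, not the crate's actual `ProofPlan` definition or signature.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a schema-qualified table reference.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct TableRef {
    pub schema: String,
    pub name: String,
}

impl TableRef {
    pub fn new(schema: &str, name: &str) -> Self {
        Self { schema: schema.to_string(), name: name.to_string() }
    }
}

/// Simplified plan trait (assumption): each node reports the tables it reads.
pub trait Plan {
    fn get_table_references(&self) -> BTreeSet<TableRef>;
}

/// Leaf node reading a single table.
pub struct TableScan {
    pub table: TableRef,
}

/// Composite node combining several child plans.
pub struct Union {
    pub inputs: Vec<Box<dyn Plan>>,
}

impl Plan for TableScan {
    fn get_table_references(&self) -> BTreeSet<TableRef> {
        BTreeSet::from([self.table.clone()])
    }
}

impl Plan for Union {
    fn get_table_references(&self) -> BTreeSet<TableRef> {
        // Merge the children's references; the set removes duplicates.
        self.inputs
            .iter()
            .flat_map(|plan| plan.get_table_references())
            .collect()
    }
}
```

An upstream caller could then ask the root of a plan for its table set before deciding which table data or commitments to load. Returning a set rather than a list means a table referenced by several sub-plans is reported once.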