Branch 43 downgraded tonic #5

Open
wants to merge 1,513 commits into
base: main

Conversation

matthewmturner

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

alamb and others added 30 commits October 6, 2024 11:33
* Fix stack overflow calculating projected orderings

* fix docs
* Update to arrow/parquet 53.1.0

* Update some API

* update for changed file sizes

* Use non deprecated APIs

* Use ParquetMetadataReader from @etseidl

* remove upstreamed implementation

* Update CSV schema

* Use upstream is_null and is_not_null kernels
* Add support for serializing and deserializing Substrait ExtendedExpr message

* Address clippy reviews

* Reuse existing rename method
…ache#12571)

* Fix grouping sets behavior when data contains nulls

* PR suggestion comment

* Update new test case

* Add grouping_id to the logical plan

* Add doc comment next to INTERNAL_GROUPING_ID

* Fix unparsing of Aggregate with grouping sets

---------

Co-authored-by: Andrew Lamb <[email protected]>
….md to code (apache#12775)

* Added documentation for string and unicode functions.

* Fixed issues with aliases.

* Cargo fmt.

* Minor doc fixes.

* Update docs for var_pop/samp

---------

Co-authored-by: Andrew Lamb <[email protected]>
…integer (apache#12751)

* fix sig

Signed-off-by: jayzhan211 <[email protected]>

* fix

Signed-off-by: jayzhan211 <[email protected]>

* fix error

Signed-off-by: jayzhan211 <[email protected]>

* fix all signature

Signed-off-by: jayzhan211 <[email protected]>

* fix all signature

Signed-off-by: jayzhan211 <[email protected]>

* change default type

Signed-off-by: jayzhan211 <[email protected]>

* clippy

Signed-off-by: jayzhan211 <[email protected]>

* fix docs

Signed-off-by: jayzhan211 <[email protected]>

* rm deadcode

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* rm test

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
…he#12745)

* remove redundant aggregate documentation

* remove redundant window documentation

* remove redundant scalar functions
* Improve documentation, make DependencyMap / Dependencies a real struct + fix stack overflow

* Update datafusion/physical-expr/src/equivalence/properties.rs

Co-authored-by: Berkay Şahin <[email protected]>

---------

Co-authored-by: Berkay Şahin <[email protected]>
* API to go from `ParquetExec` to `ParquetExecBuilder`

* fix potential regression

* Apply suggestions from code review

Co-authored-by: Nga Tran <[email protected]>

* add note about fields being re-created

---------

Co-authored-by: Nga Tran <[email protected]>
* Minor: add documentation note about `NullState`

* Remove unnecessary copy/paste license

* Update datafusion/expr-common/src/groups_accumulator.rs
…zer crate (apache#12783)

* move test from core to optimizer crate

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* upd

Signed-off-by: jayzhan211 <[email protected]>

* clippy

Signed-off-by: jayzhan211 <[email protected]>

* fmt

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
…pache#12825)

Bumps [cookie](https://github.com/jshttp/cookie) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `cookie` from 0.6.0 to 0.7.1
- [Release notes](https://github.com/jshttp/cookie/releases)
- [Commits](jshttp/cookie@v0.6.0...v0.7.1)

Updates `express` from 4.21.0 to 4.21.1
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.1/History.md)
- [Commits](expressjs/express@4.21.0...4.21.1)

---
updated-dependencies:
- dependency-name: cookie
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Remove unused dependencies and features

* Update Cargo.lock

* Remove regex and base64
* impl primitive arrays generator.

* sort out the test record batch generating codes.

* draft for `DataSetsGenerator`.

* tmp

* improve the data generator, and start to impl the session context generator.

* impl context generator.

* tmp

* define the `AggregationFuzzer`.

* add ut for data generator.

* improve comments for `SessionContextGenerator`.

* define `GeneratedSessionContextBuilder` to reduce repeated codes.

* extract the check equality logic for reusing.

* add ut for `SessionContextGenerator`.

* tmp

* finish the main logic of `AggregationFuzzer`.

* try to rewrite some test using the fuzzer.

* fix header.

* expose table name through `AggregationFuzzerBuilder`.

* throw err to aggr fuzzer, and expect them then.

* switch to Arc<str> to slightly improve performance.

* throw more errors to fuzzer.

* print task information before panic.

* improve comments.

* support printing generated session context params in error reporting.

* add todo.

* add some new fuzz case based on `AggregationFuzzer`.

* fix lint.

* print more information in error report.

* fix clippy.

* improve comment of `SessionContextGenerator`.

* just use fixed `data_gen_rounds` and `ctx_gen_rounds` currently, because we will hardly set them.

* improve comments for rounds constants.

* small improvements.

* select sql from some candidates rather than a fixed one.

* make `data_gen_rounds` able to set again, and add more tests.

* add no group cases.

* add fuzz test for basic string aggr.

* make `data_gen_rounds` smaller.

* add comments.

* fix typo.

* fix comment.
…pache#12804)

* Patched from `lead-lag` conversion tree

* Fixes unit tests in `row_number` udwf

* Add doc comments

* Updates doc comment

* Updates API to expose `input_exprs` directly

* Updates API to return data types of input expressions
`Setup rust toolchain` build step was observed to be flaky. Retries may
help.
…int[3]`) (apache#12810)

* add arm to process array def with square bracket as fixedsizelist

* add tests
* Make HashJoinExec::join_schema public

It is needed by physical optimizers that want to replace the HashJoin with a different
type of join, as they need to replace it with an equivalent projection, but
HashJoinExec::projection could not be used to build it because it refers to
indices in HashJoinExec::join_schema.

* Replace it with an accessor
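
The reason the accessor is needed can be illustrated with a toy model. This is a minimal sketch, not DataFusion code: `join_schema` and `project` are hypothetical helpers, a schema is reduced to a list of column names, and the join schema is assumed to be the left columns followed by the right columns. It only shows that projection indices are meaningless without the schema they index into.

```rust
/// Toy stand-in for a join schema: left columns followed by right columns.
/// (Assumption for illustration; a real schema also carries types and nullability.)
fn join_schema(left: &[&str], right: &[&str]) -> Vec<String> {
    left.iter().chain(right.iter()).map(|s| s.to_string()).collect()
}

/// Resolve projection indices against the schema they refer to.
/// Panics if an index is out of bounds, which is acceptable for a sketch.
fn project(schema: &[String], projection: &[usize]) -> Vec<String> {
    projection.iter().map(|&i| schema[i].clone()).collect()
}
```

With left columns `a, b` and right columns `c, d`, the projection `[1, 3]` selects `b` and `d`. An optimizer that only saw the indices, and not the join schema, could not rebuild an equivalent projection for a replacement join.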
adriangb and others added 25 commits November 1, 2024 11:23
* implement target type selection for range queries on dictionary data types

Fixes apache#13151

* Update type_coercion.rs

* Add test

* query I?
…re string tests (apache#13197)

* empty

* Allow testing values with trailing whitespace in SLT tests

* Update SLT tests for "Allow testing values with trailing whitespace ..."

* Add empty string to string test data
…he#13079)

* Use single file write when an extension is present in the path.

* Adjust formatting.

* Remove unneeded return statement.
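
As a rough illustration of the rule in the first bullet, the check below treats a path as a single-file target when it has a file extension and does not end in a directory separator. This is a sketch using only `std::path`; the function name `is_single_file_target` is hypothetical and the actual DataFusion check may differ.

```rust
use std::path::Path;

/// Hypothetical helper: decide whether an output path names a single file.
/// A trailing '/' is treated as "directory", even if the last component contains a dot.
fn is_single_file_target(path: &str) -> bool {
    !path.ends_with('/') && Path::new(path).extension().is_some()
}
```

Under this sketch `out/data.parquet` is written as one file, while `out/data/` and `out.d/` are treated as directories.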
* Minor: make `Expr::volatile` infallible

* cmt
…che#13127)

* feat(substrait): handle emit_kind when consuming Substrait plans

* cargo fmt

* avoid projection flattening for volatile expressions

* simplify application of apply_emit_kind
* fix: date_bin() on timestamps before 1970

The date_bin() function was not working correctly for timestamps before
1970. Specifically if the input timestamp was the exact time of the
start of a bin then it would be placed in the previous bin.

The % operator has a negative result when the dividend is negative.
This causes the date_bin calculation to round up to the next bin. To
compensate, the size of one interval is subtracted from the result if the
input is negative. This subtraction is no longer performed if the input
is already the exact time of the start of a bin.

* fix clippy

---------

Co-authored-by: Andrew Lamb <[email protected]>
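
The date_bin fix described above can be sketched with plain integers. This is an illustration of the arithmetic in the commit message, not the DataFusion source: the function names `bin_start_before_fix` and `bin_start_after_fix` are hypothetical, and timestamps and strides are simple `i64` values in the same unit.

```rust
/// Behaviour described as buggy: for inputs before the origin, one stride is
/// always subtracted, even when the input already sits on a bin boundary.
fn bin_start_before_fix(stride: i64, source: i64, origin: i64) -> i64 {
    let diff = source - origin;
    // `%` keeps the sign of the dividend, so this rounds toward the origin.
    let delta = diff - (diff % stride);
    if diff < 0 && stride > 1 {
        origin + delta - stride
    } else {
        origin + delta
    }
}

/// Behaviour after the fix: skip the compensation when the input is already
/// the exact start of a bin (that is, when `delta == diff`).
fn bin_start_after_fix(stride: i64, source: i64, origin: i64) -> i64 {
    let diff = source - origin;
    let delta = diff - (diff % stride);
    if diff < 0 && stride > 1 && delta != diff {
        origin + delta - stride
    } else {
        origin + delta
    }
}
```

With a stride of 10 and an origin of 0, an input of -20 is itself a bin start. The first function returns -30 (the previous bin), the second returns -20. An input of -25 lands in the bin starting at -30 in both versions.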
…he#13179)

`invoke_batch` is the one used now. The others are no longer in use and
we should deprecate and remove them.
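
One way to picture this kind of consolidation is a trait with a single required entry point and deprecated convenience methods that forward to it. The sketch below is an assumption-laden illustration: `Value`, `ScalarFn`, `AddOne`, and `RowNumber` are hypothetical stand-ins, not DataFusion's types or trait, and DataFusion's own default implementations may forward in a different direction.

```rust
/// Toy stand-in for a columnar value: a scalar or an array of i64.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(i64),
    Array(Vec<i64>),
}

/// Hypothetical trait: `invoke_batch` is the one required method.
trait ScalarFn {
    #[deprecated(note = "use invoke_batch")]
    fn invoke(&self, args: &[Value]) -> Result<Value, String> {
        // Infer the row count from the first array argument, defaulting to 1.
        let rows = args
            .iter()
            .find_map(|a| match a {
                Value::Array(v) => Some(v.len()),
                Value::Scalar(_) => None,
            })
            .unwrap_or(1);
        self.invoke_batch(args, rows)
    }

    #[deprecated(note = "use invoke_batch with an empty argument slice")]
    fn invoke_no_args(&self, number_rows: usize) -> Result<Value, String> {
        self.invoke_batch(&[], number_rows)
    }

    fn invoke_batch(&self, args: &[Value], number_rows: usize) -> Result<Value, String>;
}

/// One-argument function: adds 1 to a scalar or to every array element.
struct AddOne;

impl ScalarFn for AddOne {
    fn invoke_batch(&self, args: &[Value], _number_rows: usize) -> Result<Value, String> {
        match args {
            [Value::Scalar(x)] => Ok(Value::Scalar(x + 1)),
            [Value::Array(xs)] => Ok(Value::Array(xs.iter().map(|x| x + 1).collect())),
            _ => Err("AddOne expects exactly one argument".to_string()),
        }
    }
}

/// Zero-argument function: needs `number_rows` to know how much to produce.
struct RowNumber;

impl ScalarFn for RowNumber {
    fn invoke_batch(&self, _args: &[Value], number_rows: usize) -> Result<Value, String> {
        Ok(Value::Array((0..number_rows as i64).collect()))
    }
}
```

Because `invoke_batch` receives both the arguments and the row count, it covers the with-arguments case (`AddOne`) and the zero-argument case (`RowNumber`) through one method, which is why the other variants become redundant.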
* consider volatile function in simply_expression

* refactor and fix bugs

* fix clippy

* refactor

* refactor

* format

* fix clippy

* Resolve logical conflict

* simplify more

---------

Co-authored-by: Andrew Lamb <[email protected]>
* Conversion types for LexOrdering and LexOrderingRef to structs.

* Format and fix type errors. Adjusted expected output when using `LexOrdering`.

* Updated usage of `FromIterator` and removed `empty()` in favor of `default()`.

* Adjusted chained `map` and `flatten` calls to `flat_map`, and swapped `unwrap_or` to `unwrap_or_default`.

* Adjusted slt files to include a space after commas, when relating to LexOrdering and LexOrderingRef.

* Removed unnecessary path prefixes in `sort_expr`.

* Fixed tpch slt files.

* Removed LexOrderingRef struct.

* Removed dereferences to `LexOrderingRef` left over from the struct removal.

* Removed remaining usage of the raw `LexOrderingRef` type.

* Formatting.

* Apply suggestions from code review, along with formatting.

* Merged with main.

* Merged with main.

---------

Co-authored-by: nglime <[email protected]>
* array_resize null fix

* comment

* clippy

* fixes
* Derive `Clone` for more ExecutionPlans

* improve docs
* [logical-types] add NativeType and LogicalType

* Add license header

* Add NativeField and derivatives

* Support TypeSignatures

* Fix doc

* Add documentation

* Fix doc tests

* Remove dummy test

* From NativeField to LogicalField

* Add default_cast_for

* Add type order with can_cast_types

* Rename NativeType Utf8 to String

* NativeType from &DataType

* Add builtin types

* From LazyLock to OnceLock
* Apply projection to `Statistics` in `FilterExec`

* Use Statistics::project in HashJoin
…er (apache#13130)

* overloaded from ts

* Update docs/source/user-guide/sql/scalar_functions_new.md

Co-authored-by: Bruce Ritchie <[email protected]>

* fixed return type

* added sql example

* optional in doc

* review

---------

Co-authored-by: Bruce Ritchie <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
…13174)

* Deprecate invoke and invoke_no_args in favor of invoke_batch

`invoke_batch` covers all needs, so let's deprecate and eventually
remove the redundant variants.

* Migrate test_function to invoke_batch

* Migrate regexpcount tests to invoke_batch

* Migrate log tests to invoke_batch

* Migrate tests to use invoke_batch

* Migrate ToUnixtimeFunc to implement invoke_batch

* Suppress deprecation warnings in tests

To be followed-up on.

* Migrate random benchmark to invoke_batch

* fixup! Suppress deprecation warnings in tests

* Fix docstring
* Remove deprecated transform functions

They were deprecated since 38.0.0, which was released 6 months ago.

* Remove deprecated and unused FileSinkExec type

It was deprecated since 38.0.0, which was released 6 months ago.
akurmustafa and others added 3 commits December 27, 2024 18:10
* Initial commit

* Fix formatting

* Add across partitions check

* Add new test case

Add a new test case

* Fix buggy test