diff --git a/dev/changelog/27.0.0.md b/dev/changelog/27.0.0.md new file mode 100644 index 000000000000..305e238b8861 --- /dev/null +++ b/dev/changelog/27.0.0.md @@ -0,0 +1,203 @@ + + +## [27.0.0](https://github.com/apache/arrow-datafusion/tree/27.0.0) (2023-06-26) + +[Full Changelog](https://github.com/apache/arrow-datafusion/compare/26.0.0...27.0.0) + +**Breaking changes:** + +- Remove `avro_to_arrow::reader::Reader::next` in favor of `Iterator` implementation. [#6538](https://github.com/apache/arrow-datafusion/pull/6538) (LouisGariepy) +- Add support for appending data to external tables - CSV [#6526](https://github.com/apache/arrow-datafusion/pull/6526) (mustafasrepo) +- Move `physical_plan::file_format` to `datasource::plan` [#6516](https://github.com/apache/arrow-datafusion/pull/6516) (alamb) +- Remove `FromSlice` in favor of `From` impl in upstream arrow-rs code [#6587](https://github.com/apache/arrow-datafusion/pull/6587) (alamb) +- Improve main api doc page, move `avro_to_arrow` to `datasource` [#6564](https://github.com/apache/arrow-datafusion/pull/6564) (alamb) +- Fix Clippy module inception (unwrap `datasource::datasource` and `catalog::catalog` [#6640](https://github.com/apache/arrow-datafusion/pull/6640) (LouisGariepy) +- refactor: unify generic expr rewrite functions into the `datafusion_expr::expr_rewriter` [#6644](https://github.com/apache/arrow-datafusion/pull/6644) (r4ntix) +- Move `PhysicalPlanner` to `physical_planer` module [#6570](https://github.com/apache/arrow-datafusion/pull/6570) (alamb) +- Update documentation for creating User Defined Aggregates (AggregateUDF) [#6729](https://github.com/apache/arrow-datafusion/pull/6729) (alamb) +- Support User Defined Window Functions [#6703](https://github.com/apache/arrow-datafusion/pull/6703) (alamb) +- Minor: Move `PartitionStream` to physical_plan [#6756](https://github.com/apache/arrow-datafusion/pull/6756) (alamb) + +**Implemented enhancements:** + +- feat: support type coercion in Parquet Reader 
[#6458](https://github.com/apache/arrow-datafusion/pull/6458) (e1ijah1) +- feat: New functions and operations for working with arrays [#6384](https://github.com/apache/arrow-datafusion/pull/6384) (izveigor) +- feat: `DISTINCT` bitwise and boolean aggregate functions [#6581](https://github.com/apache/arrow-datafusion/pull/6581) (izveigor) +- feat: make_array support empty arguments [#6593](https://github.com/apache/arrow-datafusion/pull/6593) (parkma99) +- feat: encapsulate physical optimizer rules into a struct [#6645](https://github.com/apache/arrow-datafusion/pull/6645) (waynexia) +- feat: new concatenation operator for working with arrays [#6615](https://github.com/apache/arrow-datafusion/pull/6615) (izveigor) +- feat: add `-c option` to pass the SQL query directly as an argument on datafusion-cli [#6765](https://github.com/apache/arrow-datafusion/pull/6765) (r4ntix) + +**Fixed bugs:** + +- fix: ignore panics if racing against catalog/schema changes [#6536](https://github.com/apache/arrow-datafusion/pull/6536) (Weijun-H) +- fix: type coercion support date - date [#6578](https://github.com/apache/arrow-datafusion/pull/6578) (jackwener) +- fix: avoid panic in `list_files_for_scan` [#6605](https://github.com/apache/arrow-datafusion/pull/6605) (Folyd) +- fix: analyze/optimize plan in `CREATE TABLE AS SELECT` [#6610](https://github.com/apache/arrow-datafusion/pull/6610) (jackwener) +- fix: remove type coercion of case expression in Expr::Schema [#6614](https://github.com/apache/arrow-datafusion/pull/6614) (jackwener) +- fix: correct test timestamp_add_interval_months [#6622](https://github.com/apache/arrow-datafusion/pull/6622) (jackwener) +- fix: fix more panics in `ListingTable` [#6636](https://github.com/apache/arrow-datafusion/pull/6636) (Folyd) +- fix: median with even number of `Decimal128` not working [#6634](https://github.com/apache/arrow-datafusion/pull/6634) (izveigor) +- fix: port unstable subquery to sqllogicaltest 
[#6659](https://github.com/apache/arrow-datafusion/pull/6659) (jackwener) +- fix: correct wrong test [#6667](https://github.com/apache/arrow-datafusion/pull/6667) (jackwener) +- fix: from_plan shouldn't use original schema [#6595](https://github.com/apache/arrow-datafusion/pull/6595) (jackwener) +- fix: correct the error type [#6712](https://github.com/apache/arrow-datafusion/pull/6712) (jackwener) +- fix: parser for negative intervals [#6698](https://github.com/apache/arrow-datafusion/pull/6698) (izveigor) + +**Documentation updates:** + +- Minor: Fix doc for round function [#6661](https://github.com/apache/arrow-datafusion/pull/6661) (viirya) +- Docs: Improve documentation for `struct` function` [#6754](https://github.com/apache/arrow-datafusion/pull/6754) (alamb) + +**Merged pull requests:** + +- fix: ignore panics if racing against catalog/schema changes [#6536](https://github.com/apache/arrow-datafusion/pull/6536) (Weijun-H) +- Remove `avro_to_arrow::reader::Reader::next` in favor of `Iterator` implementation. [#6538](https://github.com/apache/arrow-datafusion/pull/6538) (LouisGariepy) +- Support ordering analysis with expressions (not just columns) by Replace `OrderedColumn` with `PhysicalSortExpr` [#6501](https://github.com/apache/arrow-datafusion/pull/6501) (mustafasrepo) +- Prepare for 26.0.0 release [#6533](https://github.com/apache/arrow-datafusion/pull/6533) (andygrove) +- fix Incorrect function-name matching with disabled enable_ident_normalization [#6528](https://github.com/apache/arrow-datafusion/pull/6528) (parkma99) +- Improve error messages with function name suggestion. 
[#6520](https://github.com/apache/arrow-datafusion/pull/6520) (2010YOUY01) +- Docs: add more PR guidance in contributing guide (smaller PRs) [#6546](https://github.com/apache/arrow-datafusion/pull/6546) (alamb) +- feat: support type coercion in Parquet Reader [#6458](https://github.com/apache/arrow-datafusion/pull/6458) (e1ijah1) +- Update to object_store 0.6 and arrow 41 [#6374](https://github.com/apache/arrow-datafusion/pull/6374) (tustvold) +- feat: New functions and operations for working with arrays [#6384](https://github.com/apache/arrow-datafusion/pull/6384) (izveigor) +- Add support for appending data to external tables - CSV [#6526](https://github.com/apache/arrow-datafusion/pull/6526) (mustafasrepo) +- [Minor] Update hashbrown to 0.14 [#6562](https://github.com/apache/arrow-datafusion/pull/6562) (Dandandan) +- refactor: use bitwise and boolean compute functions [#6568](https://github.com/apache/arrow-datafusion/pull/6568) (izveigor) +- Fix panic propagation in `CoalescePartitions`, consolidates panic propagation into `RecordBatchReceiverStream` [#6507](https://github.com/apache/arrow-datafusion/pull/6507) (alamb) +- Move `physical_plan::file_format` to `datasource::plan` [#6516](https://github.com/apache/arrow-datafusion/pull/6516) (alamb) +- refactor: remove type_coercion in PhysicalExpr. 
[#6575](https://github.com/apache/arrow-datafusion/pull/6575) (jackwener) +- Minor: remove `tokio_stream` dependency [#6565](https://github.com/apache/arrow-datafusion/pull/6565) (alamb) +- minor: remove useless mut and borrow() [#6580](https://github.com/apache/arrow-datafusion/pull/6580) (jackwener) +- Add tests for object_store builders of datafusion-cli [#6576](https://github.com/apache/arrow-datafusion/pull/6576) (r4ntix) +- Avoid per-batch field lookups in SchemaMapping [#6563](https://github.com/apache/arrow-datafusion/pull/6563) (tustvold) +- Move `JoinType` and `JoinCondition` to `datafusion_common` [#6572](https://github.com/apache/arrow-datafusion/pull/6572) (alamb) +- chore(deps): update substrait requirement from 0.10.0 to 0.11.0 [#6579](https://github.com/apache/arrow-datafusion/pull/6579) (dependabot[bot]) +- refactor: bitwise kernel right and left shifts [#6585](https://github.com/apache/arrow-datafusion/pull/6585) (izveigor) +- fix: type coercion support date - date [#6578](https://github.com/apache/arrow-datafusion/pull/6578) (jackwener) +- make page filter public [#6523](https://github.com/apache/arrow-datafusion/pull/6523) (jiacai2050) +- Minor: Remove some `use crate::` uses in physical_plan [#6573](https://github.com/apache/arrow-datafusion/pull/6573) (alamb) +- feat: `DISTINCT` bitwise and boolean aggregate functions [#6581](https://github.com/apache/arrow-datafusion/pull/6581) (izveigor) +- Make the struct function return the correct data type. 
[#6594](https://github.com/apache/arrow-datafusion/pull/6594) (jiangzhx) +- fix: avoid panic in `list_files_for_scan` [#6605](https://github.com/apache/arrow-datafusion/pull/6605) (Folyd) +- fix: analyze/optimize plan in `CREATE TABLE AS SELECT` [#6610](https://github.com/apache/arrow-datafusion/pull/6610) (jackwener) +- Minor: Add additional docstrings to Window function implementations [#6592](https://github.com/apache/arrow-datafusion/pull/6592) (alamb) +- Remove `FromSlice` in favor of `From` impl in upstream arrow-rs code [#6587](https://github.com/apache/arrow-datafusion/pull/6587) (alamb) +- [Minor] Cleanup tpch benchmark [#6609](https://github.com/apache/arrow-datafusion/pull/6609) (Dandandan) +- Revert "feat: Implement the bitwise_not in NotExpr (#5902)" [#6599](https://github.com/apache/arrow-datafusion/pull/6599) (jackwener) +- Port remaining tests in functions.rs to sqllogictest [#6608](https://github.com/apache/arrow-datafusion/pull/6608) (jiangzhx) +- fix: remove type coercion of case expression in Expr::Schema [#6614](https://github.com/apache/arrow-datafusion/pull/6614) (jackwener) +- Minor: use upstream `dialect_from_str` [#6616](https://github.com/apache/arrow-datafusion/pull/6616) (alamb) +- Minor: Move `PlanType`, `StringifiedPlan` and `ToStringifiedPlan` `datafusion_common` [#6571](https://github.com/apache/arrow-datafusion/pull/6571) (alamb) +- fix: correct test timestamp_add_interval_months [#6622](https://github.com/apache/arrow-datafusion/pull/6622) (jackwener) +- Impl `Literal` trait for `NonZero*` types [#6627](https://github.com/apache/arrow-datafusion/pull/6627) (Folyd) +- style: make clippy happy and remove redundant prefix [#6624](https://github.com/apache/arrow-datafusion/pull/6624) (jackwener) +- Substrait: Fix incorrect join key fields (indices) when same table is being used more than once [#6135](https://github.com/apache/arrow-datafusion/pull/6135) (nseekhao) +- Minor: Add debug logging for schema mismatch errors 
[#6626](https://github.com/apache/arrow-datafusion/pull/6626) (alamb) +- Minor: Move functionality into `BuildInScalarFunction` [#6612](https://github.com/apache/arrow-datafusion/pull/6612) (alamb) +- Add datafusion-cli tests to the CI Job [#6600](https://github.com/apache/arrow-datafusion/pull/6600) (r4ntix) +- Refactor joins test to sqllogic [#6525](https://github.com/apache/arrow-datafusion/pull/6525) (aprimadi) +- fix: fix more panics in `ListingTable` [#6636](https://github.com/apache/arrow-datafusion/pull/6636) (Folyd) +- fix: median with even number of `Decimal128` not working [#6634](https://github.com/apache/arrow-datafusion/pull/6634) (izveigor) +- Unify formatting of both groups and files up to 5 elements [#6637](https://github.com/apache/arrow-datafusion/pull/6637) (qrilka) +- feat: make_array support empty arguments [#6593](https://github.com/apache/arrow-datafusion/pull/6593) (parkma99) +- Minor: cleanup the unnecessary CREATE TABLE aggregate_test_100 statement at aggregate.slt [#6641](https://github.com/apache/arrow-datafusion/pull/6641) (jiangzhx) +- chore(deps): update sqllogictest requirement from 0.13.2 to 0.14.0 [#6646](https://github.com/apache/arrow-datafusion/pull/6646) (dependabot[bot]) +- Improve main api doc page, move `avro_to_arrow` to `datasource` [#6564](https://github.com/apache/arrow-datafusion/pull/6564) (alamb) +- Minor: Move `include_rank` into `BuiltInWindowFunctionExpr` [#6620](https://github.com/apache/arrow-datafusion/pull/6620) (alamb) +- Prioritize UDF over scalar built-in function in case of function name… [#6601](https://github.com/apache/arrow-datafusion/pull/6601) (epsio-banay) +- feat: encapsulate physical optimizer rules into a struct [#6645](https://github.com/apache/arrow-datafusion/pull/6645) (waynexia) +- Fix date_trunc signature [#6632](https://github.com/apache/arrow-datafusion/pull/6632) (alamb) +- Return correct scalar types for date_trunc [#6638](https://github.com/apache/arrow-datafusion/pull/6638) (viirya)
+- Insert supports specifying column names in any order [#6628](https://github.com/apache/arrow-datafusion/pull/6628) (jonahgao) +- Fix Clippy module inception (unwrap `datasource::datasource` and `catalog::catalog` [#6640](https://github.com/apache/arrow-datafusion/pull/6640) (LouisGariepy) +- Add hash support for PhysicalExpr and PhysicalSortExpr [#6625](https://github.com/apache/arrow-datafusion/pull/6625) (mustafasrepo) +- Port tests in joins.rs to sqllogictes [#6642](https://github.com/apache/arrow-datafusion/pull/6642) (jiangzhx) +- Minor: Add test for date_trunc schema on scalars [#6655](https://github.com/apache/arrow-datafusion/pull/6655) (alamb) +- Simplify and encapsulate window function state management [#6621](https://github.com/apache/arrow-datafusion/pull/6621) (alamb) +- Minor: Move get_equal_orderings into `BuiltInWindowFunctionExpr`, remove `BuiltInWindowFunctionExpr::as_any` [#6619](https://github.com/apache/arrow-datafusion/pull/6619) (alamb) +- minor: use sql to setup test data for joins.slt rather than rust [#6656](https://github.com/apache/arrow-datafusion/pull/6656) (alamb) +- Support wider range of Subquery, handle the Count bug [#6457](https://github.com/apache/arrow-datafusion/pull/6457) (mingmwang) +- fix: port unstable subquery to sqllogicaltest [#6659](https://github.com/apache/arrow-datafusion/pull/6659) (jackwener) +- Minor: Fix doc for round function [#6661](https://github.com/apache/arrow-datafusion/pull/6661) (viirya) +- refactor: unify generic expr rewrite functions into the `datafusion_expr::expr_rewriter` [#6644](https://github.com/apache/arrow-datafusion/pull/6644) (r4ntix) +- Minor: add test cases for coercion bitwise shifts [#6651](https://github.com/apache/arrow-datafusion/pull/6651) (izveigor) +- refactor: unify replace count(\*) analyzer by removing it in sql crate [#6660](https://github.com/apache/arrow-datafusion/pull/6660) (jackwener) +- Combine evaluate_stateful and evaluate_inside_range 
[#6665](https://github.com/apache/arrow-datafusion/pull/6665) (mustafasrepo) +- Support internal cast for BuiltinScalarFunction::MakeArray [#6607](https://github.com/apache/arrow-datafusion/pull/6607) (jayzhan211) +- minor: use sql to setup test data for aggregate.slt rather than rust [#6664](https://github.com/apache/arrow-datafusion/pull/6664) (jiangzhx) +- Minor: Add tests for User Defined Aggregate functions [#6669](https://github.com/apache/arrow-datafusion/pull/6669) (alamb) +- fix: correct wrong test [#6667](https://github.com/apache/arrow-datafusion/pull/6667) (jackwener) +- fix: from_plan shouldn't use original schema [#6595](https://github.com/apache/arrow-datafusion/pull/6595) (jackwener) +- feat: new concatenation operator for working with arrays [#6615](https://github.com/apache/arrow-datafusion/pull/6615) (izveigor) +- Minor: Add more doc strings to WindowExpr [#6663](https://github.com/apache/arrow-datafusion/pull/6663) (alamb) +- minor: `with_new_inputs` replace `from_plan` [#6680](https://github.com/apache/arrow-datafusion/pull/6680) (jackwener) +- Docs: Update roadmap to point at EPIC's, clarify project goals [#6639](https://github.com/apache/arrow-datafusion/pull/6639) (alamb) +- Disable incremental compilation on CI [#6688](https://github.com/apache/arrow-datafusion/pull/6688) (alamb) +- Allow `AggregateUDF` to define retractable batch , implement sliding window functions [#6671](https://github.com/apache/arrow-datafusion/pull/6671) (alamb) +- Minor: Update user guide [#6692](https://github.com/apache/arrow-datafusion/pull/6692) (comphead) +- Minor: consolidate repartition test into sql_integration to save builder space and build time [#6685](https://github.com/apache/arrow-datafusion/pull/6685) (alamb) +- Minor: combine `statistics`, `filter_pushdown` and `custom_sources provider` tests together to reduce CI disk space [#6683](https://github.com/apache/arrow-datafusion/pull/6683) (alamb) +- Move `PhysicalPlanner` to `physical_planer` module 
[#6570](https://github.com/apache/arrow-datafusion/pull/6570) (alamb) +- Rename integration tests to match crate they are defined in [#6687](https://github.com/apache/arrow-datafusion/pull/6687) (alamb) +- Minor: combine fuzz tests into a single binary to save builder space and build time [#6684](https://github.com/apache/arrow-datafusion/pull/6684) (alamb) +- Minor: consolidate datafusion_substrait tests into `substrait_integration` to save builder space and build time #6685 [#6686](https://github.com/apache/arrow-datafusion/pull/6686) (alamb) +- removed self.all_values.len() from inside reserve [#6689](https://github.com/apache/arrow-datafusion/pull/6689) (BryanEmond) +- Replace supports_bounded_execution with supports_retract_batch [#6695](https://github.com/apache/arrow-datafusion/pull/6695) (mustafasrepo) +- Move `dataframe` and `dataframe_functon` into `core_integration` test binary [#6697](https://github.com/apache/arrow-datafusion/pull/6697) (alamb) +- refactor: fix clippy allow too many arguments [#6705](https://github.com/apache/arrow-datafusion/pull/6705) (aprimadi) +- Fix documentation typo [#6704](https://github.com/apache/arrow-datafusion/pull/6704) (aprimadi) +- fix: correct the error type [#6712](https://github.com/apache/arrow-datafusion/pull/6712) (jackwener) +- Port test in subqueries.rs from rust to sqllogictest [#6675](https://github.com/apache/arrow-datafusion/pull/6675) (jiangzhx) +- Improve performance/memory usage of HashJoin datastructure (5-15% improvement on selected TPC-H queries) [#6679](https://github.com/apache/arrow-datafusion/pull/6679) (Dandandan) +- refactor: alias() should skip add alias for `Expr::Sort` [#6707](https://github.com/apache/arrow-datafusion/pull/6707) (jackwener) +- chore(deps): update strum/strum_macros requirement from 0.24 to 0.25 [#6717](https://github.com/apache/arrow-datafusion/pull/6717) (jackwener) +- Move alias generator to per-query execution props 
[#6706](https://github.com/apache/arrow-datafusion/pull/6706) (aprimadi) +- fix: parser for negative intervals [#6698](https://github.com/apache/arrow-datafusion/pull/6698) (izveigor) +- Minor: Improve UX for setting `ExecutionProps::query_execution_start_time` [#6719](https://github.com/apache/arrow-datafusion/pull/6719) (alamb) +- add Eq and PartialEq to ListingTableUrl [#6725](https://github.com/apache/arrow-datafusion/pull/6725) (fsdvh) +- Support Expr::InList to Substrait::RexType [#6604](https://github.com/apache/arrow-datafusion/pull/6604) (jayzhan211) +- MINOR: Add maintains input order flag to CoalesceBatches [#6730](https://github.com/apache/arrow-datafusion/pull/6730) (mustafasrepo) +- Minor: Update copyight date on website [#6727](https://github.com/apache/arrow-datafusion/pull/6727) (alamb) +- Display all partitions and files in EXPLAIN VERBOSE [#6711](https://github.com/apache/arrow-datafusion/pull/6711) (qrilka) +- Update `arrow`, `arrow-flight` and `parquet` to `42.0.0` [#6702](https://github.com/apache/arrow-datafusion/pull/6702) (alamb) +- Move `PartitionEvaluator` and window_state structures to `datafusion_expr` crate [#6690](https://github.com/apache/arrow-datafusion/pull/6690) (alamb) +- Hash Join Vectorized collision checking [#6724](https://github.com/apache/arrow-datafusion/pull/6724) (Dandandan) +- Return null for date_trunc(null) instead of panic [#6723](https://github.com/apache/arrow-datafusion/pull/6723) (BryanEmond) +- `derive(Debug)` for `Expr` [#6708](https://github.com/apache/arrow-datafusion/pull/6708) (parkma99) +- refactor: extract merge_projection common function. 
[#6735](https://github.com/apache/arrow-datafusion/pull/6735) (jackwener) +- Fix up some `DataFusionError::Internal` errors with correct type [#6721](https://github.com/apache/arrow-datafusion/pull/6721) (alamb) +- Minor: remove some uses of unwrap [#6738](https://github.com/apache/arrow-datafusion/pull/6738) (alamb) +- Minor: remove dead code with decimal datatypes from `in_list` [#6737](https://github.com/apache/arrow-datafusion/pull/6737) (izveigor) +- Update documentation for creating User Defined Aggregates (AggregateUDF) [#6729](https://github.com/apache/arrow-datafusion/pull/6729) (alamb) +- Support User Defined Window Functions [#6703](https://github.com/apache/arrow-datafusion/pull/6703) (alamb) +- MINOR: Aggregate ordering substrait support [#6745](https://github.com/apache/arrow-datafusion/pull/6745) (mustafasrepo) +- chore(deps): update itertools requirement from 0.10 to 0.11 [#6752](https://github.com/apache/arrow-datafusion/pull/6752) (jackwener) +- refactor: move some code in physical_plan/common.rs before tests module [#6749](https://github.com/apache/arrow-datafusion/pull/6749) (aprimadi) +- Add support for order-sensitive aggregation for multipartitions [#6734](https://github.com/apache/arrow-datafusion/pull/6734) (mustafasrepo) +- Update sqlparser-rs to version `0.35.0` [#6753](https://github.com/apache/arrow-datafusion/pull/6753) (alamb) +- Docs: Update SQL status page [#6736](https://github.com/apache/arrow-datafusion/pull/6736) (alamb) +- fix typo [#6761](https://github.com/apache/arrow-datafusion/pull/6761) (Weijun-H) +- Minor: Move `PartitionStream` to physical_plan [#6756](https://github.com/apache/arrow-datafusion/pull/6756) (alamb) +- Docs: Improve documentation for `struct` function` [#6754](https://github.com/apache/arrow-datafusion/pull/6754) (alamb) +- add UT to verify the fix on "issues/6606" [#6762](https://github.com/apache/arrow-datafusion/pull/6762) (mingmwang) +- Re-export modules individually to fix rustdocs 
[#6757](https://github.com/apache/arrow-datafusion/pull/6757) (alamb) +- Order Preserving RepartitionExec Implementation [#6742](https://github.com/apache/arrow-datafusion/pull/6742) (mustafasrepo) +- feat: add `-c option` to pass the SQL query directly as an argument on datafusion-cli [#6765](https://github.com/apache/arrow-datafusion/pull/6765) (r4ntix)