19.0.0 (2023-02-24)
Breaking changes:
- Use DataFusionError instead of ArrowError in SendableRecordBatchStream #5101 (comphead)
- Update to arrow 32 and Switch to RawDecoder for JSON #5056 [sql] (tustvold)
- Allow `SessionContext::read_csv`, etc to read multiple files #4908 (saikrishna1-bidgely)
Implemented enhancements:
- Ignore Arrow in dependabot #5340
- Provide access to internal fields of SessionContext #5317
- Investigate performance drop for DISTINCT queries #5313
- [DOC] Update math expression documentation #5312
- Replace merge_batches with concat_batches #5297
- Support for some of the window frame range queries #5275
- Make `log` function to be in sync with PostgresSql #5259
- [SQLLogicTest] Make schema validation ignore nullable and metadata attributes #5231
- Add support for linear groups search #5213
- Add SQL function overload `LOG(base, x)` for logarithm of x to base #5206
- `all_schema()` will get schema of child of child of .... #5192
- Enable parquet parallel scans by default #5125
- Don't repartition ProjectionExec when it does not compute anything #4968
- Support non-tuple expression for Exists Subquery to Join #4934
- Read multiple files/folders using `read_csv` #4909
Fixed bugs:
- Make inline_table_scan optimize whole plan during first optimization stage. #5364
- tpcds_logical_q8 ambiguous name. #5334
- Protobuf serialisation is missing for GetIndexedFieldExpr #5323
- Indexing a nested list with 0 or an index larger than list size is not handled correctly #5310
- Protobuf serialization drops `preserve_partitioning` from `SortExec` #5305
- data file without suffix can't be read correctly #5301
- Idk #5298
- Error with query that has DISTINCT with ORDER BY and aliased select list #5293
- Optimizer prunes UnnestExec on aggregate count #5281
- Strange Behaviour on RepartitionExec with CoalescePartitionsExec. #5278
- Error "For SELECT DISTINCT, ORDER BY expressions id must appear in select list" may be over eager #5255
- SQL allows SORT BY keyword #5247
- test `sort_on_window_null_string` failed after disable `skip_fail`. #5233
- Dataframe API adds ?table? qualifier #5187
- Re-ordering Projections in scan are not working anymore (since DF15) #5146
- parquet page level skipping (page index pruning) doesn't work with evolved schemas #5104
- Incorrect results on queries with `distinct` and orderby #5065
- NestedLoopJoin will panic when right child contains RepartitionExec #5022
- JSON projection only work when the index is in ascending order #4832
- Stack overflows when planning tpcds 22 in debug mode #4786
- Failed to create Left anti join physical plan due to SchemaError::FieldNotFound #4366
- Filters/limit are not pushdown during optimization for table with alias #2270
Documentation updates:
- Update README.md fix [welcoming community] links #5232 (jiangzhx)
- Update README.md update blaze-rs link to https://github.com/blaze-init/blaze #5190 (jiangzhx)
- Typo of greptimedb #5103 (fengjiachun)
- chore: change `DataBend` to `Databend` #5096 (xudong963)
Closed issues:
- Implement column number / column type verification for sqllogictest #4499
Merged pull requests:
- generate new projection plan in inline_table_scan instead of discarding #5371 (jackwener)
- minor: fix rule name and comment. #5370 (jackwener)
- minor: port limit tests to sqllogictests #5355 (jackwener)
- feat: add rule to merge projection. #5349 (jackwener)
- Ignore Arrow in dependabot #5341 (iajoiner)
- minor: remove useless `.get()` #5336 (jackwener)
- bugfix: fix tpcds_logical_q8 ambiguous name. #5335 (jackwener)
- minor: disable tpcds_logical_q10/q35 #5333 (jackwener)
- minor: port intersection sql tests to sqllogictests #5331 (alamb)
- minor: port more window tests to sqllogictests #5330 (alamb)
- MINOR: nicer error messages for cli, use display format rather than debug #5329 (kmitchener)
- Add missing protobuf serialisation functionality GetIndexedFieldExpr. #5324 (ahmedriza)
- chore: small typo in the example README #5319 (gianarb)
- feat: add accessor to SessionContext fields for ContextProvider impl #5318 (sunng87)
- [DOC] Update math expression documentation #5316 (comphead)
- Fix nested list indexing when the index is 0 or larger than the list size #5311 (ahmedriza)
- Fix SortExec bench case and Add SortExec input cases to bench for SortPreservingMergeExec #5308 (jaylmiller)
- Allow DISTINCT with ORDER BY and an aliased select list #5307 [sql] (alamb)
- Serialize preserve_partitioning in SortExec #5306 (thinkharderdev)
- fix: correct plan builder when test `scalar_subquery_project_expr` #5304 (jackwener)
- Make SQL query consistent with API syntax expression in code examples #5303 (ongchi)
- enable tpcds-64 test #5302 (jackwener)
- Feature/merge batches removal #5300 (berkaysynnada)
- fix: add yield point to `RepartitionExec` #5299 (crepererum)
- `datafusion.optimizer.repartition_file_scans` enabled by default #5295 (korowa)
- minor: derive Ord/PartialOrd/Eq/PartialEq traits for `ObjectStoreUrl` #5288 (crepererum)
- Fix the potential bug of check_all_column_from_schema #5287 (ygf11)
- Linear search support for Window Group queries #5286 [sql] (mustafasrepo)
- Prevent optimizer from pruning UnnestExec. #5282 (vincev)
- Minor: Add fetch to SortExec display #5279 (thinkharderdev)
- Set `catalog_list` from outside for `SessionState`. #5277 (MichaelScofield)
- Support page skipping / page_index pushdown for evolved schemas #5268 (alamb)
- Use upstream newline_delimited_stream #5267 (tustvold)
- Support non-tuple expression for exists-subquery to join #5264 (ygf11)
- minor: Fix cargo fmt #5263 (alamb)
- minor: replace `unwrap()` with `?` #5262 (jackwener)
- Preserve `TableScan.projection` order in `push_down_projection` optimizer rule #5261 (korowa)
- Minor: refactor ParquetExec roundtrip tests #5260 (alamb)
- [fix][plan] relax the check for distinct, order by for dataframe #5258 [sql] (xiaoyong-z)
- enhance the checking of type errors in the test `window_frame_creation` #5257 (HaoYang670)
- SQL planning benchmarks for very wide tables #5256 (alamb)
- Minor: Add negative test for SORT BY #5254 (alamb)
- [sqllogictest] Define output types and check them in tests #5253 (melgenek)
- Minor: port some explain test to sqllogictest, add filename normalization #5252 (alamb)
- Disallow SORT BY in SQL #5249 [sql] (Jefffrey)
- [SQLLogicTest] Make schema validation ignore nullable and metadata attributes #5246 (comphead)
- Add SQL function overload LOG(base, x) for logarithm of x to base #5245 (comphead)
- Update sqllogictest requirement from 0.11.1 to 0.12.0 #5237 #5244 (alamb)
- Test case for NDJsonExec with randomly ordered projection #5243 (korowa)
- Update to arrow `33.0.0` #5241 [sql] (tustvold)
- DataFusion 18.0.0 Release #5240 [sql] (andygrove)
- fix clippy in nightly #5238 (jackwener)
- refactor: correct the implementation of `all_schemas()` #5236 (jackwener)
- bugfix: fix error when `get_coerced_window_frame` meet `utf8` #5234 (jackwener)
- Feature/sort enforcement refactor #5228 (mustafasrepo)
- Minor: Fix doc links and typos #5225 (Jefffrey)
- fix: correct expected error in test #5224 (jackwener)
- bugfix: fix propagating empty_relation generates an illegal plan #5219 (yukkit)
- Replace placeholders in ScalarSubqueries #5216 [sql] (avantgardnerio)
- Dataframe join_on method #5210 [sql] (Jefffrey)
- bugfix: fix eval `nullable()` in `simplify_exprs` #5208 (jackwener)
- minor: remove unnecessary clone #5207 (Ted-Jiang)
- minor: extract `merge_schema()` function. #5203 (jackwener)
- minor: remove unnecessary `continue` #5200 (xiaoyong-z)
- Minor: Begin porting some window tests to sqllogictests #5199 (alamb)
- fix(MemTable): make it cancel-safe and fix parallelism #5197 (DDtKey)
- fix: make `write_csv/json/parquet` cancel-safe #5196 (DDtKey)
- Support arithmetic operation on DictionaryArray #5194 (viirya)
- sqllogicaltest: add cleanup and use rowsort. #5189 (jackwener)
- bugfix: fix `TableScan` may contain fields not included in `schema` #5188 (jackwener)
- Create disk manager spill folder if doesn't exist #5185 (comphead)
- Parse identifiers properly for TableReferences #5183 [sql] (Jefffrey)
- Fix decimal scalar dyn kernels #5179 (viirya)
- Patch git Safe Paths in CI #5177 (tustvold)
- Add initial support for serializing physical plans with Substrait #5176 (andygrove)
- Bump tokio from 1.24.1 to 1.24.2 in /datafusion-cli #5172 (dependabot[bot])
- Make EnforceSorting global sort aware, fix sort mis-optimizations involving unions, support parallel sort + merge transformations #5171 (mustafasrepo)
- Update substrait README.md #5168 (jiangzhx)
- Switch to use sum kernel from arrow-rs for Decimal128 #5167 (sunchao)
- FileStream: Open next file in parallel while decoding #5161 (thinkharderdev)
- Fix FairSpillPool try_grow for non-spillable consumers #5160 (tustvold)
- fix: treat unsupported SQL plans as "not implemented" #5159 (crepererum)
- Compare NULL types #5158 (melgenek)
- Always wrapping OnceAsync for the inner table side in NestedLoopJoinExec #5156 (ygf11)
- chore: add object_name_to_table_reference in SqlToRel #5155 [sql] (jiacai2050)
- Ambiguity check for where selection #5153 [sql] (Jefffrey)
- feat: Type coercion for Dictionary(_, _) to Utf8 for regex conditions #5152 (stuartcarnie)
- Support arithmetic scalar operation with DictionaryArray #5151 (viirya)
- [sqllogictest] Support `pg_typeof` #5148 (melgenek)
- Date to Timestamp cast #5140 (comphead)
- add example for Flight SQL server that supports JDBC driver #5138 (kmitchener)
- Add in-list test #5135 (nseekhao)
- [BugFix] abort plan if order by column not in select list #5132 [sql] (xiaoyong-z)
- Bug fix: Empty Record Batch handling #5131 (mustafasrepo)
- Add option to control whether to normalize ident #5124 [sql] (jiacai2050)
- Make `parse_physical_expr` public #5118 (comphead)
- Support coercing `utf8` to `interval` and `timestamp` (including arguments to `date_bin`) #5117 (alamb)
- Fix release issues #5116 [sql] (andygrove)
- minor: port date_bin tests to sqllogictests #5115 (alamb)
- Minor: reduce code duplication using `rewrite_expr` #5114 [sql] (alamb)
- Replace &Option<T> with Option<&T> #5113 (gaoxinge)
- Improve `get_meet_of_orderings` to check for common prefixes #5111 (ozankabak)
- [sqllogictest] Apply rowsort when there is no explicit order by #5110 (melgenek)
- Add unnest_column to DataFrame #5106 (vincev)
- Minor: reduce indent level in page filter pruning code #5105 (alamb)
- Replace &Option<T> with Option<&T> #5102 (gaoxinge)
- Minor: remove unused methods in datafusion/optimizer/src/utils.rs #5098 (ygf11)
- ci: don't trigger rust ci for doc changes #5097 (xudong963)
- sqllogicaltest: fix unstable slt case. #5095 (jackwener)
- chore: update cranelift-module #5094 (jackwener)
- refactor: Add `rewrite_expr` convenience method for rewriting `Expr`s #5092 (alamb)
- Minor: extract sort col rewrite into its own module, add unit tests #5088 (alamb)
- [sqllogictest] Move `decimal.rs` tests #5086 (melgenek)
- Insert target columns empty fix #5079 [sql] (gruuya)
- sqllogicaltest: move union.rs #5075 (jackwener)
- [Enhancement] Don't repartition ProjectionExec when it does not compute anything #5074 (xiaoyong-z)
- Support ORDER BY an aliased column #5067 [sql] (alamb)
- Parquet parallel scan #5057 (korowa)
- [BugFix] fix file stream time scanning metrics bug #5020 (xiaoyong-z)
- Show optimization errors in explain #4819 [sql] (Jefffrey)