rust-v0.17.0 (2024-02-06)
The 0.17.0 release moves storage implementations into their own crates, such as deltalake-aws. A consequence of that refactoring is that custom storage and file scheme handlers must be registered/initialized at runtime. Storage subcrates conventionally define a register_handlers function which performs that task. Users who skip this step may see errors such as:
thread 'main' panicked at /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/deltalake-core-0.17.0/src/table/builder.rs:189:48:
The specified table_uri is not valid: InvalidTableLocation("Unknown scheme: s3")
- Users of the meta-crate (deltalake) can call the storage crate via deltalake::aws::register_handlers(None); at the entrypoint for their code.
- Users who adopt core and storage crates independently (e.g. deltalake-aws) can register by calling deltalake_aws::register_handlers(None); in the same place.
The AWS, Azure, and GCP crates must all have their custom file schemes registered in this fashion.
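The reason an unregistered scheme fails can be illustrated with a small standard-library model of runtime registration. This is only a sketch and not the deltalake implementation: SchemeRegistry, register, resolve, and s3_factory are hypothetical names invented for illustration, and the real register_handlers functions may work differently internally.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a storage backend factory: it receives the
/// table URI and returns a description of the store it would create.
type Factory = fn(&str) -> String;

/// Hypothetical registry mapping URI schemes to factories.
#[derive(Default)]
struct SchemeRegistry {
    factories: HashMap<String, Factory>,
}

impl SchemeRegistry {
    /// Model of what a storage crate's registration step does:
    /// associate each scheme it owns with its factory.
    fn register(&mut self, schemes: &[&str], factory: Factory) {
        for scheme in schemes {
            self.factories.insert(scheme.to_string(), factory);
        }
    }

    /// Resolve a table URI. Fails when no handler was registered for
    /// the URI's scheme, mirroring the "Unknown scheme" error above.
    fn resolve(&self, table_uri: &str) -> Result<String, String> {
        let scheme = match table_uri.split_once("://") {
            Some((scheme, _rest)) => scheme,
            None => return Err(format!("Missing scheme: {}", table_uri)),
        };
        match self.factories.get(scheme) {
            Some(factory) => Ok(factory(table_uri)),
            None => Err(format!("Unknown scheme: {}", scheme)),
        }
    }
}

/// Hypothetical factory for the "s3" scheme.
fn s3_factory(table_uri: &str) -> String {
    format!("s3 store for {}", table_uri)
}
```

In this model, calling resolve("s3://bucket/table") on a fresh registry returns an "Unknown scheme: s3" error, and the same call succeeds once register(&["s3"], s3_factory) has run. That ordering is why the registration call belongs at the program's entrypoint, before any table is opened.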
The locking mechanism is fundamentally different between deltalake v0.16.x and v0.17.0. Starting with this release, the deltalake and deltalake-aws crates rely on the same protocol for concurrent writes on AWS as the Delta Lake/Spark implementation.
Fundamentally the DynamoDB table structure changes, which is documented here. The configuration of a Rust process should continue to use the AWS_S3_LOCKING_PROVIDER environment value of dynamodb. The new table must be specified with the DELTA_DYNAMO_TABLE_NAME environment or configuration variable, and that should name the new S3DynamoDbLogStore compatible DynamoDB table.
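As a rough sketch of the two settings described above, the following standard-library helpers build and validate a settings map before a writer starts. The function names locking_options and validate_locking_options are hypothetical and not part of deltalake, and the table name a caller passes in is whatever DynamoDB table their deployment created. Whether such a map is handed to the library as storage options or exported as environment variables is left to the caller.

```rust
use std::collections::HashMap;

/// Build the two locking-related settings named in the release notes.
/// `table_name` should be the new S3DynamoDbLogStore compatible table.
fn locking_options(table_name: &str) -> HashMap<String, String> {
    let mut options = HashMap::new();
    options.insert(
        "AWS_S3_LOCKING_PROVIDER".to_string(),
        "dynamodb".to_string(),
    );
    options.insert(
        "DELTA_DYNAMO_TABLE_NAME".to_string(),
        table_name.to_string(),
    );
    options
}

/// Check that both settings are present and that the provider is
/// `dynamodb`; returns a description of the first problem found.
fn validate_locking_options(options: &HashMap<String, String>) -> Result<(), String> {
    match options.get("AWS_S3_LOCKING_PROVIDER").map(String::as_str) {
        Some("dynamodb") => {}
        Some(other) => return Err(format!("unexpected locking provider: {}", other)),
        None => return Err("AWS_S3_LOCKING_PROVIDER is not set".to_string()),
    }
    match options.get("DELTA_DYNAMO_TABLE_NAME") {
        Some(name) if !name.is_empty() => Ok(()),
        _ => Err("DELTA_DYNAMO_TABLE_NAME is not set".to_string()),
    }
}
```

Running a check like this at startup surfaces a missing table name as a clear configuration error instead of a failure during the first concurrent write.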
Because locking is required to ensure safe, consistent writes, there is no iterative migration: 0.16 and 0.17 writers cannot safely coexist. The following steps should be taken when upgrading:
- Stop all 0.16.x writers
- Ensure writes are completed, and lock table is empty.
- Deploy 0.17.0 writers
Implemented enhancements:
- Expose the ability to compile DataFusion with SIMD #2118
- Updating Table log retention configuration with write_deltalake silently changes nothing #2108
- ALTER table, ALTER Column, Add/Modify Comment, Add/remove/rename partitions, Set Tags, Set location, Set TBLProperties #2088
- Docs: Update docs for check constraints #2063
- Don't ensure_table_uri when creating a table with_log_store #2036
- Exposing custom_metadata in merge operation #2031
- Support custom table properties via TableAlterer and write/merge #2022
- Remove parquet2 crate support #2004
- Merge operation that only touches necessary partitions #1991
- store userMetadata on write operations #1990
- Create Dask integration page #1956
- Merge: Filtering on partitions #1918
- Rethink the load_version and load_with_datetime interfaces #1910
- docs: Delta Lake + Arrow Integration #1908
- docs: Delta Lake + Polars integration #1906
- Rethink decision to expose the public interface in namespaces #1900
- Add documentation on how to build and run documentation locally #1893
- Add API to create an empty Delta Lake table #1892
- Implementing CHECK constraints #1881
- Check Invariants are respecting table features for write paths #1880
- Organize docs with single lefthand sidebar #1873
- Make sure invariants are handled properly throughout the codebase #1870
- Unable to use deltalake Schema in write_deltalake #1862
- Add a Rust-backed engine for write_deltalake #1861
- Run doctest in CI for Python API examples #1783
- [RFC] Use arrow for checkpoint reading and state handling #1776
- Expose Python exceptions in public module #1771
- Expose cleanup_metadata or create_checkpoint_from_table_uri_and_cleanup to the Python API #1768
- Expose convert_to_delta to Python API #1767
- Add high-level checking for append-only tables #1759
Fixed bugs:
- Row order no longer preserved after merge operation #2165
- Error when reading delta table with IDENTITY column #2152
- Merge on IS NULL condition doesn't work for empty table #2148
- JsonWriter converts structured parsing error into plain string #2143
- Pandas import error when merging tables #2112
- test_repair_on_update broken in main #2109
- WriteBuilder::with_input_execution_plan does not apply the schema to the log's metadata fields #2105
- MERGE logical plan vs execution plan schema mismatch #2104
- Partitions not pushed down #2090
- Cant create empty table with write_deltalake #2086
- Unexpected high costs on Google Cloud Storage #2085
- Unable to read s3 table: Unknown scheme: s3 #2065
- write_deltalake not respecting writer_properties #2064
- Unable to read/write tables with the "gs" schema in the table_uri in 0.15.1 #2060
- LockClient requiered error for S3 backend in 0.15.1 python #2057
- Error while writing Pandas DataFrame to Delta Lake (S3) #2051
- Error with dynamo locking provider on 0.15 #2034
- Conda version 0.15.0 is missing files #2021
- Rust panicking through Python library when a delete predicate uses a nullable field #2019
- No snapshot or version 0 found, perhaps /Users/watsy0007/resources/test_table/ is an empty dir? #2016
- Generic DeltaTable error: type_coercion in Struct column in merge operation #1998
- Constraint expr not formatted during commit action #1971
- .load_with_datetime() is incorrectly rounding to nearest second #1967
- vacuuming log files #1965
- Unable to merge uppercase column names #1960
- Schema error: Invalid data type for Delta Lake: Null #1946
- Python v0.14 wheel files not up to date #1945
- python Release 0.14 is missing Windows wheels #1942
- CI integration test fails randomly: test_restore_by_datetime #1925
- Merge data freezes indefenetely #1920
- Load DeltaTable from non-existing folder causing empty folder creation #1916
- Reoptimizes merge bins with only 1 file, even though they have no effect. #1901
- The Python Docs link in README.MD points to old docs #1898
- optimize.compact() fails with bad schema after updating to pyarrow 8.0 #1889
- Python build is broken on main #1856
- Checkpoint error with Azure Synapse #1847
- merge very slow compared to delete + append on larger dataset #1846
- get_add_actions fails with deltalake 0.13 #1835
- Handle PyArrow CVE-2023-47248 #1834
- Delta-rs writer hangs with to many file handles open (Azure) #1832
- Encountering NotATable("No snapshot or version 0 found, perhaps xxx is an empty dir?") #1831
- write_deltalake is not creating checkpoints #1815
- Problem writing tables in directory named with char ~ #1806
- DeltaTable Merge throws in merging if there are uppercase in Schema. #1797
- rust merge error - datafusion panics #1790
- expose use_dictionary=False when writing Delta Table and running optimize #1772
Closed issues:
- Is this print necessary? Can we remove this. #2110
- Azure concurrent writes #2069
- Fix docs deployment #1867
- Add a header in old docs and direct users to new docs #1865
rust-v0.16.5 (2023-11-15)
Implemented enhancements:
- When will upgrade object_store to 0.8? #1858
- No Official Help #1849
- Auto assign GitHub issues with a "take" message #1791
Fixed bugs:
- cargo clippy fails on core in main #1843
rust-v0.16.4 (2023-11-12)
Implemented enhancements:
- Unable to add deltalake git dependency to cargo.toml #1821
rust-v0.16.3 (2023-11-08)
Implemented enhancements:
Fixed bugs:
- Code Owners no longer valid #1794
- MERGE works incorrectly with partitioned table if the data column order is not same as table column order #1787
- errors when using pyarrow dataset as a source #1779
- Write to Microsoft OneLake failed. #1764
rust-v0.16.2 (2023-10-21)
rust-v0.16.1 (2023-10-21)
rust-v0.16.0 (2023-09-27)
Implemented enhancements:
- Expose Optimize option min_commit_interval in Python #1640
- Expose create_checkpoint_for #1513
- integration tests regularly fail for HDFS #1428
- Add Support for Microsoft OneLake #1418
- add support for atomic rename in R2 #1356
Fixed bugs:
- Writing with large arrow types (e.g. large_utf8), writes wrong partition encoding #1669
- [python] Different stringification of partition values in reader and writer #1653
- Unable to interface with data written from Spark Databricks #1651
- get_last_checkpoint does some unnecessary listing #1643
- PartitionWriter's buffer_len doesn't include incomplete row groups #1637
- Slack community invite link has expired #1636
- delta-rs does not appear to support tables with liquid clustering #1626
- Internal Parquet panic when using a Map type. #1619
- partition_by with "$" on local filesystem #1591
- ProtocolChanged error when perfoming append write #1585
- Unable to cargo update using git tag or rev on Rust 1.70 #1580
- NoMetadata error when reading detlatable #1562
- Cannot read delta table: Delta protocol violation #1557
- Update the CODEOWNERS to capture the current reviewers and contributors #1553
- [Python] Incorrect file URIs when partition values contain escape character #1533
- add documentation how to Query Delta natively from datafusion #1485
- Python: write_deltalake to ADLS Gen2 issue #1456
- Partition values that have been url encoded cannot be read when using deltalake #1446
- Error optimizing large table #1419
- Cannot read partitions with special characters (including space) with pyarrow >= 11 #1393
- ImportError: deltalake/_internal.abi3.so: cannot allocate memory in static TLS block #1380
- Invalid JSON in log record missing field schemaString for DLT tables #1302
- Special characters in partition path not handled locally #1299
Merged pull requests:
- chore: bump rust crate version #1675 (rtyler)
- fix: change partitioning schema from large to normal string for pyarrow<12 #1671 (ion-elgreco)
- feat: allow to set large dtypes for the schema check in write_deltalake #1668 (ion-elgreco)
- docs: small consistency update in guide and readme #1666 (ion-elgreco)
- fix: exception string in writer.py #1665 (sebdiem)
- chore: increment python library version #1664 (wjones127)
- docs: fix some typos #1662 (ion-elgreco)
- fix: more consistent handling of partition values and file paths #1661 (roeap)
- docs: add docstring to protocol method #1660 (MrPowers)
- docs: make docs.rs build docs with all features enabled #1658 (simonvandel)
- fix: enable offset listing for s3 #1654 (eeroel)
- chore: fix the incorrect Slack link in our readme #1649 (rtyler)
- fix: compensate for invalid log files created by Delta Live Tables #1647 (rtyler)
- chore: proposed updated CODEOWNERS to allow better review notifications #1646 (rtyler)
- feat: expose min_commit_interval to optimize.compact and optimize.z_order #1645 (ion-elgreco)
- fix: avoid excess listing of log files #1644 (eeroel)
- fix: introduce support for Microsoft OneLake #1642 (rtyler)
- fix: explicitly require chrono 0.4.31 or greater #1641 (rtyler)
- fix: include in-progress row group when calculating in-memory buffer length #1638 (BnMcG)
- chore: relax chrono pin to 0.4 #1635 (houqp)
- chore: update datafusion to 31, arrow to 46 and object_store to 0.7 #1634 (houqp)
- docs: update Readme #1633 (dennyglee)
- chore: pin the chrono dependency #1631 (rtyler)
- feat: pass known file sizes to filesystem in Python #1630 (eeroel)
- feat: implement parsing for the new domainMetadata actions in the commit log #1629 (rtyler)
- ci: fix python release #1624 (wjones127)
- ci: extend azure timeout #1622 (wjones127)
- feat: allow multiple incremental commits in optimize #1621 (kvap)
- fix: change map nullable value to false #1620 (cmackenzie1)
- Introduce the changelog for the last couple releases #1617 (rtyler)
- chore: bump python version to 0.10.2 #1616 (wjones127)
- perf: avoid holding GIL in DeltaFileSystemHandler #1615 (wjones127)
- fix: don't re-encode paths #1613 (wjones127)
- feat: use url parsing from object store #1592 (roeap)
- feat: buffered reading of transaction logs #1549 (eeroel)
- feat: merge operation #1522 (Blajda)
- feat: expose create_checkpoint_for to the public #1514 (haruband)
- docs: update Readme #1440 (roeap)
- refactor: re-organize top level modules #1434 (roeap)
- feat: integrate unity catalog with datafusion #1338 (roeap)
rust-v0.15.0 (2023-09-06)
Implemented enhancements:
- Configurable number of retries for transaction commit loop #1595
Fixed bugs:
- Unable to read table using VM Managed Identity on Azure #1462
- Unable to query by partition column #1445
Merged pull requests:
- fix: update python test #1608 (wjones127)
- chore: update datafusion to 30, arrow to 45 #1606 (scsmithr)
- fix: just make pyarrow 12 the max #1603 (wjones127)
- fix: support partial statistics in JSON #1599 (CurtHagenlocher)
- feat: allow configurable number of commit attempts #1596 (cmackenzie1)
- fix: querying on date partitions (fixes #1445) #1594 (watfordkcf)
- refactor: clean up arrow schema defs #1590 (polynomialherder)
- feat: add metadata for operations::write::WriteBuilder #1584 (abhimanyusinghgaur)
- feat: add metadata for deletion vectors #1583 (aersam)
- fix: remove alpha classifier #1578 (marcelotrevisani)
- refactor: use pa.table.cast in delta_arrow_schema_from_pandas #1573 (ion-elgreco)
rust-v0.14.0 (2023-08-01)
Implemented enhancements:
Fixed bugs:
- Excessive integration test sizes causing builds to fail #1550
- Slack invite link is not working #1530
Merged pull requests:
- fix: correct whitespace in delta protocol reader minimum version error message #1576 (polynomialherder)
- chore: move deps to [workspace.dependencies] #1575 (cmackenzie1)
- chore: update datafusion to 28 and arrow to 43 #1571 (cmackenzie1)
- ci: don't run benchmark in debug mode #1566 (wjones127)
- ci: install newer rust for macos python release #1565 (wjones127)
- feat: make find_files public #1560 (yjshen)
- feat!: bulk delete for vacuum #1556 (Blajda)
- chore: address some integration test bloat of disk usage for development #1552 (rtyler)
- docs: port docs to mkdocs #1548 (MrPowers)
- chore: disable incremental builds in CI for saving space #1545 (rtyler)
- fix: revert premature merge of an attempted fix for binary column statistics #1544 (rtyler)
- chore: increment python version #1542 (wjones127)
- feat: add restore command in python binding #1529 (loleek)
rust-v0.13.1 (2023-07-18)
Fixed bugs:
- Revert premature merge of an attempted fix for binary column statistics #1544
rust-v0.13.0 (2023-07-15)
Implemented enhancements:
- Add nested struct supports #1518
- Support FixedLenByteArray UUID statistics as a logical scalar #1483
- Exposing create_add in the API #1458
- Update features table on README #1404
- docs(python): show data catalog options in Python API reference #1347
- Add optimization to only list log files starting at a certain name #1252
- Support configuring parquet compression #1235
- parallel processing in Optimize command #1171
Fixed bugs:
- get_add_actions() MAX is not showing complete value #1534
- Can't get stats's minValues in add actions #1515
- Pyarrow is_null filter not working as expected after loading using deltalake #1496
- Can't write to table that uses generated columns #1495
- Json error: Binary is not supported by JSON when writing checkpoint files #1493
- _last_checkpoint size field is incorrect #1468
- Error when Z Ordering a larger dataset #1459
- Timestamp parsing issue #1455
- File options are ignored when writing delta #1444
- Slack Invite Link No Longer Valid #1425
- cleanup_metadata doesn't remove .checkpoint.parquet files #1420
- The test of reading the data from the blob storage located in Azurite container failed #1415
- The test of reading the data from the bucket located in Minio container failed #1408
- Datafusion: unreachable code reached when parsing statistics with missing columns #1374
- vacuum is very slow on Cloudflare R2 #1366
Closed issues:
- Expose Compression Options or WriterProperties for writing to Delta #1469
- Support out-of-core Z-order using DataFusion #1460
- Expose Z-order in Python #1442
Merged pull requests:
- chore: fix the latest clippy warnings with the newer rustc's #1536 (rtyler)
- docs: show data catalog options in Python API reference #1532 (omkar-foss)
- fix: handle nulls in file-level stats #1520 (wjones127)
- feat: add nested struct supports #1519 (haruband)
- fix: tiny typo in AggregatedStats #1516 (haruband)
- refactor: unify with_predicate for delete ops #1512 (Blajda)
- chore: remove deprecated table functions #1511 (roeap)
- chore: update datafusion and related crates #1504 (roeap)
- feat: implement restore operation #1502 (loleek)
- chore: fix mypy failure #1500 (wjones127)
- fix: avoid writing statistics for binary columns to fix JSON error #1498 (ChewingGlass)
- feat(rust): expose WriterProperties method on RecordBatchWriter and DeltaWriter #1497 (theelderbeever)
- feat: add UUID statistics handling #1484 (atefsaw)
- feat: expose create_add to the public #1482 (atefsaw)
- fix: add sizeInBytes to _last_checkpoint and change size to # of actions #1477 (cmackenzie1)
- fix(python): match Field signatures #1463 (guilhem-dvr)
- feat: handle larger z-order jobs with streaming output and spilling #1461 (wjones127)
- chore: increment python version #1449 (wjones127)
- chore: upgrade to arrow 40 and datafusion 26 #1448 (rtyler)
- feat(python): expose z-order in Python #1443 (wjones127)
- ci: prune CI/CD pipelines #1433 (roeap)
- refactor: remove LoadCheckpointError and ApplyLogError #1432 (roeap)
- feat: update writers to include compression method in file name #1431 (Blajda)
- refactor: move checkpoint and errors into separate module #1430 (roeap)
- feat: add z-order optimize #1429 (wjones127)
- fix: casting when data to be written does not match table schema #1427 (Blajda)
- docs: update README.adoc to fix expired Slack link #1426 (dennyglee)
- chore: remove no-longer-necessary build.rs for Rust bindings #1424 (rtyler)
- chore: remove the delta-checkpoint lambda which I have moved to a new repo #1423 (rtyler)
- refactor: rewrite redundant_async_block #1422 (cmackenzie1)
- fix: update cleanup regex to include checkpoint.parquet files #1421 (cmackenzie1)
- docs: update features table in README #1414 (ognis1205)
- fix: get_prune_stats returns homogenous ArrayRef #1413 (cmackenzie1)
- feat: explicit python exceptions #1409 (roeap)
- feat: implement update operation #1390 (Blajda)
- feat: allow concurrent file compaction #1383 (wjones127)
rust-v0.12.0 (2023-05-30)
Implemented enhancements:
- Release delta-rs 0.11.0 (next release after 0.10.0) #1362
- Support writing statistics for date columns in Rust #1209
Fixed bugs:
- Rust writer in operations makes a lot of data copies #1394
- Unable to read timestamp fields from column statistics #1372
- Unable to write custom metadata via configuration since version 0.9.0 #1353
- .get_add_actions() returns wrong column statistics when dataSkippingNumIndexedCols property of the table was changed #1223
- Ensure decimal statistics are written correctly in Rust #1208
Merged pull requests:
- feat: add list_with_offset to DeltaObjectStore #1410 (ognis1205)
- chore: type-check friendlier exports #1407 (roeap)
- chore: remove ancillary crates from the git tree #1406 (rtyler)
- chore: bump the version for the next release #1405 (rtyler)
- feat: more efficient parquet writer and more statistics #1397 (wjones127)
- perf: improve record batch partitioning #1396 (roeap)
- chore: bump datafusion to 25 #1389 (roeap)
- refactor!: remove DeltaDataType aliases #1388 (cmackenzie1)
- feat: vacuum with concurrent requests #1382 (wjones127)
- feat: add datafusion storage catalog #1381 (roeap)
- docs: updated schema.rs to use the right signature for decimal data type in documentation #1377 (rahulj51)
- fix: delete operation when partition and non partition columns are used #1375 (Blajda)
- fix: add conversion for string for Field::TimestampMicros (#1372) #1373 (cmackenzie1)
- fix: allow user defined config keys #1365 (roeap)
- ci: disable full debug symbol generation #1364 (roeap)
- fix: include stats for all columns (#1223) #1342 (mrjoe7)
rust-v0.11.0 (2023-05-12)
Implemented enhancements:
- Implement simple delete case #832
Merged pull requests:
- chore: update Rust package version #1346 (rtyler)
- fix: replace deprecated arrow::json::reader::Decoder #1226 (rtyler)
- feat: delete operation #1176 (Blajda)
- feat: add wasbs to known schemes #1345 (iajoiner)
- test: add some missing unit and doc tests for DeltaTablePartition #1341 (rtyler)
- feat: write command improvements #1267 (roeap)
- feat: added support for Databricks Unity Catalog #1331 (nohajc)
- fix: double url encode of partition key #1324 (mrjoe7)
rust-v0.10.0 (2023-05-02)
Implemented enhancements:
- Support Optimize on non-append-only tables #1125
Fixed bugs:
- DataFusion integration incorrectly handles partition columns defined "first" in schema #1168
- Datafusion: SQL projection returns wrong column for partitioned data #1292
- Unable to query partitioned tables #1291
Merged pull requests:
- chore: add deprecation notices for commit logic on DeltaTable #1323 (roeap)
- fix: handle local paths on windows #1322 (roeap)
- fix: scan partitioned tables with datafusion #1303 (roeap)
- fix: allow special characters in storage prefix #1311 (wjones127)
- feat: upgrade to Arrow 37 and Datafusion 23 #1314 (rtyler)
- Hide the parquet/json feature behind our own JSON feature #1307 (rtyler)
- Enable the json feature for the parquet crate #1300 (rtyler)
rust-v0.9.0 (2023-04-14)
Implemented enhancements:
- hdfs support #300
- Add decimal primitive type to document #1280
- Improve error message when filtering on non-existant partition columns #1218
Fixed bugs:
- Datafusion table provider: issues with timestamp types #441
- Not matching column names when creating a RecordBatch from MapArray #1257
- All stores created using DeltaObjectStore::new have an identical object_store_url #1188
Merged pull requests:
- Upgrade datafusion to 22 which brings arrow upgrades with it #1249 (rtyler)
- chore: df / arrow changes after update #1288 (roeap)
- feat: read schema from parquet files in datafusion scans #1266 (roeap)
- HDFS storage support via datafusion-objectstore-hdfs #1279 (iajoiner)
- Add description of decimal primitive to SchemaDataType #1281 (ognis1205)
- Fix names and nullability when creating RecordBatch from MapArray #1258 (balbok0)
- Simplify the Store Backend Configuration code #1265 (mrjoe7)
- feat: optimistic transaction protocol #632 (roeap)
- Write support for additional Arrow datatypes #1044 (chitralverma)
- Unique delta object store url #1212 (gruuya)
- improve err msg on use of non-partitioned column #1221 (marijncv)
rust-v0.8.0 (2023-03-10)
Implemented enhancements:
- feat(rust): support additional types for partition values #1170
Fixed bugs:
- File pruning does not occur on partition columns #1175
- Bug: Error loading Delta table locally #1157
- Deltalake 0.7.0 with s3 feature compliation error due to rusoto_dynamodb version conflict #1191
- Writing from a Delta table scan using WriteBuilder fails due to missing object store #1186
Merged pull requests:
- build(deps): bump datafusion #1217 (roeap)
- Implement pruning on partition columns #1179 (Blajda)
- feat: enable passing storage options to Delta table builder via Datafusion's CREATE EXTERNAL TABLE #1043 (gruuya)
- feat: typed commit info #1207 (roeap)
- add boolean, date, timestamp & binary partition types #1180 (marijncv)
- feat: extend configuration handling #1206 (marijncv)
- fix: load command for local tables #1205 (roeap)
- Enable passing Datafusion session state to WriteBuilder #1187 (gruuya)
- chore: increment dynamodb_lock version #1202 (wjones127)
- fix: update out-of-date doc about datafusion #1183 (xudong963)
- feat: move and update Optimize operation #1154 (roeap)
- add test for extract_partition_values #1159 (marijncv)
- fix typo #1166 (spebern)
- chore: remove star dependencies #1139 (wjones127)
rust-v0.7.0 (2023-02-11)
Implemented enhancements:
- Support FSCK REPAIR TABLE Operation #1092
- Expose the Delta Log in a DataFrame that's easy for analysis #1031
- Provide case-insensitive storage options in backend #999
- Support local file path in CreateBuilder::with_location() #998
- Save operational params in the same way with delta io #1054 (ismoshkov)
Fixed bugs:
- DeltaTable DataFusion TableProvider does not support filter pushdown #1064
- DeltaTable DataFusion scan does not prune files properly #1063
- deltalake.DeltaTable constructor hangs in Jupyter #1093
- Transaction log JSON formatting issue when writing data via Python bindings #1017
- crates.io entry is missing link to rustdoc documentation #1076
- URL Registered with ObjectStore registry is different from url in DeltaScan #1018
- Not able to connect to Azure Storage with client id/secret #977
- Deltalake 0.5 crate s3 feature dynamodb version mismatch #973
- Overwrite mode does not work with Azure #939
- Use Chrono without default features #914
- cargo test does not run due to tls conflict #985
- Azure SAS authorization fails with <AuthenticationErrorDetail>Signature fields not well formed. #910
Merged pull requests:
- Make rustls default across all packages #1097 (wjones127)
- Implement filesystem check #1103 (Blajda)
- refactor: move vacuum command to operations module #1045 (roeap)
- feat: enable passing storage options to Delta table builder via DataFusion's CREATE EXTERNAL TABLE #1043 (gruuya)
- feat: improve storage location handling #1065 (roeap)
- Fix to support UTC timezone #1022 (andrei-ionescu)
- feat: harmonize and simplify storage configuration #1052 (roeap)
- feat: expose function to get table of add actions #1033 (wjones127)
- fix: change unexpected field logging level to debug #1112 (houqp)
- fix: datafusion predicate pushdown and dependencies #1071 (roeap)
- fix: azure sas key url encoding #1036 (roeap)
- Add provisional workaround to support CDC #1039 #1042 (Fazzani)
- improve debuggability of json ser/de errors #1119 (houqp)
- Add an example of writing to a delta table with a RecordBatch #1085 (rtyler)
- minor: optimize partition lookup for vacuum loop #1120 (houqp)
- Add missing documentation metadata to Cargo.toml #1077 (johnbatty)
- add test for null_count_schema_for_fields #1135 (marijncv)
- add test for min_max_schema_for_fields #1122 (marijncv)
- add test for get_boolean_from_metadata #1121 (marijncv)
- add test for left_larger_than_right #1110 (marijncv)
- Add test for: to_scalar_value #1086 (marijncv)
- Fix typo in delta-inspect #1072 (byteink)
- chore: update datafusion #1114 (roeap)
rust-v0.6.0 (2022-12-16)
Implemented enhancements:
- Support Apache Arrow DataFusion 15 #1020
- Python package: Loosen version requirements for maturin #1004
- Remove Cargo.lock from library crates and add Cargo.lock to binary ones #1000
- More frequent Rust releases #969
- Thoughts on adding read_delta to pandas #869
- Add the support of the AWS_PROFILE environment variable for S3 #986 (fvaleye)
Fixed bugs:
- Azure SAS signatures ending in "=" don't work #1003
- Fail to compile deltalake crate, need to update dynamodb_lock in crates.io #1002
- error reading delta table to pandas: runtime dropped the dispatch task #975
- MacOS arm64 wheels are generated incorrectly #972
- Overwrite creates new file #960
- The written delta file has corrupted structure #956
- Write mode doesn't work with Azure storage #955
- Python: We don't error on reader protocol v2 #886
- Cannot open a deltatable in S3 using AWS_PROFILE based credentials from a local machine #855
Merged pull requests:
- Support DataFusion 15 #1021 (andrei-ionescu)
- fix truncating signature on SAS #1007 (damiondoesthings)
- Loosen version requirement for maturin #1005 (gyscos)
- Update .gitignore and add/remove Cargo.lock when appropriate #1001 (iajoiner)
- fix: get azure client secret from config #981 (roeap)
- feat: check invariants in write command #980 (roeap)
- Add a new release github action for Python binding: macos with universal2 wheel #976 (fvaleye)
- Bump version of the Python binding to 0.6.4 #970 (fvaleye)
- Handle pandas timestamps #958 (hayesgb)
- test(python): add azure integration tests #912 (wjones127)
* This Changelog was automatically generated by github_changelog_generator