Releases: delta-io/delta-rs

python-v0.13.0: Repair operation and PyArrow 13+ support

06 Nov 00:13
a5e2e3b

New features

Bug fixes

Other changes

New Contributors

Full Changelog: python-v0.12.0...python-v0.13.0

python-v0.12.0: Delete, Update, and Merge

19 Oct 01:00
3bcc428

What's Changed

New features

Bug fixes

  • fix: exception string in writer.py by @sebdiem in #1665
  • fix: change partitioning schema from large to normal string for pyarrow<12 by @ion-elgreco in #1671
  • fix: use epoch instead of ce for date stats by @universalmind303 in #1672
  • fix: unify environment variables referenced by Databricks docs by @rtyler in #1673
  • fix!: ensure predicates are parsable by @Blajda in #1690
  • fix: merge operation with string predicates by @Blajda in #1705
  • fix: reorder encode_partition_value() checks and add tests by @ldacey in #1733

Other contributions

Breaking changes

The DeltaTable.history() method now returns transactions in reverse chronological order. This matches the Spark implementation.
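
Code that relied on the old oldest-first ordering may need a small adjustment. The following is a minimal sketch, assuming each history entry is a dict carrying a numeric "timestamp" key; the sample entries and the helper name oldest_first are made up for illustration and are not part of the library.

```python
from typing import Any, Dict, List


def oldest_first(history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return commit entries sorted from oldest to newest.

    Sorting by timestamp (instead of blindly reversing) gives the same
    result whichever order the installed version returns.
    Assumption: every entry has a numeric "timestamp" key.
    """
    return sorted(history, key=lambda entry: entry["timestamp"])


# Hypothetical entries, listed newest first as history() now returns them.
sample_history = [
    {"timestamp": 1697677200000, "operation": "MERGE"},
    {"timestamp": 1697590800000, "operation": "WRITE"},
    {"timestamp": 1697504400000, "operation": "CREATE TABLE"},
]

ordered = oldest_first(sample_history)
print([entry["operation"] for entry in ordered])
```

With a real table, the list returned by DeltaTable.history() would take the place of sample_history.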

DeltaTable.files_by_partitions() has been removed. It has been deprecated since 0.7.0. Use DeltaTable.file_uris() instead.

DeltaTable.pyarrow_schema() has been removed. It has been deprecated since 0.7.0. Use DeltaTable.schema().to_pyarrow() instead.
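
One way to migrate call sites is to route them through two thin wrappers that use the replacement calls named above. This is only a sketch: the wrapper names arrow_schema_of and data_file_uris are hypothetical, and the stub classes stand in for a real DeltaTable so the example runs without a table on disk.

```python
from typing import Any, List


def arrow_schema_of(table: Any) -> Any:
    """Replacement for the removed table.pyarrow_schema()."""
    return table.schema().to_pyarrow()


def data_file_uris(table: Any) -> List[str]:
    """Replacement for the removed table.files_by_partitions()."""
    return list(table.file_uris())


# --- Hypothetical stand-ins for illustration only (not library classes) ---
class _StubSchema:
    def to_pyarrow(self) -> str:
        return "pyarrow-schema-placeholder"


class _StubTable:
    def schema(self) -> _StubSchema:
        return _StubSchema()

    def file_uris(self) -> List[str]:
        return ["file:///tmp/table/part-0.parquet"]


stub = _StubTable()
print(arrow_schema_of(stub))
print(data_file_uris(stub))
```

In real code, a DeltaTable instance would be passed in place of the stub.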

New Contributors

Full Changelog: python-v0.11.0...python-v0.12.0

rust-v0.16.0

27 Sep 19:14
55a309d

Full Changelog

Implemented enhancements:

  • Expose Optimize option min_commit_interval in Python #1640
  • Expose create_checkpoint_for #1513
  • integration tests regularly fail for HDFS #1428
  • Add Support for Microsoft OneLake #1418
  • add support for atomic rename in R2 #1356

Fixed bugs:

  • Writing with large arrow types (e.g. large_utf8), writes wrong partition encoding #1669
  • [python] Different stringification of partition values in reader and writer #1653
  • Unable to interface with data written from Spark Databricks #1651
  • get_last_checkpoint does some unnecessary listing #1643
  • PartitionWriter's buffer_len doesn't include incomplete row groups #1637
  • Slack community invite link has expired #1636
  • delta-rs does not appear to support tables with liquid clustering #1626
  • Internal Parquet panic when using a Map type. #1619
  • partition_by with "$" on local filesystem #1591
  • ProtocolChanged error when performing append write #1585
  • Unable to cargo update using git tag or rev on Rust 1.70 #1580
  • NoMetadata error when reading deltatable #1562
  • Cannot read delta table: Delta protocol violation #1557
  • Update the CODEOWNERS to capture the current reviewers and contributors #1553
  • [Python] Incorrect file URIs when partition values contain escape character #1533
  • add documentation how to Query Delta natively from datafusion #1485
  • Python: write_deltalake to ADLS Gen2 issue #1456
  • Partition values that have been url encoded cannot be read when using deltalake #1446
  • Error optimizing large table #1419
  • Cannot read partitions with special characters (including space) with pyarrow >= 11 #1393
  • ImportError: deltalake/_internal.abi3.so: cannot allocate memory in static TLS block #1380
  • Invalid JSON in log record missing field schemaString for DLT tables #1302
  • Special characters in partition path not handled locally #1299

Merged pull requests:

  • chore: bump rust crate version #1675 (rtyler)
  • fix: change partitioning schema from large to normal string for pyarrow<12 #1671 (ion-elgreco)
  • feat: allow to set large dtypes for the schema check in write_deltalake #1668 (ion-elgreco)
  • docs: small consistency update in guide and readme #1666 (ion-elgreco)
  • fix: exception string in writer.py #1665 (sebdiem)
  • chore: increment python library version #1664 (wjones127)
  • docs: fix some typos #1662 (ion-elgreco)
  • fix: more consistent handling of partition values and file paths #1661 (roeap)
  • docs: add docstring to protocol method #1660 (MrPowers)
  • docs: make docs.rs build docs with all features enabled #1658 (simonvandel)
  • fix: enable offset listing for s3 #1654 (eeroel)
  • chore: fix the incorrect Slack link in our readme #1649 (rtyler)
  • fix: compensate for invalid log files created by Delta Live Tables #1647 (rtyler)
  • chore: proposed updated CODEOWNERS to allow better review notifications #1646 (rtyler)
  • feat: expose min_commit_interval to optimize.compact and optimize.z_order #1645 (ion-elgreco)
  • fix: avoid excess listing of log files #1644 (eeroel)
  • fix: introduce support for Microsoft OneLake #1642 (rtyler)
  • fix: explicitly require chrono 0.4.31 or greater #1641 (rtyler)
  • fix: include in-progress row group when calculating in-memory buffer length #1638 (BnMcG)
  • chore: relax chrono pin to 0.4 #1635 (houqp)
  • chore: update datafusion to 31, arrow to 46 and object_store to 0.7 #1634 (houqp)
  • docs: update Readme #1633 (dennyglee)
  • chore: pin the chrono dependency #1631 (rtyler)
  • feat: pass known file sizes to filesystem in Python #1630 (eeroel)
  • feat: implement parsing for the new domainMetadata actions in the commit log #1629 (rtyler)
  • ci: fix python release #1624 (wjones127)
  • ci: extend azure timeout #1622 (wjones127)
  • feat: allow multiple incremental commits in optimize #1621 (kvap)
  • fix: change map nullable value to false #1620 (cmackenzie1)
  • Introduce the changelog for the last couple releases #1617 (rtyler)
  • chore: bump python version to 0.10.2 #1616 (wjones127)
  • perf: avoid holding GIL in DeltaFileSystemHandler #1615 (wjones127)
  • fix: don't re-encode paths #1613 (wjones127)
  • feat: use url parsing from object store #1592 (roeap)
  • feat: buffered reading of transaction logs #1549 (eeroel)
  • feat: merge operation #1522 (Blajda)
  • feat: expose create_checkpoint_for to the public #1514 (haruband)
  • docs: update Readme #1440 (roeap)
  • refactor: re-organize top level modules #1434 (roeap)
  • feat: integrate unity catalog with datafusion #1338 (roeap)

python-v0.11.0

26 Sep 16:10
b447934

What's Changed

New Features

  • feat: expose min_commit_interval to optimize.compact and optimize.z_order by @ion-elgreco in #1645
  • feat: allow multiple incremental commits in optimize by @kvap in #1621
  • feat: introduce support for Microsoft OneLake by @rtyler in #1642

Performance Improvements

  • feat: pass known file sizes to filesystem in Python by @eeroel in #1630
  • fix: avoid excess listing of log files by @eeroel in #1644
  • fix: enable offset listing for s3 by @eeroel in #1654

Other

  • chore: update datafusion to 31, arrow to 46 and object_store to 0.7 by @houqp in #1634
  • feat: implement parsing for the new domainMetadata actions in the commit log by @rtyler in #1629
  • feat: integrate unity catalog with datafusion by @roeap in #1338
  • fix: compensate for invalid log files created by Delta Live Tables by @rtyler in #1647
  • docs: add docstring to protocol method by @MrPowers in #1660
  • docs: fix some typos by @ion-elgreco in #1662
  • feat: use url parsing from object store by @roeap in #1592
  • chore: proposed updated CODEOWNERS to allow better review notifications by @rtyler in #1646
  • fix: more consistent handling of partition values and file paths by @roeap in #1661

New Contributors

Full Changelog: python-v0.10.2...python-v0.11.0

python-v0.10.2

11 Sep 05:24
9d1857d

What's Changed

New features

  • feat: add restore command in python binding by @loleek in #1529
  • feat: buffered reading of transaction logs by @eeroel in #1549

Bug fixes

Other

New Contributors

Full Changelog: python-v0.10.1...python-v0.10.2

rust-v0.14.0

05 Aug 20:23

What's Changed

New Contributors

Full Changelog: rust-v0.13.0...rust-v0.14.0

python-v0.10.1

27 Jul 16:20
012ca7f

What's Changed

New features

  • feat: handle larger z-order jobs with streaming output and spilling by @wjones127 in #1461
  • feat: implement restore operation by @loleek in #1502
  • feat!: bulk delete for vacuum by @Blajda in #1556

Fixes

Other

New Contributors

Full Changelog: python-v0.10.0...python-v0.10.1

rust-v0.13.0

15 Jul 06:59

Full Changelog

Implemented enhancements:

  • Add nested struct supports #1518
  • Support FixedLenByteArray UUID statistics as a logical scalar #1483
  • Exposing create_add in the API #1458
  • Update features table on README #1404
  • docs(python): show data catalog options in Python API reference #1347
  • Add optimization to only list log files starting at a certain name #1252
  • Support configuring parquet compression #1235
  • parallel processing in Optimize command #1171

Fixed bugs:

  • get_add_actions() MAX is not showing complete value #1534
  • Can't get stats's minValues in add actions #1515
  • Pyarrow is_null filter not working as expected after loading using deltalake #1496
  • Can't write to table that uses generated columns #1495
  • Json error: Binary is not supported by JSON when writing checkpoint files #1493
  • _last_checkpoint size field is incorrect #1468
  • Error when Z Ordering a larger dataset #1459
  • Timestamp parsing issue #1455
  • File options are ignored when writing delta #1444
  • Slack Invite Link No Longer Valid #1425
  • cleanup_metadata doesn't remove .checkpoint.parquet files #1420
  • The test of reading the data from the blob storage located in Azurite container failed #1415
  • The test of reading the data from the bucket located in Minio container failed #1408
  • Datafusion: unreachable code reached when parsing statistics with missing columns #1374
  • vacuum is very slow on Cloudflare R2 #1366

Closed issues:

  • Expose Compression Options or WriterProperties for writing to Delta #1469
  • Support out-of-core Z-order using DataFusion #1460
  • Expose Z-order in Python #1442

Merged pull requests:

python-v0.10.0: Z-order, faster optimize and vacuum

09 Jun 18:54
155ca4c

What's Changed

New Contributors

Full Changelog: python-v0.9.0...python-v0.10.0

rust-v0.12.0

31 May 00:42
df98587
Boy howdy there's some great looking performance improvements in this…