Releases: delta-io/delta-rs
rust-v0.5.0
What's Changed
- Add max and min values to Statistics by @viirya in #327
- Use WebIdentityProvider for DynamoDb client in k8s by @rusty-jules in #328
- bump rust version in preparation for the next release by @houqp in #329
- fix automated rust release CD job by @houqp in #326
- add pandas keyword to python package metadata by @houqp in #325
- expose update_incremental API to python binding by @houqp in #332
- update python related docs by @houqp in #331
- Upgrade arrow, parquet and datafusion by @Dandandan in #335
- added warning message if the detected glibc version is < 2.28 by @Smurphy000 in #334
- Convert scalar value to correct type based on arrow data type. by @viirya in #336
- Fix consecutive checkpoints by @mosyp in #333
- Fix new clippy warnings coming up in CI by @xianwill in #341
- perform incremental update after transaction commit by @houqp in #343
- Add timestamp handling to checkpoint writer by @xianwill in #340
- Add clear table state in load_version when no checkpoint found. by @zijie0 in #347
- Low level create table by @Smurphy000 in #342
- pub DeltaTable method to retrieve table configurations by @Smurphy000 in #356
- Modify partition_values field type in Add/Remove actions. by @zijie0 in #354
- fix sleep workaround in checkpoint test by @houqp in #360
- Modify get_files_by_partitions to use partition values by @zijie0 in #362
- Fix get_latest_version returning version < 0. by @zijie0 in #364
- fix typo in python release CI config by @houqp in #365
- cache cargo builds by @houqp in #359
- Add '.tmp' suffix to temporary file of prepared commit by @mosyp in #366
- support partition value string deserialization for float/double/date by @houqp in #363
- Implement atomic put_obj. by @zijie0 in #367
- Make Format.options to be required field by @mosyp in #370
- Allow filesystem backend put_obj to overwrite existing by @mosyp in #376
- Wrap DeltaTransactionError with DeltaTableError. by @zijie0 in #374
- Refactoring of black, isort, mypy tools usages into pyproject.toml by @fvaleye in #378
- Implement consistent behavior in Windows with regard to swap parameter. by @zijie0 in #379
- Merge Cargo.toml into pyproject.toml by @fvaleye in #381
- Update datafusion and ballista links in README by @ei-grad in #382
- Add sts assume role credentials provider for S3 by @mosyp in #383
- Reuse table/storage instances in checkpoints by @mosyp in #384
- additional error handling to atomic_rename by @Marnixvdb in #386
- Upgrade to DataFusion 5.0 by @Dandandan in #389
- added initial commit info on create method for a DeltaTable by @Smurphy000 in #387
- Google cloud by @blogle in #355
- Remove version param from create_checkpoint_from_table by @mosyp in #399
- Implement delete_objs in fs and s3 storage backends. by @zijie0 in #395
- Add examples for reading delta table with Rust API. by @zijie0 in #400
- Update pyproject definition in pyproject.toml by @fvaleye in #405
- Use tokio::fs::rename in put_obj. by @zijie0 in #403
- Fix duplicates on update call by @mosyp in #398
- Add a Makefile build task in the Python binding by @fvaleye in #410
- Add implementation for load_with_datetime in Python package. by @zijie0 in #411
- Add filesystem argument for reading DeltaTable in Python binding by @fvaleye in #414
- Fix reading nullable action fields from parquet by @mosyp in #417
- Ensures that all table schemas are of StructType by @blogle in #415
- Gcs writer bugs by @blogle in #412
- Add S3StorageOptions to allow configuring S3 backend explicitly by @xianwill in #418
- Read a DeltaTable using a Data Catalog by @fvaleye in #419
- Change checkpoint creation logs from info to debug by @mosyp in #423
- Add LICENSE file in the Python binding and refer it in the pyproject by @fvaleye in #422
- Audit action field optionality by @fvaleye in #380
- Introduce DeltaConfig and tombstones retention policy by @mosyp in #420
- [README] Replace the inactive rust-dataframe with polars by @sa- in #426
- Bump arrow to 6.0.0-SNAPSHOT and bring map support to schema by @mosyp in #375
- Support partition value string deserialization for timestamp/binary by @zijie0 in #371
- Document the valid primitive types by @Ekleog in #430
- Add is_non_acquirable field to the dynamodb lock by @mosyp in #429
- Clean up DeltaTransactionError by @mosyp in #432
- Optimize remove action apply with early iteration exit #424 by @akshay26031996 in #431
- Decode path in Add and Remove actions. by @zijie0 in #434
- reenable datafusion integration with temporary fork by @houqp in #436
- Add history command in delta-rs by @fvaleye in #428
- Release Python binding version 0.5.3 by @fvaleye in #439
- Add delete_lock and fix release_lock by @mosyp in #440
- Fixing test to compare sorted vec by @akshay26031996 in #443
- Batch-apply remove actions in tombstone handling by @dispanser in #444
- Update datafusion links by @bbigras in #446
- Run all tests under s3 feature flag by @mosyp in #447
- Add maturin develop command with extras in Python binding by @fvaleye in #448
- README: mark Checkpoint creation as done for Rust by @bbigras in #449
- Fix broken tombstones metadata when extended_file_metadata is different between tombstones in state by @mosyp in #450
- No tombstone loading by @dispanser in #445
- return lazy iterator in get tombstone methods by @houqp in #452
- Generate new session name on assume role credentials provider refresh by @mosyp in #451
- Add pool_idle_timeout options for s3 and sts clients by @mosyp in #458
- Do action reconciliation by @viirya in #456
- Use action default stats by @viirya in #459
- Add new module for DeltaTableState by @viirya in #464
- Support hash lookup by path string for Remove action by @viirya in #462
- Fix coverage of the Python tests by @fvaleye in #467
- materialize tables in python via native storage backend by @roeap in #463
- Make file storage backend's atomic rename async by @viirya in #471
- Add GCS feature to the Python Cargo.toml file by @kelvins in #476
- Throw an error when filter key is not in partitioned columns. by @zijie0 in #475
- Fix documentation for the DeltaStorageHandler by @fvaleye in #483
- Update README.adoc by @dennyglee in #482
- Update az...
python-v0.6.4
What's Changed
- fix(python): enforce reader protocol version by @wjones127 in #932
- feat: rewrite operations by @roeap in #852
- chore: bump datafusion and arrow by @roeap in #940
- Allow for reading columns as dictionaries using to_pyarrow_dataset by @Kuhlwein in #941
- Add TableProviderFactory and test for SQL to register tables dynamically at runtime by @avantgardnerio in #892
- feat: improve write performance of DeltaFileSystemHandler by @roeap in #943
- Update CONTRIBUTING.md by @wjones127 in #944
- build(deps): bump serde_json from 1.0.87 to 1.0.88 by @dependabot in #947
- build(deps): bump reqwest from 0.11.12 to 0.11.13 by @dependabot in #946
- docs: add simple operations example by @roeap in #953
- bump rust core version to 0.5.0 by @houqp in #961
- pin glibc_version version in dep to unblock crates.io release by @houqp in #967
- remove all wildcard version pin in Cargo.toml by @houqp in #968
- Bump version of the Python binding to 0.6.4 by @fvaleye in #970
New Contributors
- @Kuhlwein made their first contribution in #941
- @avantgardnerio made their first contribution in #892
Full Changelog: python-v0.6.3...python-v0.6.4
python-v0.6.3
What's Changed
- Added without_files flag to DeltaTable constructor by @MykhailoHevak in #866
- chore: update object-store dependency by @wjones127 in #884
- build(deps): bump serde from 1.0.145 to 1.0.147 by @dependabot in #895
- build(deps): bump anyhow from 1.0.65 to 1.0.66 by @dependabot in #897
- build(deps): bump serde_json from 1.0.86 to 1.0.87 by @dependabot in #896
- build(deps): bump futures from 0.3.24 to 0.3.25 by @dependabot in #899
- build(deps): bump async-trait from 0.1.57 to 0.1.58 by @dependabot in #898
- build(deps): bump once_cell from 1.15.0 to 1.16.0 by @dependabot in #907
- build(deps): bump libc from 0.2.135 to 0.2.137 by @dependabot in #905
- build(deps): bump lambda_runtime from 0.6.1 to 0.7.0 by @dependabot in #903
- Fix parsing struct stats after schema evolution by @Tom-Newton in #901
- fix: pass storage options down when getting delta table by @wjones127 in #893
- Fix cargo clippy issues 0.1.65 in Rust by @fvaleye in #923
- Bump version of the Python binding to 0.6.3 by @fvaleye in #924
New Contributors
- @MykhailoHevak made their first contribution in #866
Full Changelog: python-v0.6.2...python-v0.6.3
python-v0.6.2
What's Changed
- build(deps): bump url from 2.2.2 to 2.3.0 by @dependabot in #800
- build(deps): bump lambda_runtime from 0.6.0 to 0.6.1 by @dependabot in #801
- build(deps): bump parquet2 from 0.16.2 to 0.16.3 by @dependabot in #802
- build(deps): bump criterion from 0.3.6 to 0.4.0 by @dependabot in #803
- build(deps): bump percent-encoding from 2.1.0 to 2.2.0 by @dependabot in #804
- build(deps): bump url from 2.3.0 to 2.3.1 by @dependabot in #805
- feat: integrate object_store for read/write with pyarrow by @roeap in #799
- cleanup errors and make DeltaWriterError internal by @roeap in #784
- Refactoring of the Github Action python release by @fvaleye in #810
- Re-allow writing to non-existent local paths by @wjones127 in #811
- [Python][Docs] Add description to landing page by @wjones127 in #817
- [Python] Fix handling of null stats in write_deltalake by @wjones127 in #815
- feat(python): Set smaller defaults on row group and file size by @wjones127 in #818
- chore: bump arrow and friends by @roeap in #814
- feat: improve storage configuration by @roeap in #822
- Datafusion-imports by @roeap in #823
- build(deps): bump tokio-stream from 0.1.9 to 0.1.10 by @dependabot in #824
- build(deps): bump anyhow from 1.0.64 to 1.0.65 by @dependabot in #825
- build(deps): bump once_cell from 1.13.1 to 1.14.0 by @dependabot in #826
- build(deps): bump thiserror from 1.0.34 to 1.0.35 by @dependabot in #827
- build(deps): bump env_logger from 0.7.1 to 0.9.1 by @dependabot in #829
- build(deps): bump tokio from 1.21.0 to 1.21.1 by @dependabot in #828
- Fix warnings by @wjones127 in #839
- comment cleanup by @houqp in #840
- add codeowners by @houqp in #841
- build(deps): bump thiserror from 1.0.35 to 1.0.36 by @dependabot in #843
- build(deps): bump libc from 0.2.132 to 0.2.133 by @dependabot in #844
- build(deps): bump serde from 1.0.144 to 1.0.145 by @dependabot in #846
- build(deps): bump reqwest from 0.11.11 to 0.11.12 by @dependabot in #847
- build(deps): bump once_cell from 1.14.0 to 1.15.0 by @dependabot in #849
- use rustls for delta checkpoint lambda by @houqp in #842
- Add invariant enforcement support by @wjones127 in #834
- Add contributing page with roadmap and good first issues link by @wjones127 in #853
- build(deps): bump tokio from 1.21.1 to 1.21.2 by @dependabot in #856
- build(deps): bump libc from 0.2.133 to 0.2.134 by @dependabot in #857
- build(deps): bump openssl from 0.10.41 to 0.10.42 by @dependabot in #858
- build(deps): bump thiserror from 1.0.36 to 1.0.37 by @dependabot in #859
- build(deps): bump uuid from 1.1.2 to 1.2.1 by @dependabot in #872
- build(deps): bump serde_json from 1.0.85 to 1.0.86 by @dependabot in #874
- build(deps): bump pyo3 from 0.17.1 to 0.17.2 by @dependabot in #873
- Bump Python binding version to 0.6.2 by @fvaleye in #876
- Fix the Python Release Github Action by @fvaleye in #877
Full Changelog: python-v0.6.1...python-v0.6.2
python-v0.6.1
What's Changed
- feat: add gcs integration tests by @roeap in #779
- build(deps): bump lz4-sys from 1.9.2 to 1.9.4 in /aws/delta-checkpoint by @dependabot in #782
- build(deps): bump lz4-sys from 1.9.2 to 1.9.4 in /delta-inspect by @dependabot in #783
- build(deps): bump tokio from 1.20.1 to 1.21.0 by @dependabot in #790
- build(deps): bump thiserror from 1.0.32 to 1.0.34 by @dependabot in #792
- build(deps): bump pretty_assertions from 1.2.1 to 1.3.0 by @dependabot in #791
- build(deps): bump anyhow from 1.0.62 to 1.0.64 by @dependabot in #793
- build(deps): bump env_logger from 0.7.1 to 0.9.0 by @dependabot in #794
- hotfix: python object store paths by @roeap in #787
- prepare python release 0.6.1 by @roeap in #795
Full Changelog: python-v0.6.0...python-v0.6.1
python-v0.6.0
What's Changed
- JSON Writer writing partitions values fix by @Blajda in #658
- Support date32 and decimal stats in write_deltalake by @wjones127 in #659
- bugfix: Make sure vacuum works on relative paths by @wjones127 in #664
- Fix linting / build on main by @mrk-its in #670
- feat: add support for HTTPS_PROXY env var by @xfrancois in #665
- Utilise struct stats when available by @Tom-Newton in #656
- fix: inconsistent path in azure list by @roeap in #673
- Factor vacuum and implement a builder by @Blajda in #672
- Bump openssl-src from 111.21.0+1.1.1p to 111.22.0+1.1.1q by @dependabot in #674
- Bump openssl-src from 111.20.0+1.1.1o to 111.22.0+1.1.1q in /aws/delta-checkpoint by @dependabot in #675
- fix: get docs building and add to CI checks by @wjones127 in #679
- fix: omit common prefixes in azure list_objs by @roeap in #683
- fix: traverse directories in local list_objs by @roeap in #681
- Implement vacuum tests and general test setup utils by @Blajda in #682
- Setup Github dependabot for Rust by @fvaleye in #687
- Bump regex from 1.5.6 to 1.6.0 by @dependabot in #694
- Bump serde_json from 1.0.81 to 1.0.82 by @dependabot in #692
- Bump serial_test from 0.7.0 to 0.8.0 by @dependabot in #699
- Bump hyper from 0.14.19 to 0.14.20 by @dependabot in #702
- feat: sharable reference to storage backend by @roeap in #697
- Bump openssl from 0.10.40 to 0.10.41 by @dependabot in #700
- Get file size from Pyarrow directly (>= 9.0.0) by @Bernolt in #704
- Bump bytes from 1.1.0 to 1.2.0 by @dependabot in #707
- Bump serde from 1.0.137 to 1.0.140 by @dependabot in #708
- Bump crossbeam from 0.8.1 to 0.8.2 by @dependabot in #709
- Bump tokio from 1.19.2 to 1.20.0 by @dependabot in #710
- Fix usage documentation in Python binding by @fvaleye in #716
- Bump bytes from 1.2.0 to 1.2.1 by @dependabot in #719
- Bump tokio from 1.20.0 to 1.20.1 by @dependabot in #720
- Bump lambda_runtime from 0.3.0 to 0.6.0 by @dependabot in #711
- feat: integrate with object_store / datafusion APIs by @roeap in #703
- Bump async-trait from 0.1.56 to 0.1.57 by @dependabot in #730
- Bump serde from 1.0.140 to 1.0.142 by @dependabot in #726
- Bump anyhow from 1.0.58 to 1.0.60 by @dependabot in #727
- Bump libc from 0.2.126 to 0.2.127 by @dependabot in #728
- Bump serde_json from 1.0.82 to 1.0.83 by @dependabot in #731
- Bump thiserror from 1.0.31 to 1.0.32 by @dependabot in #732
- Prune scanned files on column stats by @roeap in #724
- Fix parsing null counts for struct type columns in the struct stats by @Tom-Newton in #714
- Python: fix: fix minimal test and bump minimum pyarrow version by @wjones127 in #733
- Implement Python Schema in Rust by @wjones127 in #684
- fix: Address clippy lint warnings. by @tsh56 in #742
- Bump serde from 1.0.142 to 1.0.143 by @dependabot in #737
- Bump libc from 0.2.127 to 0.2.132 by @dependabot in #743
- build(deps): bump anyhow from 1.0.60 to 1.0.62 by @dependabot in #744
- build(deps): bump serial_test from 0.8.0 to 0.9.0 by @dependabot in #745
- build(deps): bump chrono from 0.4.20 to 0.4.22 by @dependabot in #748
- build(deps): bump futures from 0.3.21 to 0.3.23 by @dependabot in #747
- Cast min and max too when parsing stats by @wjones127 in #753
- build(deps): bump serde_json from 1.0.83 to 1.0.85 by @dependabot in #759
- build(deps): bump serde from 1.0.143 to 1.0.144 by @dependabot in #760
- turn table state version into a private field by @houqp in #772
- build(deps): bump pyo3 from 0.16.5 to 0.16.6 by @dependabot in #773
- Remove Python version 3.6 support And run multiple python versions by @fvaleye in #770
- parquet2 implementation backed by parquet2 feature gate by @houqp in #465
- Adopt ObjectStore by @roeap in #761
- chore: cleanup by @roeap in #774
- Bump version of the Python binding to 0.6.0 by @fvaleye in #762
New Contributors
- @mrk-its made their first contribution in #670
- @xfrancois made their first contribution in #665
- @Bernolt made their first contribution in #704
- @tsh56 made their first contribution in #742
Full Changelog: python-v0.5.8...python-v0.6.0
python-v0.5.8
What's Changed
- Expose read and write options in public API by @george-zubrienko in #581
- [proof] make sure lock at least expires once by @houqp in #591
- Python API - delta.appendOnly enforcement by @WarSame in #590
- Avoid building pandas and numpy from source by @wjones127 in #595
- Introduce require_files for tracking the add files in table state by @mosyp in #594
- Make sure pandas is optional by @wjones127 in #597
- High level Delta Operations with Datafusion by @roeap in #584
- Re-enable datafusion tests and improve supported types. by @roeap in #601
- default to root for empty path in azure store by @roeap in #603
- publish dynamodb_lock to crates.io by @houqp in #605
- Configure Azure storage using a map (#555) by @Blajda in #598
- Azure options by @roeap in #606
- Update rusoto dependencies to 0.48 by @ahmedriza in #611
- upgrade to datafusion 8 by @houqp in #612
- fix: cap sphinx version to avoid bug in 5.0 by @wjones127 in #615
- Provide Python aarch64 wheels for Linux. by @fvaleye in #613
- Refactoring of the Python release Github action by @fvaleye in #616
- fix: Use relative paths for add paths by @wjones127 in #618
- Bin packing optimization by @Blajda in #607
- feat: impl rename_noreplace with std::fs::hard_link by default by @wjones127 in #621
- feat(python): validate schema in write_deltalake by @wjones127 in #624
- Fix the AWS_REGION environment variable configuration in S3 backend by @fvaleye in #633
- Refactor azure storage with crate updates by @roeap in #644
- Defer creation of storage backend in DeltaTableBuilder by @Blajda in #639
- fix: Add correct size and null partition values to add actions by @wjones127 in #625
- Bump flatbuffers from 0.8.4 to 2.1.2 in /aws/delta-checkpoint by @dependabot in #626
- Bump hyper from 0.14.9 to 0.14.19 in /aws/delta-checkpoint by @dependabot in #628
- Bump regex from 1.5.4 to 1.5.5 in /aws/delta-checkpoint by @dependabot in #629
- Bump regex from 1.5.4 to 1.5.6 in /delta-inspect by @dependabot in #630
- Bump thread_local from 1.1.3 to 1.1.4 in /aws/delta-checkpoint by @dependabot in #646
- fix: Prevent warning spam when reading tables generated by delta 1.2.1 by @Tom-Newton in #651
- refactor: move version field to DeltaTableState by @roeap in #649
- feat: add enforce_retention_duration param to vacuum method by @houqp in #648
- fix: read vacuumed delta log without _last_checkpoint by @roeap in #643
- feat: Upgrade to arrow/parquet 15 and datafusion 9 by @xianwill in #652
- Release of the Python binding version 0.5.8 by @fvaleye in #640
New Contributors
- @george-zubrienko made their first contribution in #581
- @WarSame made their first contribution in #590
- @dependabot made their first contribution in #626
- @Tom-Newton made their first contribution in #651
Full Changelog: python-v0.5.7...python-v0.5.8
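The rename_noreplace entry in this release (#621) names a general technique: creating the destination as a hard link, which fails if the destination already exists, so that two writers cannot both claim the same commit file. A minimal Python sketch of that idea follows. It is an illustration only, not the delta-rs implementation; the helper names `rename_noreplace` and `demo`, the file name, and the `.tmp` suffix handling are assumptions made for the example.

```python
import os
import tempfile


def rename_noreplace(src: str, dst: str) -> None:
    """Move src to dst, failing if dst already exists.

    Hypothetical helper: os.link creates dst as a hard link and raises
    FileExistsError when dst is already present, so only one writer can
    succeed. The source name is removed only after the link exists.
    """
    os.link(src, dst)
    os.unlink(src)


def demo() -> tuple[bool, bytes]:
    """Run two competing commits in a scratch directory and report the result."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "00000000000000000001.json")
        lost = False
        for payload in (b"first", b"second"):
            tmp = target + ".tmp"
            with open(tmp, "wb") as fh:
                fh.write(payload)
            try:
                rename_noreplace(tmp, target)
                lost = False
            except FileExistsError:
                # The destination was already committed; discard our attempt.
                os.unlink(tmp)
                lost = True
        with open(target, "rb") as fh:
            return lost, fh.read()
```

Calling `demo()` attempts the same commit twice; the second attempt is rejected and the first writer's content is left untouched. A plain rename would instead silently overwrite the destination on most POSIX systems, which is why the link-then-unlink form is used in this sketch.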
python-v0.5.7
What's Changed
- Upgrade DataFusion, Arrow, Parquet dependencies by @Dandandan in #562
- fix clippy warnings by @houqp in #567
- Azure improvements by @thovoll in #556
- Update ADLSGen2-HOWTO.md by @dgcaron in #560
- Parse partition values before handing to PyArrow by @wjones127 in #565
- [Python] Test in minimal and latest Python environments by @wjones127 in #572
- [Python] Initial PyArrow writer by @wjones127 in #566
- Upgrade arrow, parquet, datafusion version by @zemelLeong in #583
- Record Batch Writer by @roeap in #573
- Replace table location prefix from s3a to s3 by @novakov-alexey in #585
- Allow metadata for write_deltalake by @PadenZach in #587
- Change private time_utils module to public. by @xianwill in #586
- Release of the Python binding version 0.5.7 by @fvaleye in #589
New Contributors
- @dgcaron made their first contribution in #560
- @zemelLeong made their first contribution in #583
- @novakov-alexey made their first contribution in #585
- @PadenZach made their first contribution in #587
Full Changelog: python-v0.5.6...python-v0.5.7
python-v0.5.6
- Bump version of Python binding to 0.5.6 (#558)
- Move delta-inspect to its own crate (#557)
- Fix VACUUM by using table_uri when filtering files to delete (#551)
- Formally verify S3 atomic rename (#540)
- Implement missing Azure storage backend methods (#499)
- Implement polling for table updates (#550)
- Add target in Python release Github action workflow. (#548)
Credits:
QP Hou, Thomas Vollmer, David Blajda, Florian Valeye
Full Changelog: python-v0.5.5...python-v0.5.6
python-v0.5.5
- Add storage options for backends (#544)
- Remove coupling of DynamoDbLockClient from S3 storage (#535)
- add macOS 11 support in python binding release (#541)
- Refresh Python usage documentation (#539)
- [Python] Create PyArrow dataset fragments from delta log (#525)
- Fix Delta metadata transaction schema (#531)
- Add gcs test and improve credential error (#533)
- Return complete history (#526)
- Move dynamodb lock into its own crate (#508)
- Add datafusion examples to docs (#519)
- Fix S3 list_objs and cleanup_metadata (#518)
- Add support for creating List and Map schema types (#517)
- Update datafusion version to 6 (#516)
- Retry S3 get request on 500 Internal Server Error (#510)
- Fix memory overhead when creating checkpoint (#502)
- Fix nullable partition values (#498)
- Fix cleanup_expired_logs timestamp (#503)
- Add bool config enableExpiredLogCleanup. (#500)
- pin arrow to major version (#501)
Credits:
Florian Valeye, ahmedriza, Will Jones, Liang-Chi Hsieh, Gabriel J. Michael, Matthew Turner, Mykhailo Osypov, Andrei Ionescu, QP Hou
Full Changelog: python-v0.5.4...python-v0.5.5