52.2.0 (2024-07-24)
Implemented enhancements:
- Faster min/max for string/binary view arrays #6088 [arrow]
- Support casting to/from Utf8View #6076 [arrow]
- Min/max support for String/BinaryViewArray #6052 [arrow]
- Improve performance of constructing ByteViews for small strings #6034 [parquet] [arrow]
- Fast UTF-8 validation when reading StringViewArray from Parquet #5995 [parquet]
- Optimize StringView row decoding #5945 [arrow]
- Implementing deduplicate/intern functionality for StringView #5910 [arrow]
- Add FlightSqlServiceClient::new_from_inner #6003 [arrow] [arrow-flight] (lewiszlw)
- Complete StringViewArray and BinaryViewArray parquet decoder: #6004 [parquet] (XiangpengHao)
- Add begin/end_transaction methods in FlightSqlServiceClient #6026 [arrow] [arrow-flight] (lewiszlw)
- Read Parquet statistics as arrow Arrays #6046 [parquet] (efredine)
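
Several of the enhancements above concern how views for short strings are built. As a rough illustration only, and not the crate's actual code, the sketch below packs a value of at most 12 bytes into a 16-byte view in the way the Arrow columnar format describes: a 4-byte little-endian length followed by the inlined, zero-padded bytes. The names `MAX_INLINE_LEN`, `make_inline_view`, and `view_len` are hypothetical and exist only for this example.

```rust
/// Maximum number of bytes that fit inline in a 16-byte view.
const MAX_INLINE_LEN: usize = 12;

/// Hypothetical helper: pack a short value into a 128-bit view.
/// Returns `None` when the value is too long to be inlined.
fn make_inline_view(data: &[u8]) -> Option<u128> {
    if data.len() > MAX_INLINE_LEN {
        return None;
    }
    let mut bytes = [0u8; 16];
    // Bytes 0..4 hold the length as a little-endian u32.
    bytes[..4].copy_from_slice(&(data.len() as u32).to_le_bytes());
    // Bytes 4..16 hold the value itself; unused bytes stay zero.
    bytes[4..4 + data.len()].copy_from_slice(data);
    Some(u128::from_le_bytes(bytes))
}

/// Hypothetical helper: read the length back from the low 32 bits.
fn view_len(view: u128) -> u32 {
    view as u32
}

fn main() {
    let view = make_inline_view(b"hello").expect("5 bytes fit inline");
    assert_eq!(view_len(view), 5);
    assert_eq!(&view.to_le_bytes()[4..9], b"hello");
    assert!(make_inline_view(b"longer than twelve bytes").is_none());
}
```

Longer values are not inlined; in the format they are stored in a separate data buffer and the view records a prefix, a buffer index, and an offset instead. That case is left out of this sketch.
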
Fixed bugs:
- Panic in ParquetMetadata::memory_size if no min/max set #6091 [parquet]
- BinaryViewArray doesn't roundtrip a single Some(&[]) through parquet #6086 [parquet]
- Parquet ColumnIndex for null columns is written even when statistics are disabled #6010 [parquet]
Documentation updates:
- Fix typo in GenericByteViewArray documentation #6054 [arrow] (progval)
- Minor: Improve parquet PageIndex documentation #6042 [parquet] (alamb)
Closed issues:
- Potential performance improvements for reading Parquet to StringViewArray/BinaryViewArray #5904 [parquet] [arrow]
Merged pull requests:
- Faster GenericByteView construction #6102 [parquet] [arrow] (XiangpengHao)
- Add benchmark to track byte-view construction performance #6101 [parquet] (XiangpengHao)
- Optimize bool_or using max_boolean #6100 [arrow] (simonvandel)
- Optimize max_boolean by operating on u64 chunks #6098 [arrow] (simonvandel)
- fix panic in ParquetMetadata::memory_size: check has_min_max_set before invoking min()/max() #6092 [parquet] (Fischer0522)
- Implement specialized min/max for GenericBinaryView (StringView and BinaryView) #6089 [arrow] (XiangpengHao)
- Add PartialEq to ParquetMetaData and FileMetadata #6082 [parquet] (adriangb)
- Enable casting from Utf8View #6077 [arrow] (a10y)
- StringView support in arrow-csv #6062 [arrow] (2010YOUY01)
- Implement min max support for string/binary view types #6053 [arrow] (XiangpengHao)
- Minor: clarify the relationship between file::metadata and format in docs #6049 [parquet] (alamb)
- Minor API adjustments for StringViewBuilder #6047 [arrow] (XiangpengHao)
- Add parquet StatisticsConverter for arrow reader #6046 [parquet] (efredine)
- Directly decode String/BinaryView types from arrow-row format #6044 [arrow] (XiangpengHao)
- Clean up unused code for view types in offset buffer #6040 [parquet] (XiangpengHao)
- Avoid using Buffer api that accidentally copies data #6039 [parquet] [arrow] [arrow-flight] (XiangpengHao)
- MINOR: Fix hashbrown version in arrow-array, remove from arrow-row #6035 [arrow] (mbrobbel)
- Improve performance reading ByteViewArray from parquet by removing an implicit copy #6031 [parquet] (XiangpengHao)
- Add begin/end_transaction methods in FlightSqlServiceClient #6026 [arrow] [arrow-flight] (lewiszlw)
- Unsafe improvements: core parquet crate. #6024 [parquet] (veluca93)
- Additional tests for parquet reader utf8 validation #6023 [parquet] (alamb)
- Update zstd-sys requirement from >=2.0.0, <2.0.12 to >=2.0.0, <2.0.13 #6019 [parquet] (dependabot[bot])
- fix doc ci in latest rust nightly version #6012 [arrow] [arrow-flight] (Rachelint)
- Do not write ColumnIndex for null columns when not writing page statistics #6011 [parquet] (etseidl)
- Fast utf8 validation when loading string view from parquet #6009 [parquet] (XiangpengHao)
- Deduplicate strings/binarys when building view types #6005 [arrow] (XiangpengHao)
- Complete StringViewArray and BinaryViewArray parquet decoder: implement delta byte array and delta length byte array encoding #6004 [parquet] (XiangpengHao)
- Add FlightSqlServiceClient::new_from_inner #6003 [arrow] [arrow-flight] (lewiszlw)
- Rename Schema::all_fields to flattened_fields #6001 [parquet] [arrow] [arrow-flight] (lewiszlw)
- Refine documentation and examples for DataType #5997 [arrow] (alamb)
- implement DataType::try_form(&str) #5994 [arrow] (samuelcolvin)
- Implement dictionary support for reading ByteView from parquet #5973 [parquet] (XiangpengHao)
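
One of the pull requests above optimizes max_boolean by operating on u64 chunks. The general idea can be sketched as follows, assuming a bit-packed boolean buffer with no nulls: the maximum is true exactly when any 64-bit word is non-zero, so 64 values are tested per comparison. The `PackedBools` type and the `max_bool` function are hypothetical stand-ins for illustration, not the crate's API, and the real kernel also has to account for null values.

```rust
/// Hypothetical stand-in for a bit-packed boolean buffer without nulls.
struct PackedBools {
    words: Vec<u64>,
    len: usize,
}

impl PackedBools {
    /// Pack a slice of bools, 64 values per u64 word.
    /// Bits past `len` in the last word are left as zero.
    fn from_bools(values: &[bool]) -> Self {
        let mut words = vec![0u64; (values.len() + 63) / 64];
        for (i, &v) in values.iter().enumerate() {
            if v {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: values.len() }
    }
}

/// Maximum of the packed values, or `None` for an empty buffer.
/// Each comparison covers a whole 64-bit word rather than one value.
fn max_bool(values: &PackedBools) -> Option<bool> {
    if values.len == 0 {
        return None;
    }
    Some(values.words.iter().any(|&w| w != 0))
}

fn main() {
    let mut raw = vec![false; 100];
    assert_eq!(max_bool(&PackedBools::from_bools(&raw)), Some(false));
    raw[70] = true;
    assert_eq!(max_bool(&PackedBools::from_bools(&raw)), Some(true));
}
```

Because the padding bits in the final word are always zero in this sketch, no masking of the last chunk is needed; a buffer that may carry garbage in its trailing bits would need one.
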
* This Changelog was automatically generated by github_changelog_generator