From d9590fab0c6e8bb456b54ebd69a2304423dbad19 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Fri, 21 Apr 2023 10:37:37 +0200 Subject: [PATCH] MINOR: [Release] Update CHANGELOG.md for 12.0.0 --- CHANGELOG.md | 472 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 472 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 4ecdf628355ea..f6326049553f5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,4 +1,476 @@ +# Apache Arrow 12.0.0 (2023-04-20 07:00:00) + +## Bug Fixes + +* [GH-14779](https://github.com/apache/arrow/issues/14779) - [C++] Compiling failed on Mac M1 +* [GH-14917](https://github.com/apache/arrow/issues/14917) - [C++] Error out when GTest is compiled with a C++ standard lower than 17 (#34765) +* [GH-14923](https://github.com/apache/arrow/issues/14923) - [C++][Parquet] Fix DELTA_BINARY_PACKED problem on reading the last block with malformed bit-width (#15241) +* [GH-15054](https://github.com/apache/arrow/issues/15054) - [C++] Change s3 finalization to happen after arrow threads finished, add pyarrow exit hook (#33858) +* [GH-15098](https://github.com/apache/arrow/issues/15098) - [C++] fix util::EqualityComparable to compile on clang 15 (#33940) +* [GH-15102](https://github.com/apache/arrow/issues/15102) - [C++] Could not decompress arrow stream sent from Java arrow SDK (#15194) +* [GH-15109](https://github.com/apache/arrow/issues/15109) - [Python] Allow creation of non empty struct array with zero field (#33764) +* [GH-15137](https://github.com/apache/arrow/issues/15137) - [C++][CI] Fix ASAN error in streaming JSON reader tests (#33772) +* [GH-15139](https://github.com/apache/arrow/issues/15139) - [C++] Improve bzip2 static library path detection for arrow.pc (#33712) +* [GH-15173](https://github.com/apache/arrow/issues/15173) - [C++][Parquet] Fixing ByteStreamSplit Standard broken (#34140) +* [GH-15212](https://github.com/apache/arrow/issues/15212) - [C++] fix sliced list array writing in ORC (#15213) +* 
[GH-15247](https://github.com/apache/arrow/issues/15247) - [R] Error when trying to save a data.frame with NULL column names (#34798) +* [GH-15256](https://github.com/apache/arrow/issues/15256) - [C++][Dataset] Add support for writing with Partitioning::Default() (#33674) +* [GH-28074](https://github.com/apache/arrow/issues/28074) - [C++][Dataset] Handle NaNs correctly in Parquet predicate push-down (#15125) +* [GH-31880](https://github.com/apache/arrow/issues/31880) - [Python] Table.filter with expression now preserves order with use_threads=True (#34766) +* [GH-31905](https://github.com/apache/arrow/issues/31905) - [DevTools] Add linting to Cython files (#14662) +* [GH-32512](https://github.com/apache/arrow/issues/32512) - [Docs][R] Update conda install command (#34298) +* [GH-32954](https://github.com/apache/arrow/issues/32954) - [Java][FlightRPC] Remove FlightTestUtil#getStartedServer and bind to port 0 directly (#34357) +* [GH-33287](https://github.com/apache/arrow/issues/33287) - [R] Cannot read_parquet on http URL (#34708) +* [GH-33336](https://github.com/apache/arrow/issues/33336) - [C++][Parquet] Avoid UB on unaligned load (#14488) +* [GH-33466](https://github.com/apache/arrow/issues/33466) - [Go][Parquet] Add support for Dictionary arrays to pqarrow (#34342) +* [GH-33501](https://github.com/apache/arrow/issues/33501) - [Packaging][Release] Add a post-release script to add a new version to conan (#34022) +* [GH-33566](https://github.com/apache/arrow/issues/33566) - [C++] Add support for nullary and n-ary aggregate functions (#15083) +* [GH-33600](https://github.com/apache/arrow/issues/33600) - [Go][Parquet] Panic in bitmap writer (#14989) +* [GH-33616](https://github.com/apache/arrow/issues/33616) - [C++] Reorder group_by so that keys/segment keys come before aggregates (#34551) +* [GH-33689](https://github.com/apache/arrow/issues/33689) - [Python][CI] Re-enable fsspec tests on dask nightly tests (#34925) +* 
[GH-33697](https://github.com/apache/arrow/issues/33697) - [CI][Python] Nightly test for PySpark 3.2.0 fail with AttributeError on numpy.bool (#33714) +* [GH-33699](https://github.com/apache/arrow/issues/33699) - [C++] Increase timeout of c++ tests when running under valgrind and shorten long tests (#33886) +* [GH-33701](https://github.com/apache/arrow/issues/33701) - [C++] Add support for LTO (link time optimization) build (#33847) +* [GH-33709](https://github.com/apache/arrow/issues/33709) - [R] Remove suffix argument from semi_join and anti_join (#34030) +* [GH-33717](https://github.com/apache/arrow/issues/33717) - [Go] Flight SQL Server handle StreamChunk errors (#33718) +* [GH-33721](https://github.com/apache/arrow/issues/33721) - [CI][R] Disable sccache on test-r-install-local macOS (#34713) +* [GH-33726](https://github.com/apache/arrow/issues/33726) - [CI][Go] Set host name in Go benchmarks (#33728) +* [GH-33727](https://github.com/apache/arrow/issues/33727) - [Python] array() errors if pandas categorical column has dictionary as string not object (#34289) +* [GH-33754](https://github.com/apache/arrow/issues/33754) - [CI] Install brewfile dependencies for verification task jobs on M1 (#33755) +* [GH-33767](https://github.com/apache/arrow/issues/33767) - [Go] Clear out parameter in ArrowArrayStream.get_next (#33768) +* [GH-33777](https://github.com/apache/arrow/issues/33777) - [R] Nightly builds failing due to dataset test not being skipped on builds without datasets module (#33778) +* [GH-33779](https://github.com/apache/arrow/issues/33779) - [R] Nightly builds (R 3.5 and 3.6) failing due to field refs test (#33780) +* [GH-33782](https://github.com/apache/arrow/issues/33782) - [Release] Vote email number of issues is querying JIRA and producing a wrong number (#33791) +* [GH-33783](https://github.com/apache/arrow/issues/33783) - [C#] Update release verification to use .NET 7.0 (#33799) +* [GH-33786](https://github.com/apache/arrow/issues/33786) - [C++] 
Ignore old system xsimd (#33811) +* [GH-33796](https://github.com/apache/arrow/issues/33796) - [C++] Fix wrong arrow-testing.pc config with system GoogleTest (#33812) +* [GH-33801](https://github.com/apache/arrow/issues/33801) - [Python] Expose C++ ExtensionTypes/ExtensionArrays in pyarrow (#33802) +* [GH-33813](https://github.com/apache/arrow/issues/33813) - [CI][GLib] Use Ruby 3.2 to update bundled MSYS2 (#33815) +* [GH-33816](https://github.com/apache/arrow/issues/33816) - [CI][Conan] Use TARGET_FILE for portability (#33817) +* [GH-33820](https://github.com/apache/arrow/issues/33820) - [CI][Release] Don't libxsimd-dev on Ubuntu 20.04 (#33821) +* [GH-33824](https://github.com/apache/arrow/issues/33824) - [C++] Improve error message on discovery failure (#33848) +* [GH-33830](https://github.com/apache/arrow/issues/33830) - Clarify handling of Null values in REE encoding (#33831) +* [GH-33849](https://github.com/apache/arrow/issues/33849) - [C++] Fix builds with ARROW_BUILD_SHARED=OFF and ARROW_BUILD_EXAMPLES=ON (#34350) +* [GH-33864](https://github.com/apache/arrow/issues/33864) - [Go] Don't directly coerce cgo.Handle to unsafe.Pointer (#33865) +* [GH-33876](https://github.com/apache/arrow/issues/33876) - [C++][Windows] Use different .pc path for each config (#33907) +* [GH-33882](https://github.com/apache/arrow/issues/33882) - [C++] Don't find .pc files with ARROW_BUILD_STATIC=OFF (#34019) +* [GH-33887](https://github.com/apache/arrow/issues/33887) - [Go] cdata package leaks handles, difficult debugging (#33889) +* [GH-33904](https://github.com/apache/arrow/issues/33904) - [R] improve behavior of s3_bucket - work-around (#34009) +* [GH-33911](https://github.com/apache/arrow/issues/33911) - [C++] Add missing std::forward to Result::ValueOrElse (#33912) +* [GH-33914](https://github.com/apache/arrow/issues/33914) - [Release] Force brew install build-from-source to not install from API (#33915) +* [GH-33920](https://github.com/apache/arrow/issues/33920) - [C++][CI] 
Disable Flight SQL in sanitizer job (#34014) +* [GH-33932](https://github.com/apache/arrow/issues/33932) - [Go] Fix build RecordBuilder with non-nullable items map field (#33906) +* [GH-33934](https://github.com/apache/arrow/issues/33934) - [Packaging][Linux] Enable Flight for arm64 (#34717) +* [GH-33953](https://github.com/apache/arrow/issues/33953) - [Java] Pass custom headers on every request (#33967) +* [GH-33954](https://github.com/apache/arrow/issues/33954) - [C++][Parquet] Preserve field-id for nested type (#33955) +* [GH-33963](https://github.com/apache/arrow/issues/33963) - [C++] add missing arrow/engine headers (#33964) +* [GH-33970](https://github.com/apache/arrow/issues/33970) - [C#] Make schema field names case sensitive (#33978) +* [GH-33971](https://github.com/apache/arrow/issues/33971) - [C++] Fix AdaptiveIntBuilder to always populate data buffer (#33994) +* [GH-33973](https://github.com/apache/arrow/issues/33973) - [Python][Docs] Update documentation for Parquet filter keyword (#33974) +* [GH-34023](https://github.com/apache/arrow/issues/34023) - [Docs] Version warning about viewing old docs doesn't work for versions >= 10 (#34178) +* [GH-34029](https://github.com/apache/arrow/issues/34029) - [Docs] Add Ninja to packages to install (#34040) +* [GH-34035](https://github.com/apache/arrow/issues/34035) - [C++] Internal header file included from public one breaks build of external projects (#34036) +* [GH-34037](https://github.com/apache/arrow/issues/34037) - [Python][Docs] Fix Table.drop docstring (#34038) +* [GH-34044](https://github.com/apache/arrow/issues/34044) - [Go] Fix build with noasm tag (#34045) +* [GH-34047](https://github.com/apache/arrow/issues/34047) - [C++][FlightRPC] Make DoAction warning less prominent (#34182) +* [GH-34076](https://github.com/apache/arrow/issues/34076) - [C#] Allow schema fields with duplicate names (#34125) +* [GH-34080](https://github.com/apache/arrow/issues/34080) - [Python] Add support for round_binary to python 
(#34084) +* [GH-34082](https://github.com/apache/arrow/issues/34082) - [Packaging][deb] Follow Debian bookworm image change (#34091) +* [GH-34086](https://github.com/apache/arrow/issues/34086) - [C++][Parquet] Fix writing num_rows to data page v2 (#34096) +* [GH-34088](https://github.com/apache/arrow/issues/34088) - [Python] : Fix typo in get_writer (#34089) +* [GH-34092](https://github.com/apache/arrow/issues/34092) - [R] open_csv_dataset() error if schema supplied and col_names left as TRUE (the default) (#34217) +* [GH-34098](https://github.com/apache/arrow/issues/34098) - [Python][Docs] Fix dataset docstring (#34099) +* [GH-34101](https://github.com/apache/arrow/issues/34101) - [Go][Parquet] NewSchemaManifest creates wrong schema field (#34127) +* [GH-34104](https://github.com/apache/arrow/issues/34104) - [Python] update deduplicate_objects default in docs to match implementation (#34128) +* [GH-34106](https://github.com/apache/arrow/issues/34106) - [C++][Parquet] Fix updating page stats for WriteArrowDictionary (#34107) +* [GH-34138](https://github.com/apache/arrow/issues/34138) - [C++][Parquet] Fix parsing stats from min_value/max_value (#34112) +* [GH-34143](https://github.com/apache/arrow/issues/34143) - [Python][Docs] Add fill_null back to API reference (#34144) +* [GH-34148](https://github.com/apache/arrow/issues/34148) - [C++] Revert zstd back to 1.5.2 (#34190) +* [GH-34150](https://github.com/apache/arrow/issues/34150) - [C++] Fix error due to improper initialization of conversion option defaults (#34209) +* [GH-34150](https://github.com/apache/arrow/issues/34150) - [C++][Python] Fix improper initialization of ConversionOptions (#34156) +* [GH-34163](https://github.com/apache/arrow/issues/34163) - [C++][CI] Ensure using the same Zstandard with bundled ORC (#34164) +* [GH-34165](https://github.com/apache/arrow/issues/34165) - [Python] Extension array data type should default to the storage type if to_pandas_dtype is not implemented (#34559) +* 
[GH-34175](https://github.com/apache/arrow/issues/34175) - [Docs] Remove Jira from .github/CONTRIBUTING.md (#34205) +* [GH-34188](https://github.com/apache/arrow/issues/34188) - [C++][Benchmark] Add missing BENCHMARK_STATIC_DEFINE for bundled gbenchmark (#34194) +* [GH-34191](https://github.com/apache/arrow/issues/34191) - [C++] Ensure using the same ProtoBuf in bundled ORC (#34192) +* [GH-34206](https://github.com/apache/arrow/issues/34206) - [C++] Don't let jemalloc defines affect unity builds (#34185) +* [GH-34210](https://github.com/apache/arrow/issues/34210) - [C++] Make casting timestamp and duration zero-copy when TimeUnit matches (#34270) +* [GH-34211](https://github.com/apache/arrow/issues/34211) - [R] Make sure Arrow arrays are unmaterialized before attempting to access the underlying ChunkedArray (#34489) +* [GH-34214](https://github.com/apache/arrow/issues/34214) - [C++] Pass OPENSSL_ROOT_HINT to CMAKE_PREFIX_PATH for bundled AWS (#34215) +* [GH-34228](https://github.com/apache/arrow/issues/34228) - [R] Add LIB_DIR when Arrow is found via pkg-config (#34229) +* [GH-34230](https://github.com/apache/arrow/issues/34230) - [Java] Call allocation listener on BaseAllocator#wrapForeignAllocation (#34231) +* [GH-34238](https://github.com/apache/arrow/issues/34238) - [C++][Python] Segfault when calling groupby on table with misaligned chunks +* [GH-34241](https://github.com/apache/arrow/issues/34241) - [C++] Fix ExecSpanIterator to properly initialize empty dictionary arrays (#34246) +* [GH-34244](https://github.com/apache/arrow/issues/34244) - [Go][FlightRPC] SQLite example report Transactions support (#34245) +* [GH-34256](https://github.com/apache/arrow/issues/34256) - [Dev] Update release scripts with main as new default branch (#34413) +* [GH-34269](https://github.com/apache/arrow/issues/34269) - [C++] Fix include file name (#34285) +* [GH-34271](https://github.com/apache/arrow/issues/34271) - [C++] Remove Thrift GitHub archive source url (#34273) +* 
[GH-34283](https://github.com/apache/arrow/issues/34283) - [Python] Add types_mapper support to index for to_pandas (#34445) +* [GH-34284](https://github.com/apache/arrow/issues/34284) - [Java][FlightRPC] Fixed issue with prepared statement getting sent twice (#34358) +* [GH-34296](https://github.com/apache/arrow/issues/34296) - [C++][CI] Force appveyor builds to use conda-forge and ignore defaults channel (#34297) +* [GH-34301](https://github.com/apache/arrow/issues/34301) - [CI][Packaging][RPM][arm64] Use closer.lua to download KEYS (#34302) +* [GH-34303](https://github.com/apache/arrow/issues/34303) - [CI][Packaging][deb] Use system Meson on Debian GNU/Linux bookworm (#34304) +* [GH-34306](https://github.com/apache/arrow/issues/34306) - [CI][Packaging][RPM] Don't install utf8proc-devel on CentOS Stream 8 (#34307) +* [GH-34308](https://github.com/apache/arrow/issues/34308) - [CI][C++] Use str("") to reset std::stringstream for old g++ (#34317) +* [GH-34309](https://github.com/apache/arrow/issues/34309) - [C++] Disable LTO for aws_lc and s2n-tls (#34349) +* [GH-34324](https://github.com/apache/arrow/issues/34324) - [CI][C++] Specify set element type explicitly for old g++ (#34325) +* [GH-34326](https://github.com/apache/arrow/issues/34326) - [C++][Parquet] Page null_count is incorrect if stats is disabled (#34327) +* [GH-34366](https://github.com/apache/arrow/issues/34366) - [R] Don't getFromNamespace() the dplyr:::check_name() helper (#34369) +* [GH-34367](https://github.com/apache/arrow/issues/34367) - [Java] Fix build error from sequential merges (#34368) +* [GH-34381](https://github.com/apache/arrow/issues/34381) - [Dev] Retrieve committers from arrow-site committers.yml instead of relying on author_association (#34557) +* [GH-34385](https://github.com/apache/arrow/issues/34385) - [Go] Read IPC files with compression enabled but uncompressed buffers (#34476) +* [GH-34395](https://github.com/apache/arrow/issues/34395) - [Python] Add support for symbolic linked 
Arrow related include directories (#34674) +* [GH-34404](https://github.com/apache/arrow/issues/34404) - [Python] Failing tests because pandas.Index can now store all numeric dtypes (not only 64bit versions) (#34498) +* [GH-34410](https://github.com/apache/arrow/issues/34410) - [Python] Allow chunk sizes larger than the default to be used (#34435) +* [GH-34432](https://github.com/apache/arrow/issues/34432) - [Java] NoCompressionCodec throws for unsupported codec type (#34580) +* [GH-34446](https://github.com/apache/arrow/issues/34446) - [C++][Parquet] Fix RecordReaderPrimitveTypeTests test (#34447) +* [GH-34464](https://github.com/apache/arrow/issues/34464) - [R] Missing rlang import - inform (#34465) +* [GH-34467](https://github.com/apache/arrow/issues/34467) - [R] Disable DuckDB tests on R versions < 4.0.0 (#34468) +* [GH-34472](https://github.com/apache/arrow/issues/34472) - [Go][FlightRPC] Drain result of DoAction in Flight SQL client (#34473) +* [GH-34474](https://github.com/apache/arrow/issues/34474) - [C++] Detect and raise an error if a join will need too much key data (#35087) +* [GH-34479](https://github.com/apache/arrow/issues/34479) - [Java] java-jars failing due to conflicting slf4j bindings (#34480) +* [GH-34492](https://github.com/apache/arrow/issues/34492) - [Go] Fix missing boolean plain encoder state update (#34493) +* [GH-34496](https://github.com/apache/arrow/issues/34496) - [C++][Parquet] fix parquet unittest in `MakePages` when num_values = 0 (#34497) +* [GH-34513](https://github.com/apache/arrow/issues/34513) - [CI][Python] Remove unused imports from _acero.pyx to fix linting failures (#34514) +* [GH-34519](https://github.com/apache/arrow/issues/34519) - [C++][R] Fix dataset scans that project the same name as a field (#34576) +* [GH-34539](https://github.com/apache/arrow/issues/34539) - [C++] Fix throttled scheduler to avoid stack overflow in dataset writer (#35075) +* [GH-34540](https://github.com/apache/arrow/issues/34540) - [C++] Removed 
set but unused variable (#34541) +* [GH-34546](https://github.com/apache/arrow/issues/34546) - [C++] Support casting from large string to string scalar (#34549) +* [GH-34568](https://github.com/apache/arrow/issues/34568) - [C++][Python] Expose Run-End Encoded arrays in Python Arrow (#34570) +* [GH-34579](https://github.com/apache/arrow/issues/34579) - [Python][Docs] TableGroupBy.aggregate options (#34759) +* [GH-34597](https://github.com/apache/arrow/issues/34597) - [Packaging][RPM] Don't use glog (#34598) +* [GH-34603](https://github.com/apache/arrow/issues/34603) - [Go][Parquet] Problem writing dictionary with empty strings (#34709) +* [GH-34605](https://github.com/apache/arrow/issues/34605) - [C++] Don't use std::move when passing shared_ptr to named table … (#34606) +* [GH-34619](https://github.com/apache/arrow/issues/34619) - [C++] Add extension array handling to ArraySpan conversion (#34684) +* [GH-34621](https://github.com/apache/arrow/issues/34621) - [GLib] Don't use "g_strdup(XXX->ToString().c_str())" (#34624) +* [GH-34622](https://github.com/apache/arrow/issues/34622) - [CI][GLib] Use "meson setup ..." 
(#34623) +* [GH-34629](https://github.com/apache/arrow/issues/34629) - [Go] Fix transpose_ints to work on riscv64-freebsd (#34647) +* [GH-34633](https://github.com/apache/arrow/issues/34633) - [C++][Parquet] Fix StreamReader to read decimals (#34720) +* [GH-34639](https://github.com/apache/arrow/issues/34639) - [C++] Support RecordBatch::FromStructArray even if struct array has nulls/offsets (#34691) +* [GH-34641](https://github.com/apache/arrow/issues/34641) - [CI][Python] Mark test_scan on test_acero.py to require dataset (#34642) +* [GH-34643](https://github.com/apache/arrow/issues/34643) - [CI] Fix files used for testing uncompressible data (#34646) +* [GH-34653](https://github.com/apache/arrow/issues/34653) - [CI][C++] Fix for arrow-dataset-file-json-test segfault on alpine-linux-cpp (#35047) +* [GH-34655](https://github.com/apache/arrow/issues/34655) - [CI][C++] arrow-compute-internals-test fails with \`No function registered with name: equal\` on test-cuda-cpp +* [GH-34661](https://github.com/apache/arrow/issues/34661) - [CI][C#] Update Ubuntu C# jobs to use image with .NET 7.0 (#34662) +* [GH-34667](https://github.com/apache/arrow/issues/34667) - [C++][Parquet] Test DeltaLengthByteArrayDecoder with invalid inputs (#34668) +* [GH-34670](https://github.com/apache/arrow/issues/34670) - [Packaging][C++] Add support for customizing GDB plugin install directory (#34672) +* [GH-34696](https://github.com/apache/arrow/issues/34696) - [C++] Check REE arrays have no null buffer in Validate() (#34697) +* [GH-34731](https://github.com/apache/arrow/issues/34731) - [Python] Release GIL when creating RecordBatchReader (#34732) +* [GH-34743](https://github.com/apache/arrow/issues/34743) - [Python] Relax condition in flaky Flight test (#34747) +* [GH-34753](https://github.com/apache/arrow/issues/34753) - [C++] Nightly builds failing with EnsureAlignment (#34754) +* [GH-34771](https://github.com/apache/arrow/issues/34771) - [C++] Add support for compiling on FreeBSD/amd64 
(#34772) +* [GH-34786](https://github.com/apache/arrow/issues/34786) - [C++] Fix output schema calculated by Substrait consumer for AggregateRel (#34904) +* [GH-34801](https://github.com/apache/arrow/issues/34801) - [C++] Remove needless "Requires.private: libcurl openssl" from arrow.pc (#34810) +* [GH-34807](https://github.com/apache/arrow/issues/34807) - [Go] Handle `io.EOF` when reading parquet footer size and magic bytes (#34808) +* [GH-34823](https://github.com/apache/arrow/issues/34823) - [C++][ORC] Fix ORC CHAR type mapping (#34836) +* [GH-34831](https://github.com/apache/arrow/issues/34831) - [C++] Check REE child buffers are valid before other checks (#34833) +* [GH-34843](https://github.com/apache/arrow/issues/34843) - [R] Fix R build failed caused by Acero refactor (#34844) +* [GH-34862](https://github.com/apache/arrow/issues/34862) - [C++] Fix ArrowDataset dependencies (#34866) +* [GH-34869](https://github.com/apache/arrow/issues/34869) - [C++] Configure alpine linux nightly job to build gtest from source (#34870) +* [GH-34871](https://github.com/apache/arrow/issues/34871) - [C++] Fixed the add_dataset_test function to properly refer to the test file (#34872) +* [GH-34906](https://github.com/apache/arrow/issues/34906) - [C++] Return invalid status instead of segfault if reading from a closed ArrayStreamBatchReader (#35016) +* [GH-34933](https://github.com/apache/arrow/issues/34933) - [Python] Raise minimum cython version (#34935) +* [GH-34937](https://github.com/apache/arrow/issues/34937) - [R] Minimal build failing due to new test which relies on snappy being installed (#34938) +* [GH-34944](https://github.com/apache/arrow/issues/34944) - [Python] Fix crash when converting non-sequence object with getitem in pa.array() (#34958) +* [GH-34953](https://github.com/apache/arrow/issues/34953) - [Ruby] Change null selection behavior in `Table.slice` to `:drop` (#34954) +* [GH-34960](https://github.com/apache/arrow/issues/34960) - [C++] test util Fixing arrow 
Random Generator for lost nullable info (#34961) +* [GH-34973](https://github.com/apache/arrow/issues/34973) - [CI][Packaging] Fix script path in wheel-clean (#34974) +* [GH-34977](https://github.com/apache/arrow/issues/34977) - [C++] Fix "Requires" format in arrow-dataset.pc (#34978) +* [GH-34983](https://github.com/apache/arrow/issues/34983) - [C++] Preserve map values nullability on C Data Interface import (#35013) +* [GH-34988](https://github.com/apache/arrow/issues/34988) - [C#] Fix Windows-specific test issue in CDataSchemaPythonTest (#34989) +* [GH-34995](https://github.com/apache/arrow/issues/34995) - [C++] Improve available GTest check for SYSTEM case (#34997) +* [GH-35008](https://github.com/apache/arrow/issues/35008) - [C++] Add printers for REETestData and PageIndexReaderParam to placate Valgrind (#35011) +* [GH-35014](https://github.com/apache/arrow/issues/35014) - [Python] Make sure unit tests can run without acero (#35017) +* [GH-35018](https://github.com/apache/arrow/issues/35018) - [CI][Java][C++] Use ARROW_ZSTD_USE_SHARED=OFF for LLVM (#35023) +* [GH-35021](https://github.com/apache/arrow/issues/35021) - [Python][CI] Use conda's gdb in test-conda-python (#35024) +* [GH-35029](https://github.com/apache/arrow/issues/35029) - [CI][C#] Install python on ubuntu-csharp image to fix nuget CI build (#35030) +* [GH-35038](https://github.com/apache/arrow/issues/35038) - [R] argument order in arrow_table affects object return type (#35039) +* [GH-35056](https://github.com/apache/arrow/issues/35056) - [Python][CI] Don't install gdb on Windows (#35057) +* [GH-35060](https://github.com/apache/arrow/issues/35060) - [C#][CI] Update dotnet download link regex (#35061) +* [GH-35062](https://github.com/apache/arrow/issues/35062) - [Go][CI] Fix verification failures (#35077) +* [GH-35063](https://github.com/apache/arrow/issues/35063) - [CI] Fix Python requirement in C# tests (#35091) +* [GH-35066](https://github.com/apache/arrow/issues/35066) - [CI][Packaging][Linux] 
Free more disk space (#35128) +* [GH-35069](https://github.com/apache/arrow/issues/35069) - [Archery][Release] Remove retrieving ARROW issue from migration comment on Archery release (#35070) +* [GH-35073](https://github.com/apache/arrow/issues/35073) - [R] Minimal build is failing (acero symbol not defined) (#35074) +* [GH-35086](https://github.com/apache/arrow/issues/35086) - [Java][CI] Upgrade CycloneDX Maven plugin version (#35092) +* [GH-35089](https://github.com/apache/arrow/issues/35089) - [CI][C++][Flight] Test failures in macos release verification nightlies (#35090) +* [GH-35115](https://github.com/apache/arrow/issues/35115) - [C++] Moved util_avx2.cc from acero to compute (#35117) +* [GH-35133](https://github.com/apache/arrow/issues/35133) - [Go] fix for `math.MaxUint32 overflows int` error in 32-bit arch (#35159) +* [GH-35143](https://github.com/apache/arrow/issues/35143) - [R][C++] Fixed shape tensor causes broken build on OSX (#35154) +* [GH-35170](https://github.com/apache/arrow/issues/35170) - [CI][Packaging][Conan] Build grpc-proto (#35203) +* [GH-35181](https://github.com/apache/arrow/issues/35181) - [R] Bump R package version number in versions.json (#35132) +* [GH-35186](https://github.com/apache/arrow/issues/35186) - [CI][C++] Improve GoogleTest detection on Windows + vcpkg (#35200) +* [GH-35187](https://github.com/apache/arrow/issues/35187) - [CI][C++] Use the latest arrow-testing (#35227) +* [GH-35192](https://github.com/apache/arrow/issues/35192) - [Docs] Switch from `logo` to `logo_url` to support sphinx >= 6 (#35194) +* [GH-35205](https://github.com/apache/arrow/issues/35205) - [C++][Gandiva] Don't find system Zstandard when we use bundled one (#35220) +* [GH-35206](https://github.com/apache/arrow/issues/35206) - [C++] Look for Conda OpenSSL in Windows verification (#35225) +* [GH-35235](https://github.com/apache/arrow/issues/35235) - [CI][Python] Pandas upstream_devel and nightlies are failing (#35248) +* 
[GH-35252](https://github.com/apache/arrow/issues/35252) - [C++] Use FindGTestAlt.cmake by ArrowTesting (#35253) + + +## New Features and Improvements + +* [GH-14863](https://github.com/apache/arrow/issues/14863) - [C++] Add appender functions to array builders that can take optionals (#24372) +* [GH-14866](https://github.com/apache/arrow/issues/14866) - [C++] Remove internal GroupBy implementation (#14867) +* [GH-14912](https://github.com/apache/arrow/issues/14912) - [Java] Remove usage of PlatformDependent in arrow-vector, arrow-jdbc and arrow-algorithm (#14913) +* [GH-14939](https://github.com/apache/arrow/issues/14939) - [C++] Support Table lookups in FieldRef and FieldPath (#34537) +* [GH-15059](https://github.com/apache/arrow/issues/15059) - [C++][Acero] populate guarantee columns from expression instead of fragment (#15129) +* [GH-15070](https://github.com/apache/arrow/issues/15070) - [Python][CI] Update pandas test for empty columns dtype change in pandas 2.0.1 (#35031) +* [GH-15070](https://github.com/apache/arrow/issues/15070) - [Python][CI] Compatibility with pandas 2.0 (#34878) +* [GH-15107](https://github.com/apache/arrow/issues/15107) - [C++][Parquet] Parquet Encoder: Support RLE for Boolean (#34526) +* [GH-15164](https://github.com/apache/arrow/issues/15164) - [C++][Parquet] Implement current version of BloomFilter spec (#33776) +* [GH-15171](https://github.com/apache/arrow/issues/15171) - [C++] Pass std::string_view by value (#33684) +* [GH-15193](https://github.com/apache/arrow/issues/15193) - [C++][Parquet] Parquet FuzzReader add some fixed batch size (#33942) +* [GH-15195](https://github.com/apache/arrow/issues/15195) - [C++][FlightRPC][Python] Add ToString/Equals for Flight types (#15196) +* [GH-15203](https://github.com/apache/arrow/issues/15203) - [Java] Implement writing compressed files (#15223) +* [GH-15209](https://github.com/apache/arrow/issues/15209) - [C++][Gandiva] Add abs function (#15208) +* 
[GH-15231](https://github.com/apache/arrow/issues/15231) - [C++][Benchmarking] Add new memory pool metrics and track in benchmarks (#33731) +* [GH-15280](https://github.com/apache/arrow/issues/15280) - [C++][Python][GLib] add libarrow_acero containing everything previously in compute/exec (#34711) +* [GH-15280](https://github.com/apache/arrow/issues/15280) - [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow (#34518) +* [GH-15284](https://github.com/apache/arrow/issues/15284) - [C++] Use DeclarationToExecBatches in Acero plan tests (#15288) +* [GH-15285](https://github.com/apache/arrow/issues/15285) - [GLib] Add GArrowMatchSubstringOptions (#34725) +* [GH-15286](https://github.com/apache/arrow/issues/15286) - [GLib] Add GArrowIndexOptions (#34679) +* [GH-15287](https://github.com/apache/arrow/issues/15287) - [Ruby] Merge column and add suffix in Table#join (#33654) +* [GH-15483](https://github.com/apache/arrow/issues/15483) - [C++] Add a Fixed Shape Tensor canonical ExtensionType (#8510) +* [GH-18481](https://github.com/apache/arrow/issues/18481) - [C++] prefer casting literal over casting field ref (#15180) +* [GH-18487](https://github.com/apache/arrow/issues/18487) - [R] Read Text (CSV/JSON) from character vector (#33968) +* [GH-18818](https://github.com/apache/arrow/issues/18818) - [R] Create a field ref to a field in a struct (#19706) +* [GH-20117](https://github.com/apache/arrow/issues/20117) - [Dev] Ask INFRA to switch default branch to main +* [GH-20272](https://github.com/apache/arrow/issues/20272) - [C++] Bump version of bundled AWS SDK (#33808) +* [GH-20351](https://github.com/apache/arrow/issues/20351) - [C++] Kernel input type matcher for run-end encoded types (#34503) +* [GH-20407](https://github.com/apache/arrow/issues/20407) - [Go] Array Builder for REE arrays (#14114) +* [GH-20408](https://github.com/apache/arrow/issues/20408) - [Go] Implement Encode and Decode functions for REE (#34534) +* 
[GH-20415](https://github.com/apache/arrow/issues/20415) - [Go] Kernel Input Type for RLE (#14146) +* [GH-20484](https://github.com/apache/arrow/issues/20484) - [Swift] Initial Arrow implementation (#14561) +* [GH-21429](https://github.com/apache/arrow/issues/21429) - [GLib] Add GArrowDenseUnionArrayBuilder (#34981) +* [GH-21430](https://github.com/apache/arrow/issues/21430) - [GLib] GArrowSparseUnionArrayBuilder (#34992) +* [GH-25163](https://github.com/apache/arrow/issues/25163) - [C#] Support half-float arrays. (#34618) +* [GH-25986](https://github.com/apache/arrow/issues/25986) - [C++] Enable external material and rotation for encryption keys (#34181) +* [GH-29705](https://github.com/apache/arrow/issues/29705) - [Python] Remove deprecated pyarrow.serialization functionality (#34926) +* [GH-30774](https://github.com/apache/arrow/issues/30774) - [Python] Remove deprecated `use_async` (#34034) +* [GH-31148](https://github.com/apache/arrow/issues/31148) - [Dev] Update URLs in the repo to point to main (#34218) +* [GH-31506](https://github.com/apache/arrow/issues/31506) - [Python] Address docstrings in Streams and File Access (Factory Functions) (#33609) +* [GH-31507](https://github.com/apache/arrow/issues/31507) - [Python] Address docstrings in Streams and File Access (Stream Classes) (#33698) +* [GH-31548](https://github.com/apache/arrow/issues/31548) - [Python] Test that zoneinfo timezones are accepted during type inference (#34394) +* [GH-31715](https://github.com/apache/arrow/issues/31715) - [Python] Improving Classes and Methods Docstrings - Streams and File access +* [GH-31809](https://github.com/apache/arrow/issues/31809) - [Docs] Add instructions on how to collect the produced telemetry data (#33873) +* [GH-31868](https://github.com/apache/arrow/issues/31868) - [C++] Support concatenating extension arrays (#14463) +* [GH-31910](https://github.com/apache/arrow/issues/31910) - [C++] Add support for Substrait cast expression (#34050) +* 
[GH-32050](https://github.com/apache/arrow/issues/32050) - [C++] Implement Rank kernel on chunked arrays (#33846) +* [GH-32104](https://github.com/apache/arrow/issues/32104) - [C++] Add support for Run-End encoded data to Arrow (#33641) +* [GH-32105](https://github.com/apache/arrow/issues/32105) - [C++] Encode and decode Run-End Encoded vectors (#34195) +* [GH-32240](https://github.com/apache/arrow/issues/32240) - [C#] Add new Apache.Arrow.Compression package to implement IPC decompression (#33893) +* [GH-32240](https://github.com/apache/arrow/issues/32240) - [C#] Support decompression when reading an IPC stream from ReadOnlyMemory (#34108) +* [GH-32240](https://github.com/apache/arrow/issues/32240) - [C#] Support decompression of IPC format buffers (#33603) +* [GH-32292](https://github.com/apache/arrow/issues/32292) - [R][Packaging] Use binaries built on CentOS 7 for Ubuntu < 22.04 (#34048) +* [GH-32338](https://github.com/apache/arrow/issues/32338) - [C++] Add IPC support for Run-End Encoded Arrays (#34550) +* [GH-32613](https://github.com/apache/arrow/issues/32613) - [C++] Simplify IPC writer for dense unions (#33822) +* [GH-32619](https://github.com/apache/arrow/issues/32619) - [Python][Docs] Include options for PyArrow build explicitly (#34463) +* [GH-32653](https://github.com/apache/arrow/issues/32653) - [C++] Cleanup error handling in execution engine (#15253) +* [GH-32747](https://github.com/apache/arrow/issues/32747) - [C++] Substrait To Arrow Emit feature testing (#14174) +* [GH-32801](https://github.com/apache/arrow/issues/32801) - [C++][Docs] Delete outdated .md files (#33829) +* [GH-32804](https://github.com/apache/arrow/issues/32804) - [Dev] Remove "master" from default\_branch property of Target class in core.py after migration to "main" as the default Git branch +* [GH-32916](https://github.com/apache/arrow/issues/32916) - [C++][Python] User-defined tabular functions (#14682) +* [GH-32946](https://github.com/apache/arrow/issues/32946) - [Go] 
Implement REE Array and Compare (#14111) +* [GH-32947](https://github.com/apache/arrow/issues/32947) - [Go] Implement Concatenate for REE Array (#14126) +* [GH-32949](https://github.com/apache/arrow/issues/32949) - [Go] REE Array IPC read/write (#14223) +* [GH-33024](https://github.com/apache/arrow/issues/33024) - [C++][Parquet] Add DELTA_LENGTH_BYTE_ARRAY encoder to Parquet writer (#14293) +* [GH-33115](https://github.com/apache/arrow/issues/33115) - [C++] Parquet Implement crc in reading and writing Page for DATA_PAGE (v1) (#14351) +* [GH-33143](https://github.com/apache/arrow/issues/33143) - [C++] Naming and doc/test changes for local_time compute kernel (#34263) +* [GH-33143](https://github.com/apache/arrow/issues/33143) - [C++] Kernel to convert timestamp with timezone to wall time (#34208) +* [GH-33209](https://github.com/apache/arrow/issues/33209) - [C++] Support for reading JSON Datasets (#33732) +* [GH-33215](https://github.com/apache/arrow/issues/33215) - [Dev] Replace hard-coded string "master" with "main" in dev/archery/archery/crossbow/core.py after default branch migration +* [GH-33243](https://github.com/apache/arrow/issues/33243) - [Plasma] Remove (#34718) +* [GH-33317](https://github.com/apache/arrow/issues/33317) - [C++] Utility method to ensure an array object meets an alignment requirement (#14758) +* [GH-33377](https://github.com/apache/arrow/issues/33377) - [Python] Table.drop should support passing a single column (#33810) +* [GH-33439](https://github.com/apache/arrow/issues/33439) - [CI] Substrait Integration Testing (#14596) +* [GH-33580](https://github.com/apache/arrow/issues/33580) - [C++] Support emit info in Substrait extension-multi and AsOfJoin (#14799) +* [GH-33588](https://github.com/apache/arrow/issues/33588) - [Substrait] Add Substrait→Acero mapping for round operation (#33775) +* [GH-33596](https://github.com/apache/arrow/issues/33596) - [C++][Parquet] Parquet page index read support (#14964) +* 
[GH-33621](https://github.com/apache/arrow/issues/33621) - [Documentation][Developer Tools] Add CODEOWNERS file (#33622) +* [GH-33631](https://github.com/apache/arrow/issues/33631) - [R] Rewrite Jira ticket numbers in pkgdown documents to GitHub issue numbers (#34260) +* [GH-33640](https://github.com/apache/arrow/issues/33640) - [C++] Add backpressure to asof join node (#33648) +* [GH-33652](https://github.com/apache/arrow/issues/33652) - [C++][Parquet] Add interface total_compressed_bytes_written (#33897) +* [GH-33655](https://github.com/apache/arrow/issues/33655) - [C++][Parquet] Fix occasional failure in TestArrowReadWrite.MultithreadedWrite (#33739) +* [GH-33655](https://github.com/apache/arrow/issues/33655) - [C++][Parquet] Write parquet columns in parallel (#33656) +* [GH-33659](https://github.com/apache/arrow/issues/33659) - [Developer Tools] Add definition of Breaking Change and Critical Fix (#33660) +* [GH-33673](https://github.com/apache/arrow/issues/33673) - [C++] Standardize as-of-join convention for past and future tolerance (#33676) +* [GH-33679](https://github.com/apache/arrow/issues/33679) - [JS] Update dependencies (#33680) +* [GH-33681](https://github.com/apache/arrow/issues/33681) - [JS] Update flatbuffers (#33682) +* [GH-33723](https://github.com/apache/arrow/issues/33723) - [C++] re2::RE2::RE2() result must be checked (#33806) +* [GH-33724](https://github.com/apache/arrow/issues/33724) - [Doc] Update the substrait conformance doc with the latest support (#33725) +* [GH-33734](https://github.com/apache/arrow/issues/33734) - [Go] make compatible with grpc < 1.45 (#33735) +* [GH-33737](https://github.com/apache/arrow/issues/33737) - [C++] simplify exec plan tracing (#33738) +* [GH-33741](https://github.com/apache/arrow/issues/33741) - [Python] Address docstrings in Data Types Factory Functions (#33785) +* [GH-33742](https://github.com/apache/arrow/issues/33742) - [Python] Address docstrings in Data Types classes (#34380) +* 
[GH-33746](https://github.com/apache/arrow/issues/33746) - [R] Update NEWS.md for 11.0.0 (#33748) +* [GH-33750](https://github.com/apache/arrow/issues/33750) - [GLib] Add garrow_table_batch_reader_set_max_chunk_size() (#34601) +* [GH-33760](https://github.com/apache/arrow/issues/33760) - [R][C++] Handle nested field refs in scanner (#33770) +* [GH-33787](https://github.com/apache/arrow/issues/33787) - [C++] Suppress unused-value warning from LinuxParseCpuFlags() on s390x (#33828) +* [GH-33789](https://github.com/apache/arrow/issues/33789) - [Go] Add Err() to RecordReader (#33792) +* [GH-33794](https://github.com/apache/arrow/issues/33794) - [Go] Add SetRecordReader to PreparedStatement (#33795) +* [GH-33800](https://github.com/apache/arrow/issues/33800) - [Packaging] Drop support for Ubuntu 18.04 (#34020) +* [GH-33825](https://github.com/apache/arrow/issues/33825) - [Python] Expose pyarrow.dataset.get_partition_keys publicly (get key/value from partition expression) (#33862) +* [GH-33835](https://github.com/apache/arrow/issues/33835) - [Doc][Release] Improvements to release guide instructions (#33836) +* [GH-33840](https://github.com/apache/arrow/issues/33840) - [Go] Improve SQLite Flight SQL Example and provide mainprog (#33841) +* [GH-33850](https://github.com/apache/arrow/issues/33850) - [C++] Allow Substrait's default extension provider to be configured (fix) (#34075) +* [GH-33850](https://github.com/apache/arrow/issues/33850) - [C++] Allow Substrait's default extension provider to be configured (#34042) +* [GH-33851](https://github.com/apache/arrow/issues/33851) - [C++] Update bundled boost version (#33890) +* [GH-33852](https://github.com/apache/arrow/issues/33852) - [Go] Return a catalog/schema from Flight SQL example server (#33853) +* [GH-33859](https://github.com/apache/arrow/issues/33859) - [C++][Java] Bump Apache ORC to v1.8.2 (#33860) +* [GH-33867](https://github.com/apache/arrow/issues/33867) - [Go][FlightSQL] Allow passing grpc call options to 
PreparedStatement methods (#33868) +* [GH-33872](https://github.com/apache/arrow/issues/33872) - [C++] Remove hacky shared_ptr construction in AppendScalar (#33866) +* [GH-33874](https://github.com/apache/arrow/issues/33874) - [Java] Ensure custom headers are included during JDBC auth handshake (#33946) +* [GH-33875](https://github.com/apache/arrow/issues/33875) - [Go] Handle writing LargeString and LargeBinary types (#33965) +* [GH-33892](https://github.com/apache/arrow/issues/33892) - [R] Map `dplyr::n()` to `count_all` kernel (#33917) +* [GH-33895](https://github.com/apache/arrow/issues/33895) - [Release] Add a script to add new owner of our RubyGems (#33896) +* [GH-33899](https://github.com/apache/arrow/issues/33899) - [C++] Add NamedTapRel relation as a Substrait extension (#33909) +* [GH-33901](https://github.com/apache/arrow/issues/33901) - [Go] Add a malloc-based allocator (#33902) +* [GH-33923](https://github.com/apache/arrow/issues/33923) - [Docs] Tensor canonical extension type specification (#33925) +* [GH-33924](https://github.com/apache/arrow/issues/33924) - [Format] Fixed shape Tensor as a canonical extension type +* [GH-33926](https://github.com/apache/arrow/issues/33926) - [Python] DataFrame Interchange Protocol for pyarrow.RecordBatch (#34294) +* [GH-33935](https://github.com/apache/arrow/issues/33935) - [Go][FlightRPC] Implement Flight SQL extensions (#34039) +* [GH-33936](https://github.com/apache/arrow/issues/33936) - [Go] C Data Interface: export dummy buffer for nil buffers (#33951) +* [GH-33957](https://github.com/apache/arrow/issues/33957) - [C++] Add Rank chunked array benchmarks (#34602) +* [GH-33972](https://github.com/apache/arrow/issues/33972) - [C++] Pass in metadata to ParquetReader (#34015) +* [GH-33977](https://github.com/apache/arrow/issues/33977) - [Dev] PR Workflow automation bot (#34161) +* [GH-33990](https://github.com/apache/arrow/issues/33990) - [C++] I know NAN != NAN but shouldn't literal(NAN) == literal(NAN)? 
+* [GH-33993](https://github.com/apache/arrow/issues/33993) - [Java] Let OS assign port in tests while creating Flight server (#33992) +* [GH-33998](https://github.com/apache/arrow/issues/33998) - [R] Update vignettes to reference the new open_*_dataset functions (#34710) +* [GH-34003](https://github.com/apache/arrow/issues/34003) - [C++][nodiscard] (#34006) +* [GH-34004](https://github.com/apache/arrow/issues/34004) - [C++] Add a benchmarks-maximal CMake preset (#34005) +* [GH-34007](https://github.com/apache/arrow/issues/34007) - [C++] Add an array_span_mutable interface to ExecResult (#34008) +* [GH-34011](https://github.com/apache/arrow/issues/34011) - [Doc] Ensure substrait is enabled on complete doc build (#34024) +* [GH-34011](https://github.com/apache/arrow/issues/34011) - [Python][Doc] Add pyarrow.substrait to pyarrow's API reference docs (#34012) +* [GH-34051](https://github.com/apache/arrow/issues/34051) - [C++] GcsFileSystem lazily starts sequential reads (#34052) +* [GH-34053](https://github.com/apache/arrow/issues/34053) - [C++][Parquet] Write parquet page index (#34054) +* [GH-34055](https://github.com/apache/arrow/issues/34055) - [Go][CI] Add test run in CI that uses noasm tag (#34167) +* [GH-34056](https://github.com/apache/arrow/issues/34056) - [C++] Add Utility function to simplify converting any row-based structure into an `arrow::RecordBatchReader` or an `arrow::Table` (#34057) +* [GH-34059](https://github.com/apache/arrow/issues/34059) - [C++] Add a fetch node based on a batch index (#34060) +* [GH-34063](https://github.com/apache/arrow/issues/34063) - [C++] Avoid waste in `GcsFileSystem::ReadAt()` (#34065) +* [GH-34074](https://github.com/apache/arrow/issues/34074) - [GLib][FlightRPC] Add support for authentication (#34090) +* [GH-34077](https://github.com/apache/arrow/issues/34077) - [Go] Implement RunEndEncoded Scalar (#34079) +* [GH-34078](https://github.com/apache/arrow/issues/34078) - [C++][Parquet] Minor API improvements for BloomFilter 
(#33995) +* [GH-34094](https://github.com/apache/arrow/issues/34094) - [C++] Increase Boost minimum version for clang >= 16 (#34100) +* [GH-34113](https://github.com/apache/arrow/issues/34113) - [C++][Thirdparty] Bump zstd to v1.5.4 (#34114) +* [GH-34118](https://github.com/apache/arrow/issues/34118) - [C++][Python] Make # of S3 event loop threads configurable (#34134) +* [GH-34119](https://github.com/apache/arrow/issues/34119) - [C#] operator to Schema (#34126) +* [GH-34122](https://github.com/apache/arrow/issues/34122) - [C++] Allow calling function registry functions without requiring a Substrait mapping (#34288) +* [GH-34136](https://github.com/apache/arrow/issues/34136) - [C++] Add a concept of ordering to ExecPlan (#34137) +* [GH-34142](https://github.com/apache/arrow/issues/34142) - [C++][Parquet] Fix record not to span multiple pages (#34193) +* [GH-34147](https://github.com/apache/arrow/issues/34147) - [C++][Parquet] Support crc count and checking on DICTIONARY_PAGE (#34254) +* [GH-34154](https://github.com/apache/arrow/issues/34154) - [Python] Add `is_nan` method to Array and Expression (#34184) +* [GH-34157](https://github.com/apache/arrow/issues/34157) - [C++] Configure bundled AWS SDK to use aws-lc instead of OpenSSL (#34159) +* [GH-34171](https://github.com/apache/arrow/issues/34171) - [Go][Compute] Implement "Unique" kernel (#34172) +* [GH-34174](https://github.com/apache/arrow/issues/34174) - [Docs][Release] Add Twitter to post-release tasks (#34202) +* [GH-34186](https://github.com/apache/arrow/issues/34186) - [Go] Add arrow.MapOfWithMetadata to support (#34207) +* [GH-34197](https://github.com/apache/arrow/issues/34197) - [R][CI] Add previous R package versions to backwards compatibility CI jobs (#34198) +* [GH-34199](https://github.com/apache/arrow/issues/34199) - [R] Increment R package version in NEWS.md (#34200) +* [GH-34219](https://github.com/apache/arrow/issues/34219) - [Go][FlightRPC] Add Transactions to Sqlite FlightSQL example (#34220) 
+* [GH-34242](https://github.com/apache/arrow/issues/34242) - [C++][Parquet] Optimize comment and move for shared_ptr in parquet schema (#34243) +* [GH-34248](https://github.com/apache/arrow/issues/34248) - [Python] Expose the order_by node (#34654) +* [GH-34248](https://github.com/apache/arrow/issues/34248) - [C++] Add an order_by node (#34249) +* [GH-34257](https://github.com/apache/arrow/issues/34257) - [Docs] Update git links/branches from master to main for external projects (#34502) +* [GH-34262](https://github.com/apache/arrow/issues/34262) - [C++][ORC] Support union type (#34416) +* [GH-34266](https://github.com/apache/arrow/issues/34266) - [C++] Add a pivot_longer node (#34267) +* [GH-34278](https://github.com/apache/arrow/issues/34278) - [C++] Expose schema in named table provider (#34279) +* [GH-34280](https://github.com/apache/arrow/issues/34280) - [C++][Python] Clarify meaning of row_group_size and change default to 1Mi (#34281) +* [GH-34322](https://github.com/apache/arrow/issues/34322) - [C++][Parquet] Encoding Microbench for ByteArray (#34323) +* [GH-34330](https://github.com/apache/arrow/issues/34330) - [Go][Parquet] Add Extension type support (#34631) +* [GH-34332](https://github.com/apache/arrow/issues/34332) - [Go][FlightRPC] Add driver for `database/sql` framework (#34331) +* [GH-34334](https://github.com/apache/arrow/issues/34334) - [Go][CSV] Support list fields (#34343) +* [GH-34335](https://github.com/apache/arrow/issues/34335) - [C++][Parquet] Optimize Decoding DELTA_LENGTH_BYTE_ARRAY (#34955) +* [GH-34339](https://github.com/apache/arrow/issues/34339) - [R] Add `skip_rows_after_names` option to `read_csv_arrow`'s options (#34340) +* [GH-34359](https://github.com/apache/arrow/issues/34359) - [Python] Add select method to pyarrow.RecordBatch (#34360) +* [GH-34361](https://github.com/apache/arrow/issues/34361) - [C++] Fix the handling of logical nulls for types without bitmaps like Unions and Run-End Encoded (#34408) +* 
[GH-34382](https://github.com/apache/arrow/issues/34382) - [C++] Support more types in run_end_encode and run_end_decode functions (#34761) +* [GH-34388](https://github.com/apache/arrow/issues/34388) - [C++] Build core compute kernels unconditionally (#34295) +* [GH-34398](https://github.com/apache/arrow/issues/34398) - [R] Update NEWS.md for 11.0.0.3 (#34399) +* [GH-34405](https://github.com/apache/arrow/issues/34405) - [C++] Add support for custom names in QueryOptions. Wire this up to Substrait (#34406) +* [GH-34411](https://github.com/apache/arrow/issues/34411) - [Python] Change array constructor to accept pyarrow array (#34275) +* [GH-34417](https://github.com/apache/arrow/issues/34417) - [C++][Flight] Upgrade OpenTelemetry SemanticConventions header (#34419) +* [GH-34421](https://github.com/apache/arrow/issues/34421) - [R] Let GcsFileSystem take a path for json_credentials (#34524) +* [GH-34422](https://github.com/apache/arrow/issues/34422) - [R] Expose GcsFileSystem$options (#34477) +* [GH-34425](https://github.com/apache/arrow/issues/34425) - [GLib] Add GArrowRankOptions (#34458) +* [GH-34428](https://github.com/apache/arrow/issues/34428) - [Python][Docs] Add docstring for `make_fragment` (#34429) +* [GH-34437](https://github.com/apache/arrow/issues/34437) - [R] Use FetchNode and OrderByNode (#34685) +* [GH-34440](https://github.com/apache/arrow/issues/34440) - [Ruby] Add support for `RecordBatch{File,Stream}Reader#each` without block (#34441) +* [GH-34442](https://github.com/apache/arrow/issues/34442) - [Ruby][FlightRPC] Add `ArrowFlight::RecordBatchReader#each` (#34444) +* [GH-34453](https://github.com/apache/arrow/issues/34453) - [Go] Support Builders for user defined extensions (#34454) +* [GH-34481](https://github.com/apache/arrow/issues/34481) - [CI] Migrate ARM jobs from Travis to self-hosted runners (#34482) +* [GH-34499](https://github.com/apache/arrow/issues/34499) - [R] Bump version in NEWS.md following release (#34500) +* 
[GH-34536](https://github.com/apache/arrow/issues/34536) - [Parquet][C++] Overwrite default config for DeltaBitPackEncoder (#34632) +* [GH-34543](https://github.com/apache/arrow/issues/34543) - [CI] Self-hosted ARM workflows improvements (#34512) +* [GH-34547](https://github.com/apache/arrow/issues/34547) - [C++][ORC] Remove deprecated ORC_UNIQUE_PTR (#34548) +* [GH-34552](https://github.com/apache/arrow/issues/34552) - [C++][Parquet] Sync parquet.thrift from upstream (#34553) +* [GH-34561](https://github.com/apache/arrow/issues/34561) - [C++] Implement RunEndEncodedBuilder::AppendEmptyValues() (#34562) +* [GH-34564](https://github.com/apache/arrow/issues/34564) - [Python][C++] Update code to compile with cython 3 (#34726) +* [GH-34565](https://github.com/apache/arrow/issues/34565) - [C++] Teach dataset_writer to accept custom filename functor (#34984) +* [GH-34572](https://github.com/apache/arrow/issues/34572) - [Go][CSV] Add binary support for CSV (#34558) +* [GH-34581](https://github.com/apache/arrow/issues/34581) - [C++][Java] Bump Apache ORC to v1.8.3 (#34582) +* [GH-34584](https://github.com/apache/arrow/issues/34584) - [Go][CSV] Add extension types support (#34585) +* [GH-34590](https://github.com/apache/arrow/issues/34590) - [C++][ORC] Fix timestamp type mapping between orc and arrow (#34591) +* [GH-34595](https://github.com/apache/arrow/issues/34595) - [C++] Update google-cloud-cpp to v2.8.0 (#34707) +* [GH-34615](https://github.com/apache/arrow/issues/34615) - [CI][C++] Add CI job for basic format support without ARROW_COMPUTE (#34617) +* [GH-34626](https://github.com/apache/arrow/issues/34626) - [C++] Add ordered/segmented aggregation Substrait extension (#34627) +* [GH-34630](https://github.com/apache/arrow/issues/34630) - [C++] Second block of refactoring to move acero out of libarrow (#34575) +* [GH-34638](https://github.com/apache/arrow/issues/34638) - [C++][Docs] Add documentation for minimal build flags (#34693) +* 
[GH-34644](https://github.com/apache/arrow/issues/34644) - [C++] Prefer unsafe casting by default in Substrait (#34645) +* [GH-34650](https://github.com/apache/arrow/issues/34650) - [GLib] Add GArrowFilterNodeOptions (#34663) +* [GH-34659](https://github.com/apache/arrow/issues/34659) - [C++] Review the validation processes around Run-End Encoded arrays to improve the Python integration (#34628) +* [GH-34665](https://github.com/apache/arrow/issues/34665) - [Parquet][C++] Allow Reading BloomFilter (#34728) +* [GH-34669](https://github.com/apache/arrow/issues/34669) - [Packaging][Conda] Update arrow feedstock dependencies (#34652) +* [GH-34673](https://github.com/apache/arrow/issues/34673) - [C++][Parquet] Add Boolean Encoding benchmark for parquet (#34676) +* [GH-34686](https://github.com/apache/arrow/issues/34686) - [Python] Add RunEndEncodedScalar class (#34924) +* [GH-34687](https://github.com/apache/arrow/issues/34687) - [CI][Python] Create job to remove old nightly wheels from gemfury (#34705) +* [GH-34692](https://github.com/apache/arrow/issues/34692) - [Java] Expose Location.toSocketAddress (#34648) +* [GH-34700](https://github.com/apache/arrow/issues/34700) - [Packaging][RPM] Use lz4-libs instead of lz4 on AlmaLinux 8+ (#34716) +* [GH-34703](https://github.com/apache/arrow/issues/34703) - [Python] Set copy=False explicitly when creating a pandas Series (#34593) +* [GH-34737](https://github.com/apache/arrow/issues/34737) - [C#] C Data interface for schemas and types (#34133) +* [GH-34742](https://github.com/apache/arrow/issues/34742) - [Java] Split flight-sql-jdbc-driver to facilitate reuse (#34678) +* [GH-34768](https://github.com/apache/arrow/issues/34768) - [C++][Gandiva] Remove LLVM<16 pin (#34922) +* [GH-34768](https://github.com/apache/arrow/issues/34768) - [C++][Gandiva] Accept LLVM 16 (#34916) +* [GH-34778](https://github.com/apache/arrow/issues/34778) - [Java] Only apply ServerInterceptorAdapter logic to Flight service requests (#34815) +* 
[GH-34790](https://github.com/apache/arrow/issues/34790) - [Go] Add array.Edits.UnifiedDiff (#34827) +* [GH-34790](https://github.com/apache/arrow/issues/34790) - [Go] Add array.Diff() (#34806) +* [GH-34796](https://github.com/apache/arrow/issues/34796) - [C++] Add FromTensor, ToTensor and strides methods to FixedShapeTensorArray (#34797) +* [GH-34802](https://github.com/apache/arrow/issues/34802) - [C++][Parquet] Allow passing pool to decoder (#34803) +* [GH-34805](https://github.com/apache/arrow/issues/34805) - [CI][Python] Cython test is failing in conda packaging builds +* [GH-34812](https://github.com/apache/arrow/issues/34812) - [Packaging][Python] Use self-hosted arm64 Linux runner instead of Travis CI for Linux arm64 wheels (#34835) +* [GH-34813](https://github.com/apache/arrow/issues/34813) - [C++] Improve GoogleTest detection (#34920) +* [GH-34819](https://github.com/apache/arrow/issues/34819) - [Ruby] Add Slicer::ColumnCondition#match_substring (#34902) +* [GH-34821](https://github.com/apache/arrow/issues/34821) - [DOC][ORC] Update documentation for ORC (#34822) +* [GH-34832](https://github.com/apache/arrow/issues/34832) - [Go] Add Record SetColumn method (#34794) +* [GH-34837](https://github.com/apache/arrow/issues/34837) - [GLib][Ruby] Add Arrow::{Sparse,Dense}UnionArray#get_value (#34838) +* [GH-34839](https://github.com/apache/arrow/issues/34839) - [Go] Build compute without noasm for non-amd64 GOARCH (#34840) +* [GH-34853](https://github.com/apache/arrow/issues/34853) - [Go] Add TotalRecordSize, TotalArraySize (#34854) +* [GH-34855](https://github.com/apache/arrow/issues/34855) - [Go] Add GetValue function to Metadata (#34856) +* [GH-34863](https://github.com/apache/arrow/issues/34863) - [Go] Pow method for Decimal DataTypes (#34864) +* [GH-34879](https://github.com/apache/arrow/issues/34879) - [Python][CI] Nightly integration tests with latest dask are failing (test\_null\_partition\_pyarrow) +* 
[GH-34880](https://github.com/apache/arrow/issues/34880) - [Python][CI] Fix Windows tests failing with latest pandas 2.0 (#34881) +* [GH-34882](https://github.com/apache/arrow/issues/34882) - [Python] Binding for FixedShapeTensorType (#34883) +* [GH-34888](https://github.com/apache/arrow/issues/34888) - [C++][Parquet] Writer supports adding extra kv meta (#34889) +* [GH-34893](https://github.com/apache/arrow/issues/34893) - [C++] Fix run-end encoded array iterator issues that manifest on backwards iteration (#34896) +* [GH-34899](https://github.com/apache/arrow/issues/34899) - [C++] Dependency: bump zstd to v1.5.5 (#34900) +* [GH-34914](https://github.com/apache/arrow/issues/34914) - [Packaging][Linux] Add support for Acero (#34915) +* [GH-34945](https://github.com/apache/arrow/issues/34945) - [C++][Docs] Add missing cmake_minimum_required() to example (#34969) +* [GH-34946](https://github.com/apache/arrow/issues/34946) - [Ruby] Remove DictionaryArrayBuilder related omissions (#34947) +* [GH-34951](https://github.com/apache/arrow/issues/34951) - [Ruby] Add methods using MatchSubStringFamilyCondition (#34952) +* [GH-34956](https://github.com/apache/arrow/issues/34956) - [Docs][Python] Add to docs the usage of the FixedShapeTensorType (#34957) +* [GH-34962](https://github.com/apache/arrow/issues/34962) - [Go] Make GetOneForMarshal public on Array interface (#34964) +* [GH-34968](https://github.com/apache/arrow/issues/34968) - [C++] Add Equal Options to RecordBatch (#34970) +* [GH-35025](https://github.com/apache/arrow/issues/35025) - [Python] Remove use of deprecated pandas.Categorical fastpath keyword (#35026) +* [GH-35042](https://github.com/apache/arrow/issues/35042) - [Go][FlightSQL driver] Add TLS configuration (#35051) +* [GH-35078](https://github.com/apache/arrow/issues/35078) - [Python][CI] Tests on windows are running very slow +* [GH-35218](https://github.com/apache/arrow/issues/35218) - [R] Update NEWS for the R component/version 12.0.0 (#35219) +* 
[PARQUET-2201](https://issues.apache.org/jira/browse/PARQUET-2201) - [parquet-cpp] Add stress test for RecordReader ReadRecords and SkipRecords. (#14879) +* [PARQUET-2225](https://issues.apache.org/jira/browse/PARQUET-2225) - [C++][Parquet] Allow reading dense with RecordReader (#17877) +* [PARQUET-2232](https://issues.apache.org/jira/browse/PARQUET-2232) - [C++] Add an api to ColumnChunkMetaData to indicate if the column chunk uses a bloom filter (#33736) +* [PARQUET-2250](https://issues.apache.org/jira/browse/PARQUET-2250) - [C++][Parquet] Expose column descriptor through RecordReader (#34318) + + + # Apache Arrow 6.0.1 (2021-11-18) ## Bug Fixes