forked from apache/arrow-rs
Rework ipc_compression feature flags #1
Draft
alamb wants to merge 129 commits into liukun4515:flight_data_compression from alamb:alamb/help_feature_flags
* parquet: export json api with `serde_json` feature name * chore: don't piggyback on optional feature name
* add instructions Signed-off-by: remzi <[email protected]> * fmt Signed-off-by: remzi <[email protected]> * update discord link Signed-off-by: remzi <[email protected]>
* Add AmazonS3Config, MicrosoftAzureBuilder, GoogleCloudStorageBuilder * fix: improve docs * review feedback: remove old code, make with_client test only
* add append_option support to decimal builders * fix linting * pr comments
* Rename DataType::Decimal to DataType::Decimal128 * Update doc
) * split Signed-off-by: remzi <[email protected]> * rename Signed-off-by: remzi <[email protected]>
* Add LimitStore (apache#2175) * Review feedback * Fix test
…che#2231) * Automatically grow parquet BitWriter (apache#2226) * Review feedback
Signed-off-by: remzi <[email protected]>
…pache#2221) * Optimized writing of byte array to parquet (apache#1764) * Review feedback * Fix logical conflict
…cations (apache#2235) * Fix bug * Add tests * Update arrow/src/datatypes/types.rs Co-authored-by: Liang-Chi Hsieh <[email protected]> * Update arrow/src/datatypes/types.rs Co-authored-by: Andrew Lamb <[email protected]> Co-authored-by: Liang-Chi Hsieh <[email protected]>
* Update prost requirement from 0.10 to 0.11 Updates the requirements on [prost](https://github.com/tokio-rs/prost) to permit the latest version. - [Release notes](https://github.com/tokio-rs/prost/releases) - [Commits](tokio-rs/prost@v0.10.0...v0.11.0) --- updated-dependencies: - dependency-name: prost dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update tonic-build requirement from 0.7 to 0.8 Updates the requirements on [tonic-build](https://github.com/hyperium/tonic) to permit the latest version. - [Release notes](https://github.com/hyperium/tonic/releases) - [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md) - [Commits](hyperium/tonic@v0.7.0...v0.8.0) --- updated-dependencies: - dependency-name: tonic-build dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update tonic requirement from 0.7 to 0.8 Updates the requirements on [tonic](https://github.com/hyperium/tonic) to permit the latest version. - [Release notes](https://github.com/hyperium/tonic/releases) - [Changelog](https://github.com/hyperium/tonic/blob/master/CHANGELOG.md) - [Commits](hyperium/tonic@v0.7.0...v0.8.0) --- updated-dependencies: - dependency-name: tonic dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update prost-derive requirement from 0.10 to 0.11 Updates the requirements on [prost-derive](https://github.com/tokio-rs/prost) to permit the latest version. - [Release notes](https://github.com/tokio-rs/prost/releases) - [Commits](tokio-rs/prost@v0.10.0...v0.11.0) --- updated-dependencies: - dependency-name: prost-derive dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update prost-types requirement from 0.10.0 to 0.11.0 Updates the requirements on [prost-types](https://github.com/tokio-rs/prost) to permit the latest version. 
- [Release notes](https://github.com/tokio-rs/prost/releases) - [Commits](tokio-rs/prost@v0.10.0...v0.11.0) --- updated-dependencies: - dependency-name: prost-types dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Update vendored tonic/prost generated code * Install protoc in CI builds Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add FromIterator * For review
* update binary_from_list Signed-off-by: remzi <[email protected]> * fix binary from list Signed-off-by: remzi <[email protected]> * fix decimal from fixed list Signed-off-by: remzi <[email protected]> * fix fixed binary from fixed list Signed-off-by: remzi <[email protected]> * fix string from list Signed-off-by: remzi <[email protected]> * add child length check Signed-off-by: remzi <[email protected]> * clean the code Signed-off-by: remzi <[email protected]>
…he#2275) * fix bug: decimal cmp * optimizer the error message * address comment
…2251) * feat: Implement string cast operations for Time32 and Time64 * chore: Remove unnecessary leap second handling Remove the unnecessary conditionals to extract the leap second, as it is already handled when converting to a time unit relative to midnight 🤦🏻♂️ * chore: Inline trivial functions
* Handle symlinks in LocalFileSystem (apache#2206) * Update object_store/src/local.rs Co-authored-by: Andrew Lamb <[email protected]> Co-authored-by: Andrew Lamb <[email protected]>
* Improve crates.io page * Improve builder doc examples * Add examples in main library docs * Apply suggestions from code review Co-authored-by: Raphael Taylor-Davies <[email protected]> Co-authored-by: Raphael Taylor-Davies <[email protected]>
* Fix coverage and mac jobs -- still need to fix windows * try and fix coverage * comment out coverage
…pache#2237) * replace ArrayReader::next_batch with ArrayReader::read_records and ArrayReader::consume_batch. * fix ut * fix comment * avoid clone. * fix new ut * fix comment Co-authored-by: Raphael Taylor-Davies <[email protected]>
…dictionaries (apache#2391) * fix: Don't instantiate the scalar composition code quadratically for dictionaries
Instead, re-use the ones normal function. Reduces how much code `datafusion-physical-expr` generated significantly (since the functions are generic, and not instantiated in `arrow` itself, it only shows up downstream). https://github.com/apache/arrow-datafusion
There is technically an extra indirect call now as the recursive call to `eq_dyn_scalar` etc coerces to a `dyn Array` again but that seems unlikely to matter.
## cargo llvm-lines -p datafusion-physical-expr
### Before
```
  Lines          Copies         Function name
  -----          ------         -------------
  2270242 (100%)  38377 (100%)  (TOTAL)
   245854 (10.8%)  5580 (14.5%) core::option::Option<T>::ok_or_else
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::neq_dyn_scalar
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::neq_dyn_scalar::{{closure}}
    44929 (2.0%)    900 (2.3%)  core::option::Option<T>::map
    40986 (1.8%)    162 (0.4%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (1.7%)    508 (1.3%)  core::iter::traits::iterator::Iterator::fold
    30595 (1.3%)    245 (0.6%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (1.3%)     46 (0.1%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.2%)    285 (0.7%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.1%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.1%)    441 (1.1%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.0%)    174 (0.5%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.0%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (0.9%)     56 (0.1%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (0.9%)    441 (1.1%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (0.9%)    118 (0.3%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (0.7%)    246 (0.6%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
```
### After
```
  Lines          Copies         Function name
  -----          ------         -------------
  1475122 (100%)  28777 (100%)  (TOTAL)
    44929 (3.0%)    900 (3.1%)  core::option::Option<T>::map
    40986 (2.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (2.5%)    508 (1.8%)  core::iter::traits::iterator::Iterator::fold
    34174 (2.3%)    780 (2.7%)  core::option::Option<T>::ok_or_else
    30595 (2.1%)    245 (0.9%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (2.0%)     46 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.9%)    285 (1.0%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.8%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.7%)    441 (1.5%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.5%)    174 (0.6%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.5%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (1.5%)     56 (0.2%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (1.5%)    441 (1.5%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (1.4%)    118 (0.4%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (1.1%)    246 (0.9%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
    16146 (1.1%)    960 (3.3%)  core::iter::adapters::map::Map<I,F>::new
    15492 (1.1%)    427 (1.5%)  core::iter::traits::iterator::Iterator::for_each
    14921 (1.0%)    111 (0.4%)  alloc::vec::Vec<T,A>::extend_desugared
    14670 (1.0%)    126 (0.4%)  core::iter::adapters::try_process
    13918 (0.9%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_scalar_array
    13120 (0.9%)     64 (0.2%)  <arrow::array::array_primitive::PrimitiveArray<T> as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    12963 (0.9%)     52 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::try_fold
    12245 (0.8%)    180 (0.6%)  <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::fold::enumerate::{{closure}}
    12201 (0.8%)     81 (0.3%)  arrow::buffer::mutable::MutableBuffer::extend_from_iter
    11826 (0.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter::{{closure}}
    11536 (0.8%)    960 (3.3%)  core::iter::traits::iterator::Iterator::map
    11200 (0.8%)     32 (0.1%)  alloc::raw_vec::RawVec<T,A>::grow_amortized
```
* refactor: Avoid instantiating a quadratic number of closures due to try_to_type in comparisons (-4%) Reduces the number of llvm-lines in datafusion-physical-expr by another 4%
…apache#2401) * Exclude tags when generating changelogs * Fix release-tarball typo
* Add RowFilter API * Review feedback * Fix doc * Fix handling of NULL boolean array * Add tests, fix bugs * Fix clippy * Review feedback * Fix doc
* Upgrade ahash to 0.8 * Use hash_one * Use hash_one * Use hash_one * Use compile-time-rng for wasm * Use compile-time-rng for wasm * Use compile-time-rng for wasm * Clippy * Revert "Clippy" This reverts commit 4c693cb.
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v2...v3) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/labeler](https://github.com/actions/labeler) from 2.2.0 to 4.0.0. - [Release notes](https://github.com/actions/labeler/releases) - [Commits](actions/labeler@2.2.0...v4.0.0) --- updated-dependencies: - dependency-name: actions/labeler dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 1 to 4. - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](actions/setup-python@v1...v4) --- updated-dependencies: - dependency-name: actions/setup-python dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 2 to 3. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@v2...v3) --- updated-dependencies: - dependency-name: actions/setup-node dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Implement Skip for DeltaBitPackDecoder * move check out of loop * add bench * change to use batch read.
…he#2407) * Support peek_next_page and skip_next_page in InMemoryPageReader * fix comment
…ew_bytes` and add length bound for `Decimal::raw_value` (apache#2405) * add bound Signed-off-by: remzi <[email protected]> * update doc Signed-off-by: remzi <[email protected]> Signed-off-by: remzi <[email protected]>
Signed-off-by: remzi <[email protected]> Signed-off-by: remzi <[email protected]>
Co-authored-by: Kun Liu <[email protected]> Co-authored-by: Liang-Chi Hsieh <[email protected]> Co-authored-by: Raphael Taylor-Davies <[email protected]>
…into alamb/help_feature_flags
Currently a draft, as I haven't completed this work yet.
Note this targets the same branch as apache#1855.
Rationale
Feature flags are somewhat of a pain to deal with in arrow. This PR attempts to clean up the feature flag handling in @liukun4515's PR to add IPC compression, apache#1855.
Changes
CompressionCodecType
with the same interface.Result
)
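The list of changes above is only partly preserved, but the phrase "with the same interface" suggests keeping one type whose methods exist whether or not the `ipc_compression` feature is enabled. The following is a minimal sketch of that general pattern, not the code from this PR: the variants, the `compress` method, and the `CodecError` type are hypothetical names chosen for illustration, and the enabled branch is a placeholder that just copies its input.

```rust
use std::fmt;

/// Hypothetical codec selector. Only the type name comes from the PR text;
/// the variants and methods are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompressionCodecType {
    Lz4Frame,
    Zstd,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum CodecError {
    /// The named cargo feature was not enabled at compile time.
    FeatureDisabled(&'static str),
}

impl fmt::Display for CodecError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CodecError::FeatureDisabled(name) => {
                write!(f, "compiled without the `{}` feature", name)
            }
        }
    }
}

impl CompressionCodecType {
    /// Compiled only when the feature is on. A real build would call into
    /// the lz4 / zstd crates here; this placeholder copies the input.
    #[cfg(feature = "ipc_compression")]
    pub fn compress(&self, input: &[u8]) -> Result<Vec<u8>, CodecError> {
        Ok(input.to_vec())
    }

    /// Stub with the same signature, compiled when the feature is off, so
    /// callers need no `cfg` attributes of their own.
    #[cfg(not(feature = "ipc_compression"))]
    pub fn compress(&self, _input: &[u8]) -> Result<Vec<u8>, CodecError> {
        Err(CodecError::FeatureDisabled("ipc_compression"))
    }
}

fn main() {
    for codec in [CompressionCodecType::Lz4Frame, CompressionCodecType::Zstd] {
        match codec.compress(b"hello") {
            Ok(bytes) => println!("{:?}: {} bytes", codec, bytes.len()),
            Err(e) => println!("{:?}: {}", codec, e),
        }
    }
}
```

With this shape the `cfg` attributes live in one place, and call sites handle a disabled feature as an ordinary runtime error rather than a compile failure.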