Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Minor] Improve arrow and parquet READMEs, document parquet feature flags #2324

Merged
merged 4 commits into from
Aug 5, 2022

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Aug 4, 2022

Which issue does this PR close?

Noticed while I was working on #2172

Rationale for this change

The parquet crate has a non trivial number of feature flags. Let's consistently document the flags in crates.io and link to them in docs.rs

https://crates.io/crates/arrow
https://crates.io/crates/parquet

Also while I was messing around I found some other places where the docs could be improved so I did.

What changes are included in this PR?

  1. Document parquet feature flags
  2. Misc wording improvements

Are there any user-facing changes?

readme is different

@github-actions github-actions bot added arrow Changes to the arrow crate parquet Changes to the parquet crate labels Aug 4, 2022
- `chrono-tz` - support of parsing timezone using [chrono-tz](https://docs.rs/chrono-tz/0.6.0/chrono_tz/)
- `ffi` - bindings for the Arrow C [C Data Interface](https://arrow.apache.org/docs/format/CDataInterface.html)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newly added feature flags

@@ -18,41 +18,8 @@
//! A complete, safe, native Rust implementation of [Apache Arrow](https://arrow.apache.org), a cross-language
//! development platform for in-memory data.
//!
//! # Performance Tips
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this section was just introduced by @tustvold in #2305 but it turns out there was already a section at the end of the README.md document that had the same information. I consolidated them in README.md and added a link "please see crates.io page" here

@@ -33,16 +33,16 @@ We use the term "kernel" to refer to particular general operation that contains

Types of functions

* Scalar functions: elementwise functions that perform scalar operations in a
- Scalar functions: elementwise functions that perform scalar operations in a
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the prettier tool did this


## Performance
Arrow aims to be as fast as possible out of the box, whilst not compromising on safety. However,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This content was moved from what was added in #2305


The `parquet` crate provides the following features which may be enabled in your `Cargo.toml`:

- `arrow` (default) - support for reading / writing [`arrow`](https://crates.io/crates/arrow) arrays to / from parquet
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here are the parquet feature flags I could find

@alamb alamb added the documentation Improvements or additions to documentation label Aug 4, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #2324 (175e80e) into master (d87f6a4) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2324      +/-   ##
==========================================
- Coverage   80.53%   80.53%   -0.01%     
==========================================
  Files         248      248              
  Lines       59971    59971              
==========================================
- Hits        48296    48295       -1     
- Misses      11675    11676       +1     
Impacted Files Coverage Δ
parquet_derive/src/parquet_field.rs 65.75% <0.00%> (-0.23%) ⬇️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@@ -55,25 +64,25 @@ Arrow seeks to uphold the Rust Soundness Pledge as articulated eloquently [here]

Where soundness in turn is defined as:

> Code is unable to trigger undefined behaviour using safe APIs
> Code is unable to trigger undefined behavior using safe APIs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I shall British-ify the spelling one comment at a time 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol I am "correcting" the spelling (or maybe I can claim I am "standardizing" the spelling 😆

@tustvold tustvold merged commit 8e30d06 into apache:master Aug 5, 2022
Most of the compute kernels benefit a lot from being optimized for a specific CPU target.
This is especially so on x86-64 since without specifying a target the compiler can only assume support for SSE2 vector instructions.
One of the following values as `-Ctarget-cpu=value` in `RUSTFLAGS` can therefore improve performance significantly:
To address this it is recommended that you specify the override the LLVM defaults either
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops this keeps a typo from my PR, "specify the override the". Will fix

@ursabot
Copy link

ursabot commented Aug 5, 2022

Benchmark runs are scheduled for baseline = 297a8fa and contender = 8e30d06. 8e30d06 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb alamb deleted the alamb/document_feature_flags branch August 5, 2022 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate documentation Improvements or additions to documentation parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants