use prettier to auto format md files #398

Merged: 1 commit, Jun 5, 2021

14 changes: 13 additions & 1 deletion .github/workflows/dev.yml
@@ -27,7 +27,6 @@ env:
ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}

jobs:

lint:
name: Lint C++, Python, R, Rust, Docker, RAT
runs-on: ubuntu-latest
@@ -41,3 +40,16 @@ jobs:
run: pip install -e dev/archery[docker]
- name: Lint
run: archery lint --rat
prettier:
name: Use prettier to check formatting of documents
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-node@v2
with:
node-version: "14"
- name: Prettier check
run: |
# if you encounter error, try rerun the command below with --write instead of --check
# and commit the changes
npx [email protected] --check {arrow,arrow-flight,dev,integration-testing,parquet}/**/*.md README.md CODE_OF_CONDUCT.md CONTRIBUTING.md
4 changes: 2 additions & 2 deletions CODE_OF_CONDUCT.md
@@ -19,6 +19,6 @@

# Code of Conduct

* [Code of Conduct for The Apache Software Foundation][1]
- [Code of Conduct for The Apache Software Foundation][1]

[1]: https://www.apache.org/foundation/policies/conduct.html
[1]: https://www.apache.org/foundation/policies/conduct.html
26 changes: 13 additions & 13 deletions CONTRIBUTING.md
@@ -21,15 +21,15 @@

## Did you find a bug?

The Arrow project uses JIRA as a bug tracker. To report a bug, you'll have
The Arrow project uses JIRA as a bug tracker. To report a bug, you'll have
to first create an account on the
[Apache Foundation JIRA](https://issues.apache.org/jira/). The JIRA server
hosts bugs and issues for multiple Apache projects. The JIRA project name
[Apache Foundation JIRA](https://issues.apache.org/jira/). The JIRA server
hosts bugs and issues for multiple Apache projects. The JIRA project name
for Arrow is "ARROW".

To be assigned to an issue, ask an Arrow JIRA admin to go to
[Arrow Roles](https://issues.apache.org/jira/plugins/servlet/project-config/ARROW/roles),
click "Add users to a role," and add you to the "Contributor" role. Most
click "Add users to a role," and add you to the "Contributor" role. Most
committers are authorized to do this; if you're a committer and aren't
able to load that project admin page, have someone else add you to the
necessary role.
@@ -39,29 +39,29 @@ Before you create a new bug entry, we recommend you first
among existing Arrow issues.

When you create a new JIRA entry, please don't forget to fill the "Component"
field. Arrow has many subcomponents and this helps triaging and filtering
tremendously. Also, we conventionally prefix the issue title with the component
field. Arrow has many subcomponents and this helps triaging and filtering
tremendously. Also, we conventionally prefix the issue title with the component
name in brackets, such as "[C++] Crash in Array::Frobnicate()", so as to make
lists easier to navigate, and we'd be grateful if you did the same.

## Did you write a patch that fixes a bug or brings an improvement?

First create a JIRA entry as described above. Then, submit your changes
as a GitHub Pull Request. We'll ask you to prefix the pull request title
First create a JIRA entry as described above. Then, submit your changes
as a GitHub Pull Request. We'll ask you to prefix the pull request title
with the JIRA issue number and the component name in brackets.
(for example: "ARROW-2345: [C++] Fix crash in Array::Frobnicate()").
Respecting this convention makes it easier for us to process the backlog
of submitted Pull Requests.

### Minor Fixes

Any functionality change should have a JIRA opened. For minor changes that
affect documentation, you do not need to open up a JIRA. Instead you can
Any functionality change should have a JIRA opened. For minor changes that
affect documentation, you do not need to open up a JIRA. Instead you can
prefix the title of your PR with "MINOR: " if it meets the following guidelines:

* Grammar, usage and spelling fixes that affect no more than 2 files
* Documentation updates affecting no more than 2 files and not more
than 500 words.
- Grammar, usage and spelling fixes that affect no more than 2 files
- Documentation updates affecting no more than 2 files and not more
than 500 words.

## Do you want to propose a significant new feature or an important refactoring?

34 changes: 17 additions & 17 deletions README.md
@@ -25,29 +25,29 @@ Welcome to the implementation of Arrow, the popular in-memory columnar format, i

This part of the Arrow project is divided in 4 main components:

| Crate | Description | Documentation |
|-----------|-------------|---------------|
|Arrow | Core functionality (memory layout, arrays, low level computations) | [(README)](arrow/README.md) |
|Parquet | Parquet support | [(README)](parquet/README.md) |
|Arrow-flight | Arrow data between processes | [(README)](arrow-flight/README.md) |
|DataFusion | In-memory query engine with SQL support | [(README)](https://github.com/apache/arrow-datafusion/blob/master/README.md) |
|Ballista | Distributed query execution | [(README)](https://github.com/apache/arrow-datafusion/blob/master/ballista/README.md) |
| Crate | Description | Documentation |
| ------------ | ------------------------------------------------------------------ | ------------------------------------------------------------------------------------- |
| Arrow | Core functionality (memory layout, arrays, low level computations) | [(README)](arrow/README.md) |
| Parquet | Parquet support | [(README)](parquet/README.md) |
| Arrow-flight | Arrow data between processes | [(README)](arrow-flight/README.md) |
| DataFusion | In-memory query engine with SQL support | [(README)](https://github.com/apache/arrow-datafusion/blob/master/README.md) |
| Ballista | Distributed query execution | [(README)](https://github.com/apache/arrow-datafusion/blob/master/ballista/README.md) |

Independently, they support a vast array of functionality for in-memory computations.

Together, they allow users to write an SQL query or a `DataFrame` (using the `datafusion` crate), run it against a parquet file (using the `parquet` crate), evaluate it in-memory using Arrow's columnar format (using the `arrow` crate), and send to another process (using the `arrow-flight` crate).

Generally speaking, the `arrow` crate offers functionality to develop code that uses Arrow arrays, and `datafusion` offers most operations typically found in SQL, with the notable exceptions of:

* `join`
* `window` functions
- `join`
- `window` functions

There are too many features to enumerate here, but some notable mentions:

* `Arrow` implements all formats in the specification except certain dictionaries
* `Arrow` supports SIMD operations to some of its vertical operations
* `DataFusion` supports `async` execution
* `DataFusion` supports user-defined functions, aggregates, and whole execution nodes
- `Arrow` implements all formats in the specification except certain dictionaries
- `Arrow` supports SIMD operations to some of its vertical operations
- `DataFusion` supports `async` execution
- `DataFusion` supports user-defined functions, aggregates, and whole execution nodes

You can find more details about each crate in their respective READMEs.

@@ -118,7 +118,6 @@ export ARROW_TEST_DATA=$(cd ../testing/data; pwd)

From here on, this is a pure Rust project and `cargo` can be used to run tests, benchmarks, docs and examples as usual.


### Running the tests

Run tests using the Rust standard `cargo test` command:
@@ -156,9 +155,10 @@ If you use Visual Studio Code with the `rust-analyzer` plugin, you can enable `c
One of the concerns with `clippy` is that it often produces a lot of false positives, or that some recommendations may hurt readability. We do not have a policy of which lints are ignored, but if you disagree with a `clippy` lint, you may disable the lint and briefly justify it.

Search for `allow(clippy::` in the codebase to identify lints that are ignored/allowed. We currently prefer ignoring lints on the lowest unit possible.
* If you are introducing a line that returns a lint warning or error, you may disable the lint on that line.
* If you have several lints on a function or module, you may disable the lint on the function or module.
* If a lint is pervasive across multiple modules, you may disable it at the crate level.

- If you are introducing a line that returns a lint warning or error, you may disable the lint on that line.
- If you have several lints on a function or module, you may disable the lint on the function or module.
- If a lint is pervasive across multiple modules, you may disable it at the crate level.
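
The scoping rules above can be sketched as follows. This is a minimal illustration, not code from the Arrow codebase: the function names and the chosen lints (`clippy::needless_range_loop`, `clippy::too_many_arguments`) are assumptions picked only to show where each `allow` attribute goes.

```rust
// Function-level: the lint is allowed for this one function only.
// Justification (hypothetical): the arguments mirror a fixed external layout.
#[allow(clippy::too_many_arguments)]
fn weighted_sum(a: f64, b: f64, c: f64, d: f64, wa: f64, wb: f64, wc: f64, wd: f64) -> f64 {
    a * wa + b * wb + c * wc + d * wd
}

fn count_positive(values: &[i32]) -> usize {
    let mut count = 0;
    // Line-level: the lint is allowed on a single statement.
    // Justification (hypothetical): the index mirrors an offset-based layout.
    #[allow(clippy::needless_range_loop)]
    for i in 0..values.len() {
        if values[i] > 0 {
            count += 1;
        }
    }
    count
}

// Crate-level: the inner-attribute form goes at the top of the crate root
// (lib.rs or main.rs), shown here as a comment only:
//
//     #![allow(clippy::too_many_arguments)]
```

Keeping the attribute on the smallest unit means a new violation elsewhere in the crate is still reported.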

## Git Pre-Commit Hook

36 changes: 18 additions & 18 deletions arrow/README.md
@@ -79,12 +79,12 @@ The above script will run the `flatc` compiler and perform some adjustments to t

Arrow uses the following features:

* `simd` - Arrow uses the [packed_simd](https://crates.io/crates/packed_simd) crate to optimize many of the
implementations in the [compute](https://github.com/apache/arrow/tree/master/rust/arrow/src/compute)
module using SIMD intrinsics. These optimizations are turned *off* by default.
If the `simd` feature is enabled, an unstable version of Rust is required (we test with `nightly-2021-03-24`)
* `flight` which contains useful functions to convert between the Flight wire format and Arrow data
* `prettyprint` which is a utility for printing record batches
- `simd` - Arrow uses the [packed_simd](https://crates.io/crates/packed_simd) crate to optimize many of the
implementations in the [compute](https://github.com/apache/arrow/tree/master/rust/arrow/src/compute)
module using SIMD intrinsics. These optimizations are turned _off_ by default.
If the `simd` feature is enabled, an unstable version of Rust is required (we test with `nightly-2021-03-24`)
- `flight` which contains useful functions to convert between the Flight wire format and Arrow data
- `prettyprint` which is a utility for printing record batches

Other than `simd` all the other features are enabled by default. Disabling `prettyprint` might be necessary in order to
compile Arrow to the `wasm32-unknown-unknown` WASM target.
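
As a rough illustration of how a Cargo feature such as `simd` can select an implementation at compile time, here is a minimal sketch. The names `sum_values` and `active_path` are hypothetical and not taken from the Arrow crate, and the feature-gated branch is a stand-in chunked loop rather than real SIMD intrinsics.

```rust
/// Compiled only when the crate is built with `--features simd`.
/// Stand-in for an accelerated path: sums fixed-size chunks.
#[cfg(feature = "simd")]
fn sum_values(values: &[i64]) -> i64 {
    values.chunks(4).map(|chunk| chunk.iter().sum::<i64>()).sum()
}

/// Default path, compiled when the `simd` feature is off.
#[cfg(not(feature = "simd"))]
fn sum_values(values: &[i64]) -> i64 {
    values.iter().sum()
}

/// Reports which path was compiled in; `cfg!` evaluates at compile time.
fn active_path() -> &'static str {
    if cfg!(feature = "simd") {
        "simd"
    } else {
        "scalar"
    }
}
```

Both variants return the same result, so callers do not need to know which feature set was enabled.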
@@ -99,12 +99,12 @@ This crate only accepts the usage of `unsafe` code upon careful consideration, a

Generally, `unsafe` should only be used when a `safe` counterpart is not available and there is no `safe` way to achieve additional performance in that area. The following is a summary of the current components of the crate that require `unsafe`:

* alloc, dealloc and realloc of buffers along cache lines
* Interpreting bytes as certain rust types, for access, representation and compute
* Foreign interfaces (C data interface)
* Inter-process communication (IPC)
* SIMD
* Performance (e.g. omit bounds checks, use of pointers to avoid bound checks)
- alloc, dealloc and realloc of buffers along cache lines
- Interpreting bytes as certain rust types, for access, representation and compute
- Foreign interfaces (C data interface)
- Inter-process communication (IPC)
- SIMD
- Performance (e.g. omit bounds checks, use of pointers to avoid bound checks)

#### cache-line aligned memory management

@@ -147,13 +147,13 @@ Usage of `unsafe` for performance reasons is justified only when all other alter

### Considerations when introducing `unsafe`

Usage of `unsafe` in this crate *must*:
Usage of `unsafe` in this crate _must_:

* not expose a public API as `safe` when there are necessary invariants for that API to be defined behavior.
* have code documentation for why `safe` is not used / possible
* have code documentation about which invariant the user needs to enforce to ensure [soundness](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library), or which
* invariant is being preserved.
* if applicable, use `debug_assert`s to relevant invariants (e.g. bound checks)
- not expose a public API as `safe` when there are necessary invariants for that API to be defined behavior.
- have code documentation for why `safe` is not used / possible
- have code documentation about which invariant the user needs to enforce to ensure [soundness](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library), or which
- invariant is being preserved.
- if applicable, use `debug_assert`s to relevant invariants (e.g. bound checks)
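
A minimal sketch of what these requirements might look like in practice, assuming two hypothetical helpers (`get_unchecked_i32` and `first_or_zero`) that are not part of the Arrow API: the `unsafe` function documents the invariant the caller must uphold and checks it with `debug_assert!`, while the safe wrapper enforces the invariant itself.

```rust
/// Returns the element at `index` without a bounds check.
///
/// A safe alternative (`values[index]`) exists; this hypothetical helper
/// skips the check for hot loops where the bound is already known.
///
/// # Safety
///
/// The caller must ensure `index < values.len()`. Violating this is
/// undefined behavior.
pub unsafe fn get_unchecked_i32(values: &[i32], index: usize) -> i32 {
    // Checked in debug builds only; release builds skip this assertion.
    debug_assert!(index < values.len());
    // SAFETY: the caller guarantees `index` is in bounds (see `# Safety`).
    unsafe { *values.get_unchecked(index) }
}

/// Safe wrapper: it can be exposed as safe because the invariant
/// is enforced locally before the unsafe call.
pub fn first_or_zero(values: &[i32]) -> i32 {
    if values.is_empty() {
        0
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        unsafe { get_unchecked_i32(values, 0) }
    }
}
```

The function that cannot verify its own invariant stays `unsafe`, and only the wrapper that checks the invariant is exposed as safe.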

Example of code documentation:

@@ -2,9 +2,9 @@ Revision: {revision}

Submitted crossbow builds: [{repo} @ {branch}](https://github.com/{repo}/branches/all?query={branch})

|Task|Status|
|----|------|
|docker-cpp-cmake32|[![CircleCI](https://img.shields.io/circleci/build/github/{repo}/{branch}-circle-docker-cpp-cmake32.svg)](https://circleci.com/gh/{repo}/tree/{branch}-circle-docker-cpp-cmake32)|
|wheel-osx-cp36m|[![TravisCI](https://img.shields.io/travis/{repo}/{branch}-travis-wheel-osx-cp36m.svg)](https://travis-ci.com/{repo}/branches)|
|wheel-osx-cp37m|[![TravisCI](https://img.shields.io/travis/{repo}/{branch}-travis-wheel-osx-cp37m.svg)](https://travis-ci.com/{repo}/branches)|
|wheel-win-cp36m|[![Appveyor](https://img.shields.io/appveyor/ci/{repo}/{branch}-appveyor-wheel-win-cp36m.svg)](https://ci.appveyor.com/project/{repo}/history)|
| Task | Status |
| ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| docker-cpp-cmake32 | [![CircleCI](https://img.shields.io/circleci/build/github/{repo}/{branch}-circle-docker-cpp-cmake32.svg)](https://circleci.com/gh/{repo}/tree/{branch}-circle-docker-cpp-cmake32) |
| wheel-osx-cp36m | [![TravisCI](https://img.shields.io/travis/{repo}/{branch}-travis-wheel-osx-cp36m.svg)](https://travis-ci.com/{repo}/branches) |
| wheel-osx-cp37m | [![TravisCI](https://img.shields.io/travis/{repo}/{branch}-travis-wheel-osx-cp37m.svg)](https://travis-ci.com/{repo}/branches) |
| wheel-win-cp36m | [![Appveyor](https://img.shields.io/appveyor/ci/{repo}/{branch}-appveyor-wheel-win-cp36m.svg)](https://ci.appveyor.com/project/{repo}/history) |