diff --git a/README.md b/README.md
index ccb527a1f977..63da8c1c1a96 100644
--- a/README.md
+++ b/README.md
@@ -35,9 +35,33 @@ Here are links to some important information
 - [Python DataFrame API](https://arrow.apache.org/datafusion-python/)
 - [Architecture](https://docs.rs/datafusion/latest/datafusion/index.html#architecture)
 
-## Building your project with DataFusion
+## What can you do with this crate?
 
-DataFusion is great for building projects and products like SQL interfaces, time series platforms, and domain specific query engines. [Click Here](https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users) to see a list known users.
+DataFusion is great for building projects such as domain specific query engines, new database platforms and data pipelines, query languages and more.
+It lets you start quickly from a fully working engine, and then customize those features specific to your use. [Click Here](https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users) to see a list known users.
+
+## Crate features
+
+Default features:
+
+- `compression`: reading files compressed with `xz2`, `bzip2`, `flate2`, and `zstd`
+- `crypto_expressions`: cryptographic functions such as `md5` and `sha256`
+- `encoding_expressions`: `encode` and `decode` functions
+- `regex_expressions`: regular expression functions, such as `regexp_match`
+- `unicode_expressions`: Include unicode aware functions such as `character_length`
+
+Optional features:
+
+- `avro`: support for reading the [Apache Avro] format
+- `backtrace`: include backtrace information in error messages
+- `pyarrow`: conversions between PyArrow and DataFusion types
+- `simd`: enable arrow-rs's manual `SIMD` kernels (requires Rust `nightly`)
+
+[apache avro]: https://avro.apache.org/
+
+## Rust Version Compatibility
+
+This crate is tested with the latest stable version of Rust. We do not currently test against other, older versions of the Rust compiler.
 
 ## Contributing to DataFusion
 
diff --git a/datafusion/common/src/pyarrow.rs b/datafusion/common/src/pyarrow.rs
index d18782e037ae..d78aa8b988f7 100644
--- a/datafusion/common/src/pyarrow.rs
+++ b/datafusion/common/src/pyarrow.rs
@@ -15,7 +15,7 @@
 // specific language governing permissions and limitations
 // under the License.
 
-//! PyArrow
+//! Conversions between PyArrow and DataFusion types
 
 use arrow::array::ArrayData;
 use arrow::pyarrow::{FromPyArrow, ToPyArrow};
diff --git a/datafusion/core/Cargo.toml b/datafusion/core/Cargo.toml
index 266ff855752b..d84d6a13c336 100644
--- a/datafusion/core/Cargo.toml
+++ b/datafusion/core/Cargo.toml
@@ -39,8 +39,8 @@ avro = ["apache-avro", "num-traits", "datafusion-common/avro"]
 backtrace = ["datafusion-common/backtrace"]
 compression = ["xz2", "bzip2", "flate2", "zstd", "async-compression"]
 crypto_expressions = ["datafusion-physical-expr/crypto_expressions", "datafusion-optimizer/crypto_expressions"]
-default = ["crypto_expressions", "encoding__expressions", "regex_expressions", "unicode_expressions", "compression"]
-encoding__expressions = ["datafusion-physical-expr/encoding_expressions"]
+default = ["crypto_expressions", "encoding_expressions", "regex_expressions", "unicode_expressions", "compression"]
+encoding_expressions = ["datafusion-physical-expr/encoding_expressions"]
 # Used for testing ONLY: causes all values to hash to the same value (test for collisions)
 force_hash_collisions = []
 pyarrow = ["datafusion-common/pyarrow"]
diff --git a/docs/source/user-guide/example-usage.md b/docs/source/user-guide/example-usage.md
index adaf780558bc..c631d552dd73 100644
--- a/docs/source/user-guide/example-usage.md
+++ b/docs/source/user-guide/example-usage.md
@@ -187,10 +187,6 @@ DataFusion is designed to be extensible at all points. To that end, you can prov
 - [x] User Defined `LogicalPlan` nodes
 - [x] User Defined `ExecutionPlan` nodes
 
-## Rust Version Compatibility
-
-This crate is tested with the latest stable version of Rust. We do not currently test against other, older versions of the Rust compiler.
-
 ## Optimized Configuration
 
 For an optimized build several steps are required. First, use the below in your `Cargo.toml`. It is