From 769a675b364840e0007bcd23c3607163801e47a3 Mon Sep 17 00:00:00 2001
From: Benson Muite
Date: Fri, 7 Jan 2022 01:11:01 +0300
Subject: [PATCH] Add more information on SIMD (#1138)

---
 arrow/CONTRIBUTING.md | 10 +++++++++-
 arrow/README.md       |  4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arrow/CONTRIBUTING.md b/arrow/CONTRIBUTING.md
index 843e1faf05e7..9ec2d5a52540 100644
--- a/arrow/CONTRIBUTING.md
+++ b/arrow/CONTRIBUTING.md
@@ -99,7 +99,15 @@ The arrow format declares a IPC protocol, which this crate supports. IPC is equi
 
 #### SIMD
 
-The API provided by the `packed_simd` library is currently `unsafe`. However, SIMD offers a significant performance improvement over non-SIMD operations.
+The API provided by the [packed_simd_2](https://docs.rs/packed_simd_2/latest/packed_simd_2/) crate is currently `unsafe`. However,
+SIMD offers a significant performance improvement over non-SIMD operations. A related crate in development is
+[portable-simd](https://rust-lang.github.io/portable-simd/core_simd/) which has a nice
+[beginners guide](https://github.com/rust-lang/portable-simd/blob/master/beginners-guide.md). These crates provide the ability
+for code on x86 and ARM architectures to use some of the available parallel register operations. As an example, if two arrays
+of numbers are added, [1,2,3,4] + [5,6,7,8], rather than using four instructions to add each of the elements of the arrays,
+one instruction can be used to add all four elements at the same time, which leads to improved time to solution. SIMD instructions
+are typically most effective when data is aligned to allow a single load instruction to bring multiple consecutive data elements
+to the registers, before use of a SIMD instruction.
 
 #### Performance
 
diff --git a/arrow/README.md b/arrow/README.md
index 960b718f458e..9ca8e5320a51 100644
--- a/arrow/README.md
+++ b/arrow/README.md
@@ -40,8 +40,8 @@ The arrow crate provides the following features which may be enabled:
 - `prettyprint` - support for formatting record batches as textual columns
 - `js` - support for building arrow for WebAssembly / JavaScript
 - `simd` - (_Requires Nightly Rust_) alternate optimized
-  implementations of some [compute](https://github.com/apache/arrow/tree/master/rust/arrow/src/compute)
-  kernels using explicit SIMD processor intrinsics.
+  implementations of some [compute](https://github.com/apache/arrow-rs/tree/master/arrow/src/compute/kernels)
+  kernels using explicit SIMD instructions available through [packed_simd_2](https://docs.rs/packed_simd_2/latest/packed_simd_2/).
 - `chrono-tz` - support of parsing timezone using [chrono-tz](https://docs.rs/chrono-tz/0.6.0/chrono_tz/)
 
 ## Safety
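
Note on the CONTRIBUTING.md hunk: the four-element addition it describes, [1,2,3,4] + [5,6,7,8], can be sketched in stable Rust as below. This is only an illustration of the lane-wise shape of a SIMD add, written with plain scalar code. It does not use the `packed_simd_2` or `portable-simd` APIs, and the function name `add_lanes` is hypothetical, not something defined in the arrow crate. Whether the compiler turns each chunk into a single vector instruction depends on the target and optimization level.

```rust
/// Hypothetical helper (not part of arrow or packed_simd_2): adds two slices
/// four elements at a time, mirroring the shape of a 4-wide SIMD add.
fn add_lanes(a: &[i32], b: &[i32]) -> Vec<i32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let mut out = Vec::with_capacity(a.len());

    let a_chunks = a.chunks_exact(4);
    let b_chunks = b.chunks_exact(4);
    // Elements left over when the length is not a multiple of four.
    let a_rem = a_chunks.remainder();
    let b_rem = b_chunks.remainder();

    for (ca, cb) in a_chunks.zip(b_chunks) {
        // One loop iteration stands in for one 4-wide vector add.
        let lanes = [ca[0] + cb[0], ca[1] + cb[1], ca[2] + cb[2], ca[3] + cb[3]];
        out.extend_from_slice(&lanes);
    }

    // Scalar tail for the leftover elements.
    for (x, y) in a_rem.iter().zip(b_rem) {
        out.push(x + y);
    }
    out
}

fn main() {
    let sum = add_lanes(&[1, 2, 3, 4], &[5, 6, 7, 8]);
    println!("{:?}", sum); // prints [6, 8, 10, 12]
}
```

Processing fixed-size chunks with a separate scalar tail is the usual structure of SIMD kernels as well: the chunk width matches the register width, and the tail handles lengths that are not a multiple of it.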