Datum based comparison kernels (#4596) #4701

Merged (11 commits) on Aug 18, 2023
10 changes: 4 additions & 6 deletions .github/workflows/arrow.yml
@@ -80,8 +80,8 @@ jobs:
         run: cargo test -p arrow-json --all-features
       - name: Test arrow-string with all features
         run: cargo test -p arrow-string --all-features
-      - name: Test arrow-ord with all features except SIMD
-        run: cargo test -p arrow-ord --features dyn_cmp_dict
+      - name: Test arrow-ord with all features
+        run: cargo test -p arrow-ord --all-features
       - name: Test arrow-arith with all features except SIMD
         run: cargo test -p arrow-arith
       - name: Test arrow-row with all features
@@ -145,8 +145,6 @@ jobs:
         rust-version: nightly
       - name: Test arrow-array with SIMD
         run: cargo test -p arrow-array --features simd
-      - name: Test arrow-ord with SIMD
-        run: cargo test -p arrow-ord --features simd
       - name: Test arrow-arith with SIMD
         run: cargo test -p arrow-arith --features simd
       - name: Test arrow with SIMD
@@ -206,8 +204,8 @@ jobs:
         run: cargo clippy -p arrow-json --all-targets --all-features -- -D warnings
       - name: Clippy arrow-string with all features
         run: cargo clippy -p arrow-string --all-targets --all-features -- -D warnings
-      - name: Clippy arrow-ord with all features except SIMD
-        run: cargo clippy -p arrow-ord --all-targets --features dyn_cmp_dict -- -D warnings
+      - name: Clippy arrow-ord with all features
+        run: cargo clippy -p arrow-ord --all-targets --all-features -- -D warnings
       - name: Clippy arrow-arith with all features except SIMD
         run: cargo clippy -p arrow-arith --all-targets -- -D warnings
       - name: Clippy arrow-row with all features
2 changes: 1 addition & 1 deletion .github/workflows/miri.sh
@@ -15,4 +15,4 @@ cargo miri test -p arrow-data --features ffi
 cargo miri test -p arrow-schema --features ffi
 cargo miri test -p arrow-array
 cargo miri test -p arrow-arith --features simd
-cargo miri test -p arrow-ord --features simd
+cargo miri test -p arrow-ord
7 changes: 4 additions & 3 deletions arrow-flight/src/sql/metadata/db_schemas.rs
@@ -22,8 +22,8 @@
 use std::sync::Arc;

 use arrow_arith::boolean::and;
-use arrow_array::{builder::StringBuilder, ArrayRef, RecordBatch};
-use arrow_ord::comparison::eq_utf8_scalar;
+use arrow_array::{builder::StringBuilder, ArrayRef, RecordBatch, Scalar, StringArray};
+use arrow_ord::cmp::eq;
 use arrow_schema::{DataType, Field, Schema, SchemaRef};
 use arrow_select::{filter::filter_record_batch, take::take};
 use arrow_string::like::like_utf8_scalar;
@@ -129,7 +129,8 @@ impl GetDbSchemasBuilder {
         }

         if let Some(catalog_filter_name) = catalog_filter {
-            filters.push(eq_utf8_scalar(&catalog_name, &catalog_filter_name)?);
+            let scalar = StringArray::from_iter_values([catalog_filter_name]);

Review comment (Contributor): As a user I find this construction somewhat awkward for creating a simple scalar. What would you think about creating convenience functions that would let this code read like:

    let scalar = Scalar::new_utf8(catalog_filter_name);

(This doesn't have to be part of this PR; I can add it as a follow-on ticket/PR.)

Reply (Contributor, Author): I'll have a think about what we can do here.

+            filters.push(eq(&catalog_name, &Scalar::new(&scalar))?);
         }

         // `AND` any filters together
14 changes: 8 additions & 6 deletions arrow-flight/src/sql/metadata/sql_info.rs
@@ -33,10 +33,9 @@ use arrow_array::builder::{
     ArrayBuilder, BooleanBuilder, Int32Builder, Int64Builder, Int8Builder, ListBuilder,
     MapBuilder, StringBuilder, UInt32Builder,
 };
-use arrow_array::cast::downcast_array;
-use arrow_array::RecordBatch;
+use arrow_array::{RecordBatch, Scalar};
 use arrow_data::ArrayData;
-use arrow_ord::comparison::eq_scalar;
+use arrow_ord::cmp::eq;
 use arrow_schema::{DataType, Field, Fields, Schema, SchemaRef, UnionFields, UnionMode};
 use arrow_select::filter::filter_record_batch;
 use once_cell::sync::Lazy;
@@ -425,13 +424,16 @@ impl SqlInfoData {
         &self,
         info: impl IntoIterator<Item = u32>,
     ) -> Result<RecordBatch> {
-        let arr: UInt32Array = downcast_array(self.batch.column(0).as_ref());
+        let arr = self.batch.column(0);
         let type_filter = info
             .into_iter()
-            .map(|tt| eq_scalar(&arr, tt))
+            .map(|tt| {
+                let s = UInt32Array::from(vec![tt]);
+                eq(arr, &Scalar::new(&s))
Review comment on lines +431 to +432 (Contributor): Similarly, I think this would read better like

    let s = Scalar::new_uint32(tt);
    eq(arr, &Scalar::new(&s))

Reply (Contributor, Author): Proposal in #4704.

+            })
             .collect::<std::result::Result<Vec<_>, _>>()?
             .into_iter()
-            // We know the arrays are of same length as they are produced fromn the same root array
+            // We know the arrays are of same length as they are produced from the same root array
             .reduce(|filter, arr| or(&filter, &arr).unwrap());
         if let Some(filter) = type_filter {
             Ok(filter_record_batch(&self.batch, &filter)?)
12 changes: 8 additions & 4 deletions arrow-flight/src/sql/metadata/tables.rs
@@ -23,8 +23,8 @@ use std::sync::Arc;

 use arrow_arith::boolean::{and, or};
 use arrow_array::builder::{BinaryBuilder, StringBuilder};
-use arrow_array::{ArrayRef, RecordBatch};
-use arrow_ord::comparison::eq_utf8_scalar;
+use arrow_array::{ArrayRef, RecordBatch, Scalar, StringArray};
+use arrow_ord::cmp::eq;
 use arrow_schema::{DataType, Field, Schema, SchemaRef};
 use arrow_select::{filter::filter_record_batch, take::take};
 use arrow_string::like::like_utf8_scalar;
@@ -184,12 +184,16 @@ impl GetTablesBuilder {
         let mut filters = vec![];

         if let Some(catalog_filter_name) = catalog_filter {
-            filters.push(eq_utf8_scalar(&catalog_name, &catalog_filter_name)?);
+            let scalar = StringArray::from_iter_values([catalog_filter_name]);
+            filters.push(eq(&catalog_name, &Scalar::new(&scalar))?);
         }

         let tt_filter = table_types_filter
             .into_iter()
-            .map(|tt| eq_utf8_scalar(&table_type, &tt))
+            .map(|tt| {
+                let scalar = StringArray::from_iter_values([tt]);
+                eq(&table_type, &Scalar::new(&scalar))
+            })
             .collect::<std::result::Result<Vec<_>, _>>()?
             .into_iter()
             // We know the arrays are of same length as they are produced fromn the same root array
9 changes: 4 additions & 5 deletions arrow-flight/src/sql/metadata/xdbc_info.rs
@@ -27,9 +27,8 @@
 use std::sync::Arc;

 use arrow_array::builder::{BooleanBuilder, Int32Builder, ListBuilder, StringBuilder};
-use arrow_array::cast::downcast_array;
-use arrow_array::{ArrayRef, Int32Array, ListArray, RecordBatch};
-use arrow_ord::comparison::eq_scalar;
+use arrow_array::{ArrayRef, Int32Array, ListArray, RecordBatch, Scalar};
+use arrow_ord::cmp::eq;
 use arrow_schema::{DataType, Field, Schema, SchemaRef};
 use arrow_select::filter::filter_record_batch;
 use arrow_select::take::take;
@@ -81,8 +80,8 @@ impl XdbcTypeInfoData {
     /// from [`CommandGetXdbcTypeInfo`]
     pub fn record_batch(&self, data_type: impl Into<Option<i32>>) -> Result<RecordBatch> {
         if let Some(dt) = data_type.into() {
-            let arr: Int32Array = downcast_array(self.batch.column(1).as_ref());
-            let filter = eq_scalar(&arr, dt)?;
+            let scalar = Int32Array::from(vec![dt]);
+            let filter = eq(self.batch.column(1), &Scalar::new(&scalar))?;
             Ok(filter_record_batch(&self.batch, &filter)?)
         } else {
             Ok(self.batch.clone())
7 changes: 0 additions & 7 deletions arrow-ord/Cargo.toml
@@ -44,10 +44,3 @@ half = { version = "2.1", default-features = false, features = ["num-traits"] }

 [dev-dependencies]
 rand = { version = "0.8", default-features = false, features = ["std", "std_rng"] }
-
-[package.metadata.docs.rs]
-features = ["dyn_cmp_dict"]
-
-[features]
-dyn_cmp_dict = []
-simd = ["arrow-array/simd"]

Review comment (Contributor): Is this an API change (if someone was using arrow-ord directly)? I see that dyn_cmp_dict is still used in arrow / arrow-string.
Reply (Contributor, Author): The SIMD variants added a non-trivial amount of code complexity and behaved differently. I opted to simply remove them; we can always add back some sort of SIMD specialization for primitives in the future should users feel strongly.

Review comment (Contributor): For other kernels, as I recall, we found that properly written Rust code would be vectorized by LLVM more effectively than our hand-rolled SIMD kernels. Do you think that is still the case? cc @jhorstmann

Reply (Contributor, Author) (@tustvold, Aug 16, 2023): It is empirically not the case here; LLVM is really bad at optimising horizontal operations such as creating a packed bitmask from a comparison. The LLVM-generated kernels are ~2x slower.

Review comment (Contributor): Correct, though I feel like LLVM has gotten a little better at this. On the other hand, the SIMD kernels also did not generate "perfect" code, due to combining multiple masks into a single u64 (except for u8 comparisons, which have 64 lanes). I think the tradeoff against improved compile time is OK, and longer term this allows more code cleanups. The decimal types, for example, already had quite hacky support for SIMD. The last SIMD usages then would be the aggregation kernels.
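For readers unfamiliar with the "horizontal operation" being discussed: a comparison kernel must pack the per-lane boolean results into a packed bitmask, and it is this packing step that LLVM autovectorizes poorly. A hypothetical scalar sketch of that step (an illustration, not the arrow-rs implementation):

```rust
/// Packs 64 lane-wise equality results into a single u64 bitmask,
/// least-significant bit first. This is the "horizontal" pack step;
/// plain Rust like this is what LLVM must recognise and vectorize.
fn pack_eq_mask(lhs: &[i32; 64], rhs: &[i32; 64]) -> u64 {
    lhs.iter()
        .zip(rhs.iter())
        .enumerate()
        .fold(0u64, |mask, (i, (l, r))| mask | (((l == r) as u64) << i))
}

fn main() {
    let a = [1i32; 64];
    let mut b = [1i32; 64];
    b[3] = 0; // lanes differ only at index 3
    assert_eq!(pack_eq_mask(&a, &b), !(1u64 << 3));
}
```

Hand-written SIMD can do this pack in one or two instructions (e.g. a movemask), which is the gap the discussion above refers to.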
