Specialize interleave string ~2-3x faster #2944

tustvold · 2022-10-26T23:39:30Z

Which issue does this PR close?

Part of #2864

Rationale for this change

interleave str(20, 0.0) 100 [0..100, 100..230, 450..1000]
                        time:   [1.0792 µs 1.0795 µs 1.0799 µs]
                        change: [-60.193% -60.122% -60.019%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

interleave str(20, 0.0) 400 [0..100, 100..230, 450..1000]
                        time:   [3.2179 µs 3.2187 µs 3.2197 µs]
                        change: [-60.739% -60.622% -60.497%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000]
                        time:   [7.4057 µs 7.4079 µs 7.4101 µs]
                        change: [-60.606% -60.542% -60.442%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

interleave str(20, 0.0) 1024 [0..100, 100..230, 450..1000, 0..1000]
                        time:   [7.4336 µs 7.4369 µs 7.4403 µs]
                        change: [-58.701% -58.630% -58.525%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

interleave str(20, 0.5) 100 [0..100, 100..230, 450..1000]
                        time:   [1.6810 µs 1.6815 µs 1.6821 µs]
                        change: [-58.572% -58.434% -58.308%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

interleave str(20, 0.5) 400 [0..100, 100..230, 450..1000]
                        time:   [5.2158 µs 5.2246 µs 5.2341 µs]
                        change: [-64.090% -63.972% -63.864%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000]
                        time:   [16.993 µs 17.013 µs 17.032 µs]
                        change: [-54.202% -54.077% -53.991%] (p = 0.00 < 0.05)
                        Performance has improved.

interleave str(20, 0.5) 1024 [0..100, 100..230, 450..1000, 0..1000]
                        time:   [17.424 µs 17.443 µs 17.464 µs]
                        change: [-52.353% -52.256% -52.128%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

What changes are included in this PR?

Adds a specialized implementation for string arrays

Are there any user-facing changes?

No

tustvold · 2022-10-27T05:21:30Z

With #2947 this will be generalizable to also cover byte arrays

alamb

LGTM -- thank you

alamb · 2022-10-27T17:17:38Z

arrow-array/src/array/string_array.rs

-            (*end - *start).to_usize().unwrap(),
-        );
+        let slice =
+            std::slice::from_raw_parts(self.value_data.as_ptr().add(start), end - start);


This is a drive-by cleanup, right?

alamb · 2022-10-27T17:18:44Z

arrow-select/src/interleave.rs

        _ => interleave_fallback(values, indices)
    }
 }

+/// Common functionality for interleaving arrays
+struct Interleave<'a, T> {


Some comments might help here, specifically on what null_count and nulls represent and what the generic T is used for.

alamb · 2022-10-27T17:19:46Z

arrow-select/src/interleave.rs

-            as_primitive_array::<T>(*x)
-        })
-        .collect();
+    let interleaved = Interleave::<'_, PrimitiveArray<T>>::new(values, indices);


alamb · 2022-10-27T17:22:45Z

arrow-select/src/interleave.rs

+    offsets.append(O::from_usize(0).unwrap());
+    for (a, b) in indices {
+        let o = interleaved.arrays[*a].value_offsets();
+        let len = o[*b + 1].as_usize() - o[*b].as_usize();


Suggested change

-        let len = o[*b + 1].as_usize() - o[*b].as_usize();
+        // element length
+        let len = o[*b + 1].as_usize() - o[*b].as_usize();

alamb · 2022-10-27T17:23:02Z

arrow-select/src/interleave.rs

+    let mut capacity = 0;
+    let mut offsets = BufferBuilder::<O>::new(indices.len() + 1);
+    offsets.append(O::from_usize(0).unwrap());
+    for (a, b) in indices {


this is clever -- do the offsets in one pass and the strings in another
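For readers following along, the two-pass idea noted in this comment can be sketched roughly as below. This is a simplified illustration, not the code from this PR: `StrColumn` and `interleave_strs` are hypothetical names, plain `Vec`s stand in for arrow buffers, offsets are `usize` rather than a generic offset type, and null handling is left out entirely.

```rust
/// Hypothetical stand-in for a string array: one contiguous byte buffer
/// plus `len + 1` offsets into it. Not the arrow-rs API.
struct StrColumn {
    offsets: Vec<usize>,
    data: Vec<u8>,
}

impl StrColumn {
    /// Build a column from string slices (helper for the example only).
    fn from_strs(values: &[&str]) -> Self {
        let mut offsets = Vec::with_capacity(values.len() + 1);
        offsets.push(0);
        let mut data = Vec::new();
        for v in values {
            data.extend_from_slice(v.as_bytes());
            offsets.push(data.len());
        }
        Self { offsets, data }
    }

    /// Number of strings stored in the column.
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Borrow the `i`-th string.
    fn value(&self, i: usize) -> &str {
        let bytes = &self.data[self.offsets[i]..self.offsets[i + 1]];
        std::str::from_utf8(bytes).expect("column holds valid UTF-8")
    }
}

/// Interleave strings from several columns. Each `(a, b)` in `indices`
/// selects element `b` of `columns[a]`.
fn interleave_strs(columns: &[&StrColumn], indices: &[(usize, usize)]) -> StrColumn {
    // Pass 1: compute the output offsets and the total byte capacity
    // without touching any string data.
    let mut capacity = 0;
    let mut offsets = Vec::with_capacity(indices.len() + 1);
    offsets.push(0);
    for &(a, b) in indices {
        let o = &columns[a].offsets;
        // element length
        let len = o[b + 1] - o[b];
        capacity += len;
        offsets.push(capacity);
    }

    // Pass 2: allocate the value buffer once, then copy each string's bytes.
    let mut data = Vec::with_capacity(capacity);
    for &(a, b) in indices {
        let c = columns[a];
        data.extend_from_slice(&c.data[c.offsets[b]..c.offsets[b + 1]]);
    }

    StrColumn { offsets, data }
}
```

Because the first pass already yields the total byte length, the value buffer in the second pass is sized exactly once and never reallocates while the strings are copied.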

ursabot · 2022-10-27T19:42:01Z

Benchmark runs are scheduled for baseline = d625f0a and contender = 66ea66b. 66ea66b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

tustvold added 2 commits October 27, 2022 11:44

Add interleave string benchmark

cf134af

Specialize interleave strings (apache#2864)

8d7bec2

github-actions bot added the arrow Changes to the arrow crate label Oct 26, 2022

tustvold added a commit to tustvold/arrow-rs that referenced this pull request Oct 27, 2022

Specialize interleave dictionary (apache#2944)

6c13c89

alamb approved these changes Oct 27, 2022


Review feedback

2919914

tustvold merged commit 66ea66b into apache:master Oct 27, 2022

tustvold added a commit to tustvold/arrow-rs that referenced this pull request Oct 28, 2022

Specialize interleave dictionary (apache#2944)

61544a5

alamb mentioned this pull request Nov 25, 2022

Specialized Interleave Kernel #2864

Closed
