
feat(rust, python): Add Run-length Encoding functions #9826

Merged · 9 commits · Jul 12, 2023

Conversation

@magarick (Contributor) commented on Jul 12, 2023:

Adds rle and rle_id.
Closes #9328
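For context, a standalone conceptual sketch (not the polars implementation or API, which operates on Series): rle collapses consecutive equal values into run lengths and values, while rle_id labels every element with the index of the run it belongs to.

    // Conceptual illustration of run-length encoding on a plain slice.
    fn rle<T: PartialEq + Copy>(values: &[T]) -> Vec<(u32, T)> {
        let mut runs: Vec<(u32, T)> = Vec::new();
        for &v in values {
            match runs.last_mut() {
                // Extend the current run while the value stays the same.
                Some((len, last)) if *last == v => *len += 1,
                // Otherwise a new run starts.
                _ => runs.push((1, v)),
            }
        }
        runs
    }

    fn main() {
        let xs = [1, 1, 2, 2, 2, 3];
        // Run lengths and values: [(2, 1), (3, 2), (1, 3)].
        assert_eq!(rle(&xs), vec![(2, 1), (3, 2), (1, 3)]);
        // rle_id would label each element with its run index: [0, 0, 1, 1, 1, 2].
    }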

@github-actions bot added the enhancement, python, and rust labels on Jul 12, 2023.
"""
return self._from_pyexpr(self._pyexpr.rle())

def rleid(self) -> Self:
Member:
Following our naming convention this should be rle_id.

Contributor Author:
So much extra typing! But ok.

Member:
Tiny violin ;)

@cmdlineluser (Contributor):
I wasn't aware of the actual name until someone let me know in the replies, but I believe this will close: #9328

+1

pub fn rle_id(s: &Series) -> PolarsResult<Series> {
let (s1, s2) = (s.slice(0, s.len() - 1), s.slice(1, s.len()));
// Run numbers start at zero
Ok(std::iter::once(false)
Member:
We can speed this up by not using chains and flattens, but by extending directly into a preallocated Vec.

This saves a lot of branches and ensures we only allocate once.
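As a standalone sketch of the suggestion (generic code, not the PR's actual implementation): extending into a Vec that is sized up front avoids the per-element branching of a chained, flattened iterator and guarantees a single allocation.

    // Collecting through chain + flatten: the iterator cannot report an exact
    // length up front, so the Vec may grow (and reallocate) while collecting.
    fn chained(groups: &[Vec<bool>]) -> Vec<u32> {
        std::iter::once(false)
            .chain(groups.iter().flatten().copied())
            .map(|v| v as u32)
            .collect()
    }

    // One allocation of the final size, then plain extends per chunk.
    fn preallocated(groups: &[Vec<bool>], total_len: usize) -> Vec<u32> {
        let mut out = Vec::with_capacity(total_len + 1);
        out.push(0u32);
        for g in groups {
            out.extend(g.iter().map(|&v| v as u32));
        }
        out
    }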

Contributor Author:
I thought the iterators would be efficient, but I guess you have to allocate somewhere. And your version is much simpler to read.

Member:
The chain iterator is forced to branch on both iterators. This branch can block other optimizations. And the iterator's length becomes unknown, which leads to wrong allocation sizes and memcopies on realloc.
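A quick standalone way to see the length problem (an illustration, not code from the PR): the size hint of a flattened iterator tells the collector almost nothing, while a plain slice iterator reports its exact length.

    fn main() {
        let chunks = vec![vec![true, false], vec![true]];

        // Flattening hides the total length: the lower bound is 0 and the
        // upper bound is unknown, so `collect` cannot size the Vec up front.
        let flat = chunks.iter().flatten();
        assert_eq!(flat.size_hint(), (0, None));

        // A plain slice iterator knows its exact length.
        let exact = [true, false, true].iter();
        assert_eq!(exact.size_hint(), (3, Some(3)));
    }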

Contributor Author:
Oh, gross. I didn't know that. I guess iterators aren't as "free" as they're made out to be.

@ritchie46 (Member):
Thank you @magarick. I love the functionality and the PR looks good. I have a comment on the implementation of rle_id. That can be faster.

s_neq
.into_iter()
.enumerate()
.for_each(|(i, v)| out.push(out[i] + v.unwrap() as u32));
@ritchie46 (Member), Jul 12, 2023:
Can we keep a latest_value here? out[i] requires keeping a register for i, and we do a bound check on out. The bound check may be elided by the compiler, but if we write it differently we are sure there is no bound check.

A second optimization: we can iterate over the arrays in s_neq by calling downcast_iter. That gives us BooleanArray, from which we can directly use values_iter. This saves a few branches in the iterator and the unwrap itself.

Contributor Author:
OK, I tried something like that. It now dumps the inequality checks directly into the vector and does a cumsum in place.
It always feels a little awkward to iterate over each chunk but I guess that's what's really underneath so that's what has to be done.
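A standalone sketch of that two-pass shape, assuming a plain slice of inequality flags rather than the PR's chunked arrays: first dump the flags into the output, then turn them into run ids with an in-place cumulative sum.

    fn rle_id_two_pass(neq: &[bool]) -> Vec<u32> {
        // Pass 1: the run id of the first element is 0, then the raw
        // "value changed here" flags follow.
        let mut out = Vec::with_capacity(neq.len() + 1);
        out.push(0u32);
        out.extend(neq.iter().map(|&v| v as u32));

        // Pass 2: an in-place cumulative sum turns change flags into run ids.
        out.iter_mut().fold(0u32, |acc, x| {
            *x += acc;
            *x
        });
        out
    }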

Member:
> It always feels a little awkward to iterate over each chunk but I guess that's what's really underneath so that's what has to be done.

Yes, then the code is closer to what we have in memory and has to go through fewer abstractions between the get_value here and the write_value there.

.for_each(|(i, v)| out.push(out[i] + v.unwrap() as u32));
.downcast_iter()
.for_each(|a| out.extend(a.values_iter().map(|v| v as u32)));
out.iter_mut().fold(0, |a, x| {
@ritchie46 (Member), Jul 12, 2023:
Almost there. :)

I think we can do this in a single pass. Something like this:

 
   // keep track of the last written value
   let mut last_value = 0u32;

   s_neq.downcast_iter().for_each(|a| {
       for v in a.values_iter() {
           // a `true` flag starts a new run, so the run id increases by one
           last_value += v as u32;
           out.push(last_value);
       }
   });

Contributor Author:
And I thought I was into micro-optimizations :-)
Ideally we could do everything, even the comparisons, in a single pass and only store what we need but I didn't see a clear way to do that.

Member:
> Ideally we could do everything, even the comparisons, in a single pass and only store what we need but I didn't see a clear way to do that.

Yes, that would require us to go down into the known types with some generics. We could follow up with that. The benefit of this implementation is that it has little compiler bloat.

> And I thought I was into micro-optimizations :-)

Haha, I have put a lot of time into ensuring we deliver what we advocate: a fast dataframe library. I cannot unsee the potential branches, cache misses, and allocations ^^
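As an aside, a hypothetical sketch (not part of this PR) of what that single-pass, type-aware follow-up could look like on a plain slice; each concrete type gets its own monomorphized copy, which is the compiler-bloat trade-off mentioned above.

    // Hypothetical single-pass run-id computation over a plain slice: the
    // neighbour comparison and the id accumulation happen in one loop, so no
    // intermediate boolean buffer is materialized.
    fn rle_id_single_pass<T: PartialEq>(values: &[T]) -> Vec<u32> {
        let mut out = Vec::with_capacity(values.len());
        let mut id = 0u32;
        for (i, v) in values.iter().enumerate() {
            if i > 0 && values[i - 1] != *v {
                id += 1; // a change in value starts a new run
            }
            out.push(id);
        }
        out
    }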

@ritchie46 (Member):
Great functionality to have @magarick, and thanks for quickly iterating on the perf improvements. 🙌

@ritchie46 ritchie46 merged commit b87ff01 into pola-rs:main Jul 12, 2023
@magarick magarick deleted the rle branch July 12, 2023 20:28
@alexander-beedie (Collaborator):
Nice one :)

c-peters pushed a commit to c-peters/polars that referenced this pull request Jul 14, 2023
Labels: enhancement, python, rust
Linked issue: pl.Expr.streaks() (simplify identifying consecutive same value sequences)
4 participants