This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Optimized null_count #442

Merged
merged 4 commits into jorgecarleitao:main on Sep 26, 2021

Conversation

@ritchie46
Collaborator

Every time we slice an array, we count the null values. This PR does a small optimization so that we only count the null values of the smallest chunk of memory.
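
A minimal sketch of the idea (my reading of the PR, not its exact code; `count_zeros` is a naive stand-in helper with a hypothetical name, and the real implementation lives in src/bitmap/immutable.rs): when the slice keeps more than half of the bits, count the nulls in the two trimmed-off ends instead, and subtract them from the already-known total.

    // Naive stand-in (hypothetical name): count unset bits among the
    // `len` bits starting at bit `offset`.
    fn count_zeros(bytes: &[u8], offset: usize, len: usize) -> usize {
        (offset..offset + len)
            .filter(|&i| bytes[i / 8] & (1u8 << (i % 8)) == 0)
            .count()
    }

    // Sketch of the optimization: count whichever region is smaller.
    fn sliced_null_count(
        bytes: &[u8],
        offset: usize,       // bit offset of the original bitmap
        length: usize,       // bit length of the original bitmap
        null_count: usize,   // already-known null count of the original bitmap
        slice_offset: usize, // requested slice start, relative to `offset`
        slice_length: usize, // requested slice length
    ) -> usize {
        if slice_length <= length / 2 {
            // the slice itself is the smaller region: count it directly
            count_zeros(bytes, offset + slice_offset, slice_length)
        } else {
            // the slice is the larger region: count the two trimmed-off ends
            // (less memory to touch) and subtract from the known total
            let head = count_zeros(bytes, offset, slice_offset);
            let tail = count_zeros(
                bytes,
                offset + slice_offset + slice_length,
                length - slice_offset - slice_length,
            );
            null_count - head - tail
        }
    }

Either way, at most half of the bitmap's memory is touched, instead of up to all of it.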

@codecov

codecov bot commented Sep 23, 2021

Codecov Report

Merging #442 (2a6093d) into main (235b7f5) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##             main     #442   +/-   ##
=======================================
  Coverage   79.87%   79.87%           
=======================================
  Files         371      371           
  Lines       22753    22758    +5     
=======================================
+ Hits        18174    18178    +4     
- Misses       4579     4580    +1     
Impacted Files                    Coverage Δ
src/bitmap/immutable.rs           86.11% <100.00%> (+1.03%) ⬆️
src/compute/arithmetics/time.rs   44.89% <0.00%> (-2.05%) ⬇️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao
Owner

Cool idea!

I think it is worth benchmarking: we now have 2 (smaller) iterations instead of one. I can PR a bench for slicing bitmaps.

@ritchie46
Collaborator Author

I think it is worth benchmarking: we now have 2 (smaller) iterations instead of one. I can PR a bench for slicing bitmaps.

Yes, I was wondering this too. Perhaps there is some extra instruction-level parallelism, since the two counts are independent. But you are right, let's benchmark.

@ritchie46
Collaborator Author

ritchie46 commented Sep 23, 2021

I ran this benchmark:

        // `size = 2^log2_size`; `bitmap` is pre-built with nulls spread over it
        let offset = ((size as f64) * 0.2) as usize;
        let len = ((size as f64) * 0.55) as usize;

        c.bench_function(&format!("bitmap_count_zeros {}", log2_size), |b| {
            b.iter(|| {
                // slice [20%, 75%) of the bitmap and count its nulls
                let r = bitmap.clone().slice(offset, len);
                assert!(r.null_count() > 0);
            })
        });
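
For context, a plausible setup for this bench (my assumption, not taken from the PR; it relies on `Bitmap` being collectable from an iterator of booleans):

    use arrow2::bitmap::Bitmap;
    use criterion::Criterion;

    let log2_size = 16u32;
    let size = 1usize << log2_size; // size = 2^log2_size
    // an unset bit (i.e. a null) at every third position
    let bitmap: Bitmap = (0..size).map(|i| i % 3 != 0).collect();
    let mut c = Criterion::default();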

Worst-case side of the spectrum

The slice leaves a smaller chunk at both the start and the end of the array. We slice only 55%, which is close to the worst case (a 51% slice would be the true worst case, since its 49% complement is barely smaller). The result would be better if we sliced bigger chunks. It seems the optimization makes a difference once the data no longer fits in cache.

bitmap_count_zeros 10   time:   [31.303 ns 31.316 ns 31.329 ns]                                   
                        change: [+2.2863% +3.3347% +4.2056%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high severe

bitmap_count_zeros 12   time:   [46.442 ns 46.496 ns 46.555 ns]                                   
                        change: [+13.943% +14.534% +15.153%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

bitmap_count_zeros 14   time:   [91.293 ns 91.322 ns 91.370 ns]                                  
                        change: [-14.881% -14.622% -14.426%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 16   time:   [304.45 ns 305.34 ns 306.20 ns]                                  
                        change: [-14.044% -13.760% -13.461%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 21 outliers among 100 measurements (21.00%)
  8 (8.00%) low severe
  1 (1.00%) low mild
  8 (8.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 18   time:   [1.1123 us 1.1126 us 1.1130 us]                                   
                        change: [-24.429% -23.034% -21.759%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 20   time:   [4.3841 us 4.3865 us 4.3888 us]                                   
                        change: [-19.887% -19.629% -19.434%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Best-case side of the spectrum

Here we slice a large chunk (85% of the array), so only the remaining 15% needs to be counted.

bitmap_count_zeros 10   time:   [34.895 ns 34.906 ns 34.918 ns]                                   
                        change: [+9.0526% +9.3320% +9.5184%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

bitmap_count_zeros 12   time:   [37.946 ns 37.990 ns 38.037 ns]                                   
                        change: [-27.212% -26.650% -26.237%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  7 (7.00%) low mild
  7 (7.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 14   time:   [55.512 ns 55.536 ns 55.567 ns]                                  
                        change: [-63.246% -63.169% -63.098%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

bitmap_count_zeros 16   time:   [121.03 ns 121.45 ns 122.02 ns]                                  
                        change: [-79.464% -79.020% -78.621%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) low severe
  7 (7.00%) low mild
  5 (5.00%) high mild
  5 (5.00%) high severe

bitmap_count_zeros 18   time:   [399.81 ns 399.89 ns 400.01 ns]                                  
                        change: [-82.211% -82.129% -82.045%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  9 (9.00%) high severe

bitmap_count_zeros 20   time:   [1.4925 us 1.4930 us 1.4937 us]                                   
                        change: [-82.449% -82.356% -82.301%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

@jorgecarleitao changed the title from "small null_count optimization" to "Optimized null_count" on Sep 23, 2021
@jorgecarleitao added the enhancement (An improvement to an existing feature) label on Sep 23, 2021
@jorgecarleitao
Owner

Awesome; very good result.

It may be worth committing the bench you used, in case someone else would like to use it for future improvements.

@ritchie46
Collaborator Author

It may be worth committing the bench you used, in case someone else would like to use it for future improvements.

Added 👍

@jorgecarleitao
Owner

Could you resolve the conflict?

@ritchie46
Collaborator Author

Could you resolve the conflict?

Good to go.

@jorgecarleitao jorgecarleitao merged commit e27ff27 into jorgecarleitao:main Sep 26, 2021