GH-38452: [C++][Benchmark] Adding benchmark for LZ4/Snappy Compression #38453

Merged
merged 2 commits into from
Oct 25, 2023

Conversation

mapleFU
Member

@mapleFU mapleFU commented Oct 25, 2023

Rationale for this change

This patch adds LZ4 (LZ4_RAW in the Parquet standard) and Snappy compression/decompression benchmarks.

What changes are included in this PR?

Add groups of benchmarks.

Are these changes tested?

no

Are there any user-facing changes?

no

@mapleFU mapleFU requested a review from pitrou October 25, 2023 07:43
@github-actions

⚠️ GitHub issue #38452 has been automatically assigned in GitHub to PR creator.

@pitrou
Member

pitrou commented Oct 25, 2023

I get the following error with this PR:

/home/antoine/arrow/dev/cpp/src/arrow/result.cc:28: ValueOrDie called on an error: NotImplemented: Streaming compression unsupported with LZ4 raw format. Try using LZ4 frame format instead.

@mapleFU
Member Author

mapleFU commented Oct 25, 2023

This is my fault; I added the code without checking it.

I've re-checked the logic here. LZ4 (raw) and Snappy don't support streaming (probably the reason the original benchmarks didn't include them). I've now removed the streaming benchmarks for these two codecs.

@pitrou
Member

pitrou commented Oct 25, 2023

For the record, benchmark numbers here:

ReferenceStreamingCompression<Compression::GZIP>         170571860 ns    169802391 ns            4 bytes_per_second=47.1136M/s ratio=6.95102
ReferenceCompression<Compression::GZIP>                  171569422 ns    171399551 ns            4 bytes_per_second=46.6746M/s ratio=6.95102
ReferenceStreamingDecompression<Compression::GZIP>        12079114 ns     12075244 ns           58 bytes_per_second=662.513M/s ratio=6.95102
ReferenceDecompression<Compression::GZIP>                 11983466 ns     11926139 ns           58 bytes_per_second=670.795M/s ratio=6.95102

ReferenceStreamingCompression<Compression::BROTLI>       254898609 ns    253547102 ns            3 bytes_per_second=31.5523M/s ratio=8.31174
ReferenceCompression<Compression::BROTLI>                258946902 ns    257261268 ns            3 bytes_per_second=31.0968M/s ratio=8.31175
ReferenceStreamingDecompression<Compression::BROTLI>       8415786 ns      8343701 ns           83 bytes_per_second=958.807M/s ratio=8.31174
ReferenceDecompression<Compression::BROTLI>                7481414 ns      7443618 ns           94 bytes_per_second=1074.75M/s ratio=8.31175

ReferenceStreamingCompression<Compression::ZSTD>          17918513 ns     17833898 ns           39 bytes_per_second=448.584M/s ratio=6.876
ReferenceCompression<Compression::ZSTD>                   16608187 ns     16545738 ns           42 bytes_per_second=483.508M/s ratio=6.8771
ReferenceStreamingDecompression<Compression::ZSTD>         5673372 ns      5646730 ns          123 bytes_per_second=1.38354G/s ratio=6.876
ReferenceDecompression<Compression::ZSTD>                  5227017 ns      5202190 ns          132 bytes_per_second=1.50177G/s ratio=6.8771

ReferenceStreamingCompression<Compression::LZ4_FRAME>     14577950 ns     14570210 ns           47 bytes_per_second=549.066M/s ratio=3.52824
ReferenceCompression<Compression::LZ4_FRAME>              12065500 ns     12007357 ns           58 bytes_per_second=666.258M/s ratio=3.52824
ReferenceStreamingDecompression<Compression::LZ4_FRAME>    2008312 ns      1998786 ns          349 bytes_per_second=3.90862G/s ratio=3.52824
ReferenceDecompression<Compression::LZ4_FRAME>             1983992 ns      1972508 ns          355 bytes_per_second=3.96069G/s ratio=3.52824

ReferenceCompression<Compression::LZ4>                    11887179 ns     11835535 ns           59 bytes_per_second=675.931M/s ratio=3.53112
ReferenceDecompression<Compression::LZ4>                   1935571 ns      1924247 ns          363 bytes_per_second=4.06003G/s ratio=3.53112

ReferenceCompression<Compression::SNAPPY>                 11624693 ns     11564830 ns           61 bytes_per_second=691.752M/s ratio=3.58312
ReferenceDecompression<Compression::SNAPPY>                7049123 ns      7016010 ns          100 bytes_per_second=1.11352G/s ratio=3.58312
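
As a reading aid for the numbers above, here is a minimal sketch of how the input size and compressed size can be recovered from a row. It rests on assumptions not stated in this thread: that the `M/s` and `G/s` suffixes are binary (MiB/s, GiB/s), that the rate is derived from the second (CPU time) column, and that `ratio` means uncompressed bytes divided by compressed bytes. The helper names `BytesPerIteration` and `CompressedBytes` are hypothetical and are not part of the benchmark code.

```cpp
#include <cassert>

// Assumption: benchmark rate suffixes are binary, so 1 "M" = 1024 * 1024 bytes.
constexpr double kMiB = 1024.0 * 1024.0;

// Bytes processed per iteration, recovered from the CPU time of one
// iteration (in ns) and the reported throughput (in MiB/s).
double BytesPerIteration(double cpu_time_ns, double mib_per_second) {
  assert(cpu_time_ns > 0 && mib_per_second > 0);
  return mib_per_second * kMiB * (cpu_time_ns * 1e-9);
}

// Compressed size in bytes, assuming ratio = uncompressed / compressed.
double CompressedBytes(double uncompressed_bytes, double ratio) {
  assert(ratio > 0);
  return uncompressed_bytes / ratio;
}
```

Under those assumptions, the GZIP compression row (169802391 ns, 47.1136M/s) works out to about 8 MiB of input per iteration, and the LZ4 ratio of 3.53112 corresponds to roughly 2.27 MiB of compressed output.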

@pitrou pitrou merged commit 73589dd into apache:main Oct 25, 2023
34 of 35 checks passed
@pitrou pitrou removed the awaiting review Awaiting review label Oct 25, 2023
@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Oct 25, 2023
@wgtmac
Member

wgtmac commented Oct 25, 2023

It seems that streaming (de)compression is generally slower than its non-streaming counterpart.

@pitrou
Member

pitrou commented Oct 25, 2023

There may be some memory allocation costs for streaming (de)compression, while non-streaming is stateless.
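
To put a number on the streaming versus non-streaming gap, the sketch below computes the relative overhead for each codec from the CPU-time figures posted earlier in this thread. The struct and function names are hypothetical, and the figures come from a single run on one machine, so small differences may be noise.

```cpp
#include <array>
#include <cassert>
#include <cstdio>

// CPU times in ns, copied from the benchmark output posted in this thread.
struct TimingPair {
  const char* name;
  double streaming_ns;
  double one_shot_ns;
};

constexpr std::array<TimingPair, 8> kPairs = {{
    {"GZIP compress", 169802391, 171399551},
    {"GZIP decompress", 12075244, 11926139},
    {"BROTLI compress", 253547102, 257261268},
    {"BROTLI decompress", 8343701, 7443618},
    {"ZSTD compress", 17833898, 16545738},
    {"ZSTD decompress", 5646730, 5202190},
    {"LZ4_FRAME compress", 14570210, 12007357},
    {"LZ4_FRAME decompress", 1998786, 1972508},
}};

// Overhead of streaming relative to one-shot, in percent.
// A positive value means streaming took more CPU time.
double StreamingOverheadPercent(const TimingPair& p) {
  assert(p.one_shot_ns > 0);
  return (p.streaming_ns / p.one_shot_ns - 1.0) * 100.0;
}

// Number of pairs in which streaming was slower than one-shot.
int CountStreamingSlower() {
  int slower = 0;
  for (const auto& p : kPairs) {
    if (StreamingOverheadPercent(p) > 0.0) ++slower;
  }
  return slower;
}

// Prints one line per pair, e.g. the name followed by a signed percentage.
void PrintOverheads() {
  for (const auto& p : kPairs) {
    std::printf("%-22s %+6.2f%%\n", p.name, StreamingOverheadPercent(p));
  }
}
```

Calling `PrintOverheads()` from a `main` shows streaming slower in six of the eight pairs, with the largest gap (about 21%) for LZ4_FRAME compression. GZIP and BROTLI compression are the two exceptions, where streaming is about 1% faster.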

@mapleFU mapleFU deleted the benchmark/catch-up-for-compression branch October 25, 2023 19:54
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 25, 2023
…ression (apache#38453)

### Rationale for this change

This patch adds LZ4 (LZ4_RAW in the Parquet standard) and Snappy compression/decompression benchmarks.

### What changes are included in this PR?

Add groups of benchmarks.

### Are these changes tested?

no

### Are there any user-facing changes?

no

* Closes: apache#38452

Authored-by: mwish <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 73589dd.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024