
Support snappy de/compress_raw_into #40

Merged: 2 commits from snappy-raw-into into master, Feb 25, 2021

Conversation

@milesgranger (Owner)

Part of #33.

New functions:
cramjam.snappy.compress_raw_into
cramjam.snappy.decompress_raw_into
cramjam.snappy.compress_raw_max_len
cramjam.snappy.decompress_raw_len

@martindurant

    pub fn decompress_vec(&mut self, input: &[u8]) -> Result<Vec<u8>> {
        let mut buf = vec![0; decompress_len(input)?];
        let n = self.decompress(input, &mut buf)?;
        buf.truncate(n);
        Ok(buf)
    }

It seems like you could improve a little if you know the exact length, since you wouldn't need to calculate it or truncate.

@milesgranger (Owner, Author)

milesgranger commented Feb 25, 2021

Truncating is a trivial cost (the main allocation has already happened), and decompress_len seems equally cheap.

Maybe you'd be interested in this issue thread, and specifically this comment by the maintainer of snap. If you have a compelling case for better raw support, one could bring it up with that lib, and I'd be happy to expose it to Python. 👍
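For context on why decompress_len is so cheap: the raw snappy format starts with the uncompressed length encoded as a little-endian varint, so computing it is just a few header-byte reads. A rough pure-Python sketch (the function name is hypothetical; this is not cramjam's or snap's code):

```python
def snappy_decompress_len(raw: bytes) -> int:
    """Read the uncompressed length from a raw snappy stream.

    The raw format begins with the uncompressed length as a
    little-endian varint, so this never touches the compressed
    payload that follows the preamble.
    """
    result = shift = 0
    for byte in raw[:5]:  # a u32 length fits in at most 5 varint bytes
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result
        shift += 7
    raise ValueError("invalid or truncated varint preamble")
```

So the cost of `decompress_len` is bounded by parsing at most five bytes, regardless of payload size.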

@martindurant

For my usual test (b"oh what a beautiful morning, oh what a beautiful day!!" * 5000000), I am finding cramjam.snappy.decompress_raw at 331 ms versus snappy.decompress at 152 ms, whether with bytes or bytearray :(

Snappy is the holdout, and some of the benchmarks show this too (fireworks, paper100k). Is there any way to find out why the difference?

I will comment on that thread, hope to clear things up.

@martindurant

> Snappy is the holdout, and some of the benchmarks show this too (fireworks, paper100k). Is there any way to find out why the difference?

In particular, why the massive difference between framed and not framed? Is rust-snap deciding not to compress the data in this case?

@milesgranger (Owner, Author)

I believe the reason is that, with the current underlying API, raw de/compression outputs into a new allocation of its own. Framed de/compression, on the other hand, accepts Read/Write objects (from the read and write modules, respectively), so there we can do a single allocation for the output right into Python. I think that's where the extra cost is coming from.
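The allocation difference described here can be sketched with a toy Python analogy (purely illustrative; these are not the real bindings):

```python
def raw_style(decompressed: bytes) -> bytearray:
    # the library fills its own freshly allocated output buffer ...
    lib_owned = bytes(decompressed)
    # ... which then has to be copied into a Python-owned object
    return bytearray(lib_owned)


def framed_style(decompressed: bytes, out: bytearray) -> int:
    # with Read/Write objects the output can land directly in a
    # caller-provided Python buffer: one allocation, no extra copy
    out[: len(decompressed)] = decompressed
    return len(decompressed)
```

The raw path pays for one extra buffer plus a full memcpy of the output; the framed path pays only for the buffer the caller already owns.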

@martindurant

^ can we turn this into a concrete feature request at rust-snap?

@milesgranger (Owner, Author)

Alrighty, given the suggestions there, I think we got some slight improvement. Admittedly it doesn't support sending in the output_len, but we know the decompressed length anyway, and for bytearray we can make the one allocation to Python and resize it after compression; so the benchmarks are even closer now on the raw side of things.
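The allocate-then-resize pattern described here can be sketched in Python (the `compress` callable is a hypothetical stand-in, not cramjam's API):

```python
def compress_into_sketch(compress, data: bytes) -> bytearray:
    """Toy model of 'allocate at max_len, write into it, resize'.

    `compress(src, dst)` is a hypothetical stand-in for any compressor
    that fills `dst` and returns the number of bytes written.
    """
    # snappy documents a worst-case output size of 32 + n + n/6 bytes
    max_len = 32 + len(data) + len(data) // 6
    buf = bytearray(max_len)  # the single Python-owned allocation
    n = compress(data, buf)   # compress directly into that buffer
    del buf[n:]               # resize in place; no second output buffer
    return buf
```

Because the buffer is Python-owned from the start, the shrink at the end replaces both the extra library-side allocation and the copy back into Python.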

When integrating your standard 'oh beautiful day' test data, I still noted that python-snappy is significantly faster; so I made some random test data, and the comparison is much closer. 🤔

But wow, that framed format comparison is decisive... too bad you weren't wanting to switch from using that in python-snappy. 😜

I'm not really sure what else can be done here.

@martindurant

> But wow, that framed format comparison is decisive... too bad you weren't wanting to switch from using that in python-snappy. 😜

Well, raw is still faster for both libraries. I still find it hard to believe that cramjam is slower than py-snap for raw but faster for framed, and not just by a small amount.

I might go ahead and do the work to swap out the compressions in fastparquet for cramjam, except snappy and lzo (#41). I haven't looked at the new benchmarks yet, so maybe snappy too.

@milesgranger milesgranger merged commit 18f6918 into master Feb 25, 2021
@milesgranger milesgranger deleted the snappy-raw-into branch February 25, 2021 20:59
@milesgranger (Owner, Author)

I guess even if the speed isn't what you're looking for right now, users would still get the benefit of an easier install at least, which was the driving motivation for this project to begin with: not necessarily faster than other libs, but at least competitive.

With that said, I'm still very interested in pursuing making things faster. 👍

@martindurant

> I'm still very interested in pursuing making things faster.

I wonder, do you have an inkling of where any further improvement might come from?

@milesgranger (Owner, Author)

milesgranger commented Mar 11, 2021

I tried statically building/linking against the original implementation, through both the C++ and C APIs, and both the framed and raw formats were slightly slower, so yeah, I'm at a loss right now. Maybe the official distribution has a more optimized compilation?

In short, I'm out of ideas for the moment, but it's a fun puzzle. 😅
