Support snappy de/compress_raw_into #40
Conversation
```rust
pub fn decompress_vec(&mut self, input: &[u8]) -> Result<Vec<u8>> {
    let mut buf = vec![0; decompress_len(input)?];
    let n = self.decompress(input, &mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```

It seems like you can improve a little if you know the exact length, since you don't need to calculate it nor truncate.
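The "exact length" point above generalizes beyond snappy. As a rough illustration using Python's stdlib (zlib stands in for snappy here, so this is an analogy rather than the actual cramjam code path): when the exact decompressed size is known up front, it can be passed as `bufsize` so the output buffer is allocated once at the right size instead of grown and copied.

```python
import zlib

original = b"oh what a beautiful day" * 100
compressed = zlib.compress(original)

# Default: zlib starts from an internal default buffer size and may
# resize the output while decompressing.
out_default = zlib.decompress(compressed)

# If the exact output length is known, pass it as bufsize so the
# output buffer is allocated once at the correct size.
out_exact = zlib.decompress(compressed, bufsize=len(original))

assert out_default == out_exact == original
```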
Truncating is a trivial cost (the main cost, the allocation, has already occurred), and decompress_len seems equally cheap. Maybe you'd be interested in this issue thread, and specifically this comment by the maintainer of snap. I'd guess if you have a compelling case for better...
For my usual test (...), Snappy is the holdout, and some of the benchmarks show this too (fireworks, paper100k). Is there any way to find out why the difference? I will comment on that thread; hope to clear things up.
In particular, why the massive difference between framed and not framed? Is rust-snap deciding not to compress the data in this case?
I believe the reason is, given the current underlying API, there is a new allocation for the raw de/compression output. Whereas with framed, it accepts Read/Write objects via the read/write modules respectively; there, we can do a single allocation for the output right into Python. I think that's where the extra cost is coming from.
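The "single allocation right into Python" idea can be sketched with stdlib pieces. The buffer handling, not the codec, is the point, so zlib stands in for snappy below: preallocate the output once and copy decompressed chunks in place through a memoryview, rather than building an intermediate object and copying it afterwards.

```python
import zlib

original = b"fireworks and paper100k stand-ins " * 50
compressed = zlib.compress(original)

# One allocation for the whole output; each chunk lands in it directly.
out = bytearray(len(original))
view = memoryview(out)

d = zlib.decompressobj()
pos = 0
for i in range(0, len(compressed), 1024):  # feed the stream in chunks
    chunk = d.decompress(compressed[i:i + 1024])
    view[pos:pos + len(chunk)] = chunk
    pos += len(chunk)
chunk = d.flush()
view[pos:pos + len(chunk)] = chunk
pos += len(chunk)

assert bytes(out[:pos]) == original
```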
^ can we turn this into a concrete feature request at rust-snap? |
Alrighty, given the suggestions there, I think we got some slight improvement; only it admittedly doesn't support sending in the ... When integrating your standard 'oh beautiful day' test data, I still noted python-snappy is significantly faster; so I made some random test data, and the comparison is much closer. 🤔 But wow, that framed format comparison is decisive... too bad you weren't wanting to switch from using that in python-snappy. 😜 I'm not really sure what else can be done here.
Well, raw is still faster for both libraries. I still find it hard to believe that cramjam is slower than py-snap for raw but faster for framed, and not just by a small amount. I might go ahead and do the work to swap out the compressions in fastparquet for cramjam, except snappy and lzo (#41). I haven't looked at the new benchmarks yet, so maybe snappy too.
I guess even if the speed isn't what you're looking for right now, users would still get the benefit of an easier install at least, which was the driving motivation for this project to begin with: not necessarily faster than other libs, but at least competitive. With that said, I'm still very interested in pursuing making things faster. 👍
I wonder, do you have an inkling of where any further improvement might come from? |
I tried statically building/linking against the original implementation, from both the C++ and C APIs, and both the framed and raw formats were slightly slower; so yeah, at a loss right now. Maybe the official distribution uses more optimized compilation? In short, I'm out of ideas for the moment, but it's a fun puzzle. 😅
Part of #33.
New functions:
- `cramjam.snappy.compress_raw_into`
- `cramjam.snappy.decompress_raw_into`
- `cramjam.snappy.compress_raw_max_len`
- `cramjam.snappy.decompress_raw_len`
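A rough sketch of how these `*_into`-style calls are typically used. The cramjam names above come from this PR, but the signatures below are assumptions, and zlib stands in for the actual snappy codec so the sketch runs anywhere: query the worst-case output size, allocate a caller-owned buffer once, then ask the codec to write into it and report the byte count.

```python
import zlib

def compress_raw_max_len(data: bytes) -> int:
    # Stand-in for a worst-case-size query like compress_raw_max_len:
    # zlib's worst case is roughly the input size plus a small overhead.
    return len(data) + (len(data) >> 10) + 64

def compress_raw_into(data: bytes, output: bytearray) -> int:
    # Stand-in for an "into" call: write compressed bytes into a
    # caller-owned buffer and return the number of bytes written.
    compressed = zlib.compress(data)
    n = len(compressed)
    output[:n] = compressed
    return n

data = b"oh what a beautiful day" * 64
buf = bytearray(compress_raw_max_len(data))  # allocate once, up front
n = compress_raw_into(data, buf)

assert zlib.decompress(bytes(buf[:n])) == data
```

The appeal of this shape is that the buffer allocation happens on the Python side exactly once, which is the same cost structure the framed path was already getting above.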