Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

Gzip compression #4311

Merged
merged 18 commits into from
Oct 18, 2019
Merged

Gzip compression #4311

merged 18 commits into from
Oct 18, 2019

Conversation

j-baker
Copy link
Contributor

@j-baker j-baker commented Oct 16, 2019

Update of Murat's PR

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass before syncs for this morning.
I think this change generally makes sense, I need to look at GzipCompressingInputStream in more detail

byte[] data = new byte[1_000_000];
fillWithIncompressibleData(data);
assertThat(ByteStreams.toByteArray(LZ4.decompress((GZIP.compress(
new ByteArrayInputStream(data)))))).isEqualTo(data);
}
Copy link
Contributor

@jeremyk-91 jeremyk-91 Oct 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-actionable. The current impl allows it the other way as well. This makes sense for easy migration from one compression method to another (though from no compression is still tricky - I guess you open a new stream store if you want to do that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - i'm open to having a CompressionType which adds a single byte for no compression

import com.google.common.io.CountingInputStream;


public final class GzipCompressingInputStream {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read the impl closely here - but I'm guessing there's some reason we can't have this mostly delegate to GZIPOutputStream. Might be good to document that somewhere / how this is different

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to provide atlas with an input stream, and creating an inputstream out of an output stream is tricky (needs a thread to avoid locking up). PipedInputStream and PipedOutputStream help, but I don't want a thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of the documentation - I would be tempted to literally point out that just like, the person who thinks this is probably bogus and tries to reimplement will immediately figure out the issue :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay - yeah I was thinking of hooking them up via the piped streams, but fair enough

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the header/content/trailer logic. Looks right

import com.google.common.io.CountingInputStream;


public final class GzipCompressingInputStream {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay - yeah I was thinking of hooking them up via the piped streams, but fair enough

@j-baker j-baker merged commit 8486219 into develop Oct 18, 2019
@delete-merged-branch delete-merged-branch bot deleted the gzip_compression branch October 18, 2019 14:06
@svc-autorelease
Copy link
Collaborator

Released 0.162.3

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants