Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for zstd compressed archives #36754

Closed
wants to merge 1 commit into from

Conversation

mrueg
Copy link
Contributor

@mrueg mrueg commented Apr 1, 2018

- What I did
Added support for zstd compression (https://github.com/facebook/zstd) Issue: #28394
- How I did it
Added it to archive, which hopefully should be enough to support it. This will add a requirement on zstd to be installed on your system.
- How to verify it

  • Create a tar.zst archive: touch foo; tar -c -I zstd -f foo.tar.zst foo
  • Create a Dockerfile that includes ADD foo.zst .
  • The archive file should be uncompressed within the docker container

- Description for the changelog
Add support for zstd compressed archives

@GordonTheTurtle GordonTheTurtle added dco/no Automatically set by a bot when one of the commits lacks proper signature status/0-triage labels Apr 1, 2018
@mrueg mrueg force-pushed the 28394-zstd-support branch from 55454d7 to 98963c6 Compare April 1, 2018 00:04
@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Apr 1, 2018
@mrueg mrueg force-pushed the 28394-zstd-support branch from 98963c6 to 93d2cbb Compare April 1, 2018 00:05
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Apr 1, 2018
@mrueg mrueg force-pushed the 28394-zstd-support branch 2 times, most recently from 0193ef6 to 4846244 Compare April 1, 2018 00:40
@mrueg mrueg requested a review from vdemeester as a code owner April 1, 2018 00:40
@codecov
Copy link

codecov bot commented Apr 1, 2018

Codecov Report

Merging #36754 into master will increase coverage by 0.01%.
The diff coverage is 78.57%.

@@            Coverage Diff             @@
##           master   #36754      +/-   ##
==========================================
+ Coverage   34.93%   34.94%   +0.01%     
==========================================
  Files         614      614              
  Lines       45645    45665      +20     
==========================================
+ Hits        15944    15959      +15     
- Misses      27607    27611       +4     
- Partials     2094     2095       +1

@mrueg mrueg force-pushed the 28394-zstd-support branch 2 times, most recently from c43b856 to 2ba540c Compare April 1, 2018 00:59
@stevvooe
Copy link
Contributor

stevvooe commented Apr 2, 2018

What's the goal here? This will allow the archive package to decompress these streams, but who is pushing them and what is the impact there? This effectively creates a new layer format and we'd need a new mediatype to represent that.

@stevvooe stevvooe closed this Apr 2, 2018
@stevvooe stevvooe reopened this Apr 2, 2018
@mrueg
Copy link
Contributor Author

mrueg commented Apr 2, 2018

@stevvooe the goal of this PR is to make it possible to use enable ADD to unpack tar.zst automatically.

touch foo; tar -c -I zstd -f foo.tar.zst foo
Dockerfile:

FROM busybox
ADD foo.tar.zst /

Without this PR, the container will just include a foo.tar.zst. With the PR the tarball will be unpacked.
I want to use zstd-compressed tarballs to onboard my base containers, this helps me to simplify the Dockerfile without running multistage-builds where the first container unpacks and the second one than gets it copied over.

See the latest release for some benchmarks and advantages of zstd against others:
https://github.com/facebook/zstd/releases/tag/v1.3.4

If this adds more than I thought it would, please let me know what needs to be changed to get this functionality into it.

@@ -153,9 +160,22 @@ func TestCompressStreamBzip2Unsupported(t *testing.T) {
}
defer dest.Close()

_, err = CompressStream(dest, Xz)
_, err = CompressStream(dest, Bzip2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary for this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary for this PR, but a drive-by change since I was adding more tests. I can add it in to a separate PR if that's preferred.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do. CC me and we can get that clean up merged!

@stevvooe
Copy link
Contributor

stevvooe commented Apr 2, 2018

If this adds more than I thought it would, please let me know what needs to be changed to get this functionality into it.

Because of the way we implement layer decompression, this may allow for use of zstd in the image format.

If the purpose here is build, then I see no problem with introducing this.

cc @tonistiigi

@mrueg mrueg mentioned this pull request Apr 2, 2018
@mrueg mrueg force-pushed the 28394-zstd-support branch from 2ba540c to 71c8d37 Compare April 2, 2018 23:23
@mrueg
Copy link
Contributor Author

mrueg commented Apr 2, 2018

My purpose is build-only. Although I guess it might be worth considering it in the future for the image format as well. I split the unrelated changes into a separate PR.

@thaJeztah
Copy link
Member

I'm not really in favour of this; ADD is already quite overloaded with "magic", and this would only add more, and adding a new dependency for Docker to run.

This would only benefit adding a local archive, only to decompress it after adding to the image; could you explain the use-case a bit more? i.e. why keep a compressed archive, but using the content of the archive in the image (instead of having the files uncompressed in the build context?)

@mrueg
Copy link
Contributor Author

mrueg commented Apr 12, 2018

In general I'd expect zstd becoming a somewhat more common format in the next couple of years, it's just a catch22 to provide support for it first vs. its consumers. Tar added support recently: http://git.savannah.gnu.org/cgit/tar.git/commit/?id=3d45373d3b192d7342d49524193497598818d36d

The system I use to creates zstd-compressed tarballs (and xz or gz, but zstd has a time/size sweetspot the others don't have), which I use to bootstrap a base container with.
As mentioned above, with this change I can build a docker image like this, without having to unpack it before in the build context by adding shell scripts around it to do that or use a container including zstd with a multistage build.

FROM SCRATCH
ADD bootstrap.tar.zst /

@thaJeztah
Copy link
Member

Discussing this with other maintainers, and at this point we don't think we should add this feature. I did also discuss this with @tonistiigi, and after we fully integrate buildkit (https://github.com/moby/buildkit), adding support for custom archive formats would be possible by using a different/custom front end.

For reasons above, I'm closing this PR, but thanks for contributing, and feel free to continue the conversation 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants