Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

media-types: Define layer media types with and without '+gzip' #388

Merged
merged 3 commits into from
Jan 25, 2017

Conversation

wking
Copy link
Contributor

@wking wking commented Oct 15, 2016

This is a version of #332 that doesn't define a structured syntax suffix (and just defines layer types with and without +gzip). I prefer #332, because it automatically provides compressed and uncompressed versions of any existing media type and is exactly the sort of thing that the RFC 6839 approach is useful for. But for reasons that I still don't understand, the maintainer consensus seems to be against involving structured syntax suffixes.

git diff b9173d5..581cc54 shows the current tips of this PR and #332 are largely the same, with this PR being a bit wordier in places that list media types (because it has to list both the bare and +gzip forms) and #332 being a bit wordier where it defines the structured syntax suffix. Currently this PR introduces fewer new words than #332, but I expect that as the number of potentially-compressed media types grows this approach will become more tedious. But we may not see further potentially-compressed media types, and if we do we can always return to the structured syntax suffix approach later.

@jonboulle
Copy link
Contributor

jonboulle commented Oct 17, 2016

LGTM, but I think you have others to convince

Approved with PullApprove

@philips
Copy link
Contributor

philips commented Oct 19, 2016

This Looks Good To Me but, I will defer to those with stronger opinions from #332.

@wking
Copy link
Contributor Author

wking commented Oct 21, 2016

Rebased around #317 and #337 with 581cc54fd97198.

@vbatts
Copy link
Member

vbatts commented Nov 30, 2016

hmmm. I'm more inclined for this one. It would shift around media type variable names for some. And there are no MUSTs around implementations supporting all (though that really seems like not a huge deal to support a media type difference like this).

@@ -28,7 +28,8 @@ Changing it means creating a new derived image, instead of changing the existing
A layer DiffID is a SHA256 digest over the layer's uncompressed tar archive and serialized in the descriptor digest format, e.g., `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
Layers must be packed and unpacked reproducibly to avoid changing the layer DiffID, for example by using tar-split to save the tar headers.

NOTE: the DiffID is different than the digest in the manifest list because the manifest digest is taken over the gzipped layer for `application/vnd.oci.image.layer.v1.tar+gzip` types.
The difference between DiffIDs and the layer digests in the [manifest's `layers`](manifest.md#image-manifest-property-descriptions) is that the layer digest is taken over the blob regardless of compression, while the DiffID is taken after removing any compression.
For an `application/vnd.oci.image.layer.tar+gzip` layer, the layer digest is taken over the `application/vnd.oci.image.layer.tar+gzip` content, while the DiffID is take over the `application/vnd.oci.image.layer.tar` content.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not always be a given. A layer digest may be taken over the uncompressed content. The only limitation is that DiffID is always uncompressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not always be a given.

What may not be given? Are you just talking about the last sentence? It's qualified with “For an application/vnd.oci.image.layer.tar+gzip layer” (although that's missing the v1 added in #347), and I think the rest of that sentence is always true in that case. Of course, there may be layers that are not application/vnd.oci.image.layer.v1.tar+gzip

A layer digest may be taken over the uncompressed content. The only limitation is that DiffID is always uncompressed

Isn't that what I'm saying (e.g. with “regardless of compresson” for layers and “after removing any compression” for DiffIDs)? Which wording in particular do you think is saying something different, and what do you think it's saying. Once we identify the sticky phrasing we can reword it for additional clarity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one side of this statement is true: "the DiffID is take over the application/vnd.oci.image.layer.tar content."

For this PR to make even a remote bit of sense, a layer digest may be taken over application/vnd.oci.image.layer.tar or application/vnd.oci.image.layer.tar+gzip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this PR to make even a remote bit of sense, a layer digest may be taken over application/vnd.oci.image.layer.tar or application/vnd.oci.image.layer.tar+gzip.

For a generic layer yes, and those types are still not exhaustive because we only SHOULD layer media types. This sentence, on the other hand, is qualified with “For an application/vnd.oci.image.layer.tar+gzip layer”, and in that case the layer hash MUST be taken over the application/vnd.oci.image.layer.tar+gzip content.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking Could you please stop arguing every little point? I would like for this PR to land but you seem to just want to waste everyone's time. Please get this under control or I am going to start closing your PRs so we can focus on changes that are more likely to succeed.

A DiffID is narrow. It is part of the config, in rootfs. A DiffID is only ever taken over the uncompressed content.

A layer digest may be taken over the uncompressed or compressed version of the content. Arguably, this is not relevant to this document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably, this is not relevant to this document.

Perhaps this is a way forward? I've dropped the non-normative paragraph in 63ce6dd46ec151. If you'd rather have an informative paragraph with different wording, please just paste in the wording you'd like to see or file a PR against this branch.

@@ -311,11 +311,10 @@ Any given image is likely to be composed of several of these Image Filesystem Ch
Certain layers, due to legal requirements, may not be regularly distributable.
Typically, such layers are downloaded directly from a distributor but are never uploaded.

Layers that have these restrictions SHOULD be tagged with an alternative mediatype of `application/vnd.oci.image.layer.nondistributable.v1.tar+gzip`.
Layers that have these restrictions SHOULD be tagged with an alternative media type of `application/vnd.oci.image.layer.nondistributable.v1.tar`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or +gzip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true of many instances where I currently only talk about the uncompressed type (which is one reason why I prefer the structured syntax suffix approach from #332). With a structured syntax suffix off the table, this PR tries to stay clean by only defining the +gzip forms in media-types.md. For example, this section defines a nondistributable, uncompressed layer. If you're wondering what a compressed, nondistributable layer is, you look it up in the media type list, and see that it's a gzipped version of the uncompressed, nondistributable layer (linking to this section). You can also look up the uncompressed, nondistributable layer in the media-type table, and that also links to this section.

If, on the other hand, you're reading this section and think “hmm, I wonder what media type this would be if I gzip it?”, the current state of this PR requires to you to grep or remember that we define +gzip forms for these specific types. If we want to continually remind readers that we define those +gzip types, I'll reluctantly inject whatever boilerplate you need in most locations where we reference an uncompressed layer type; just tell me what you want that boilerplate to be. Maybe (or ['…+gzip'][media-types.md#oci-image-media-types) almost everywhere we list an uncompressed-layer media type? Do you really want that?

- [`application/vnd.oci.image.layer.v1.tar`](layer.md)
- [`application/vnd.oci.image.layer.v1.tar+gzip`](media-types.md#oci-image-media-types)
- [`application/vnd.oci.image.layer.nondistributable.v1.tar`](layer.md#non-distributable-layers)
- [`application/vnd.oci.image.layer.nondistributable.v1.tar+gzip`](media-types.md#oci-image-media-types)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link doesn't look right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What doesn't look right about it? We list both the uncompressed and +gzip layer types there, and define the +gzip form as a gzipped version of the uncompressed layer. I don't think we need to define the +gzip forms anywhere else, although there are other cases where it makes sense to list them as “MUST be supported”.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't it link to layer.md like the one above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because layer.md has nothing to say about gzipping (which happens on top of the layer tar archive described in layer.md):

$ git grep gzip 63ce6ddc -- layer.md
…no hits…

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you link them to a random table? Link to what is contained. For all intents and purposes, a reader can infer that +gzip is nearly the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to what is contained.

I'm having trouble parsing this.

For all intents and purposes, a reader can infer that +gzip is nearly the same.

Why infer anything when we have a normative definition of the +gzip types in media-types.md? Maybe you want me to move that normative description into layers.md?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you want me to move that normative description into layers.md?

In case this was what you were looking for, I've worked it up as a fixup commit with a387698b0105cb. If the changes in b0105cb look good to you, let me know and I'll squash them in. If they don't look like what you want, just tell me what to change (or file a fixup PR against this branch).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After spinning off #515, the fixup commit which shifts the gzip definition is efade57 (and not b0105cb). The change made by the commit is the same. Let me know if you want it squashed, dropped, or changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to layer.md.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to layer.md.

I am. It may be easier to just look at the fixup commit (efade57).

@stevvooe
Copy link
Contributor

While there will be complexities in making this a reality, I'm more in favor of this today than before.

For this PR to work it needs to make it clear that a layer DiffID is always the uncompressed hash, while the digest may be compressed or uncompressed based on the media type.

Please clean this up and rebase and let's see if we can make it work.

@wking wking force-pushed the uncompressed-layer-media-types branch from fd97198 to 63ce6dd Compare January 19, 2017 07:22
@wking
Copy link
Contributor Author

wking commented Jan 19, 2017 via email

@wking
Copy link
Contributor Author

wking commented Jan 19, 2017 via email

@@ -28,7 +28,8 @@ Changing it means creating a new derived image, instead of changing the existing
A layer DiffID is a SHA256 digest over the layer's uncompressed tar archive and serialized in the descriptor digest format, e.g., `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
Layers must be packed and unpacked reproducibly to avoid changing the layer DiffID, for example by using tar-split to save the tar headers.

NOTE: the DiffID is different than the layer digest because it is taken over the uncompressed gzipped layer for `application/vnd.oci.image.layer.v1.tar+gzip` types.
The difference between DiffIDs and the layer digests in the [manifest's `layers`](manifest.md#image-manifest-property-descriptions) is that the layer digest is taken over the blob regardless of compression, while the DiffID is taken after removing any compression.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking Right here: this sentence is super unclear. It is backwards. Assert the positive notion (what a DiffID is taken over), then qualify it. Clean this up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert the positive notion (what a DiffID is taken over)…

This happens in the preceding paragraph (“A layer DiffID is…”).

…then qualify it.

This paragraph is not attempting to qualify what a DiffID is. It is just underlining the difference between what a DiffID is (defined in the previous paragraph) and what a layer digest is (defined in manifest.md). A given uncompressed blob may have multiple compressed forms, but “the uncompressed blob associated with this potentially compressed blob” is unambiguous, so it made more sense to me to work it in the layer digest → DiffID direction.

However, if you would prefer different wording for this section, I'm happy to push a commit with whatever you want. Just tell me what it needs to be in a comment here or file a PR against my branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably remain as a note. It is a little odd to mention a manifest at this level and introducing this alternative without more context is jarring.

This should probably be along the lines of "Do not confuse DiffID with layer digests, often referenced in the manifest, which allow digests over compressed or uncompressed content."

wking added a commit to wking/image-spec that referenced this pull request Jan 19, 2017
On Thu, Jan 19, 2017 at 02:48:16PM -0800, Stephen Day wrote [1]:
> A DiffID is narrow. It is part of the config, in `rootfs`. A DiffID
> is only ever taken over the uncompressed content.
>
> A layer digest may be taken over the uncompressed or compressed
> version of the content.

I agree with both of these points, but have been unable to convince Stephen that this is what I'm trying to show with this example.

> Arguably, this is not relevant to this document.

This paragraph was intended as a clarifying example, but it was never
normative.  Since we can't agree on clear wording, we can hopefully
agree that it's not needed ;).

[1]: opencontainers#388 (comment)

Signed-off-by: W. Trevor King <[email protected]>
@@ -28,8 +28,6 @@ Changing it means creating a new derived image, instead of changing the existing
A layer DiffID is a SHA256 digest over the layer's uncompressed tar archive and serialized in the descriptor digest format, e.g., `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
Layers must be packed and unpacked reproducibly to avoid changing the layer DiffID, for example by using tar-split to save the tar headers.

NOTE: the DiffID is different than the layer digest because it is taken over the uncompressed gzipped layer for `application/vnd.oci.image.layer.v1.tar+gzip` types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not confuse DiffID with layer digests, often referenced in the manifest, which allow digests over compressed or uncompressed content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. Should I make a commit with you as an author and your usual Signed-off-by?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with 46ec151a387698. Let me know if you want anything changed about that commit (e.g. the commit message).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this new wording no longer mentions media types at all, so we can spin it off into an orthogonal PR. Done with #515 and dropped from here.

@wking wking force-pushed the uncompressed-layer-media-types branch from 46ec151 to a387698 Compare January 19, 2017 23:08
wking pushed a commit to wking/image-spec that referenced this pull request Jan 19, 2017
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
@wking wking force-pushed the uncompressed-layer-media-types branch from a387698 to a929074 Compare January 19, 2017 23:19
wking pushed a commit to wking/image-spec that referenced this pull request Jan 19, 2017
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
@wking wking force-pushed the uncompressed-layer-media-types branch from a929074 to b0105cb Compare January 19, 2017 23:21
wking pushed a commit to wking/image-spec that referenced this pull request Jan 19, 2017
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
@wking wking force-pushed the uncompressed-layer-media-types branch from b0105cb to efade57 Compare January 19, 2017 23:32
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
@stevvooe
Copy link
Contributor

stevvooe commented Jan 19, 2017

LGTM after merging #515.

Approved with PullApprove

I'd prefer defining this as a structured syntax suffix following RFC
6839, and have filed a pull request to that effect [1].  However, the
current maintainer consensus seems to be to define the compressed and
uncompressed types directly without declaring a structured syntax
suffix pattern [2].  I'm not clear on the reason for avoiding the
structured syntax suffix, but that's the route I've taken in this
commit.

Now that you can choose both compressed or uncompressed media types,
it is easy to clarify DiffIDs by comparing types with and without the
+gzip compression.  media type.  It also allows you to create
image-layout instances where the layers are stored uncompressed, which
may be useful for cases such as:

* Binary diffing between layer blobs for cheaper updates of large
  layers [3].

* Compressing an image-layout tarball for a smaller smaller overall
  tarball (by avoiding the unnecessary fragmentation of compressing
  the individual blob entries).

[1]: opencontainers#332
[2]: opencontainers#332 (comment)
[3]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/%23opencontainers.2016-08-16.log.html#t2016-08-16T23:35:43

Signed-off-by: W. Trevor King <[email protected]>
Generated with:

  $ make img/media-types.png

and Graphviz version 2.38.0.

Signed-off-by: W. Trevor King <[email protected]>
@wking wking force-pushed the uncompressed-layer-media-types branch from efade57 to 35b9fae Compare January 19, 2017 23:46
@wking
Copy link
Contributor Author

wking commented Jan 19, 2017 via email

@vbatts
Copy link
Member

vbatts commented Jan 25, 2017

LGTM

Approved with PullApprove

1 similar comment
@stevvooe
Copy link
Contributor

stevvooe commented Jan 25, 2017

LGTM

Approved with PullApprove

@stevvooe stevvooe merged commit 84072d8 into opencontainers:master Jan 25, 2017
@wking wking deleted the uncompressed-layer-media-types branch January 26, 2017 05:17
wking added a commit to wking/image-spec that referenced this pull request Feb 2, 2017
Catch up with aad7f24 (media-types: Define layer media types with and
without '+gzip', 2016-09-20, opencontainers#388).

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/image-spec that referenced this pull request Mar 2, 2017
Catch up with aad7f24 (media-types: Define layer media types with and
without '+gzip', 2016-09-20, opencontainers#388).

Signed-off-by: W. Trevor King <[email protected]>
dattgoswami9lk5g added a commit to dattgoswami9lk5g/bremlinr that referenced this pull request Jun 6, 2022
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers/image-spec#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
7c00d pushed a commit to 7c00d/J1nHyeockKim that referenced this pull request Jun 6, 2022
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers/image-spec#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
7c00d added a commit to 7c00d/J1nHyeockKim that referenced this pull request Jun 6, 2022
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers/image-spec#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
laventuraw added a commit to laventuraw/Kihara-tony0 that referenced this pull request Jun 6, 2022
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers/image-spec#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
tomalopbsr0tt added a commit to tomalopbsr0tt/fabiojosej that referenced this pull request Oct 6, 2022
Trevor and I couldn't agree on more detailed wording [1], and this
paragraph was never normative anyway.

[1]: opencontainers/image-spec#388 (comment)

Signed-off-by: Stephen J Day <[email protected]>
Signed-off-by: W. Trevor King <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants