Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker pull with bad tag pulls by digest #22112

Closed
duglin opened this issue Apr 18, 2016 · 25 comments · Fixed by #27207
Closed

docker pull with bad tag pulls by digest #22112

duglin opened this issue Apr 18, 2016 · 25 comments · Fixed by #27207
Labels
area/distribution kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.
Milestone

Comments

@duglin
Copy link
Contributor

duglin commented Apr 18, 2016

Given two images:

REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
dug                 v2                  4b51b717cb17        12 minutes ago       188 MB
dug                 v1                  9f1af75fa84c        12 minutes ago       188 MB

When I do docker run dug:v1 and docker run dug:v2 the correct images are found/used.
However, if instead of the image tag I used a portion of the image ID then instead of failing it works.
For example: docker run dug:9f1 will run dug:v1 instead of saying it can't find an image with tag 9f1.
In looking at the docs ( https://docs.docker.com/engine/reference/commandline/pull/#pull-an-image-by-digest-immutable-identifier ) it talks about pulling image by digest but only shows examples of the syntax: ubuntu@sha256:45b23dee08af5e43a - notice the @sha256 in there.

It seems that docker shouldn't assume a "tag" is a "digest" unless it also includes the "sha256" portion.

I believe this is happening due to the code here: https://github.com/docker/docker/blob/master/daemon/daemon.go#L1150 - it shouldn't be calling daemon.imageStore.Search(tagged.Tag()).

However, if this is the expected behavior then we should update the docs to let people know that "tags" will be treated as "image IDs" because this isn't what I was expecting.

ping @tonistiigi

@tonistiigi
Copy link
Member

This isn't the pull-by-digest case that uses @ separator and manifest digest but repo:shortid that has been supported format for a long time. But imagestore behavior seems not to be quite correct as I think this pattern should require fixed length, not just a prefix.

@duglin
Copy link
Contributor Author

duglin commented Apr 18, 2016

the repo:shortid form is not one I've seen before - are there docs that talk about this? Perhaps this is just a doc issue then - aside from the fixed length stuff you mentioned.

@duglin
Copy link
Contributor Author

duglin commented Apr 18, 2016

ping @bainsy88

@tonistiigi
Copy link
Member

@thaJeztah Maybe you can comment on the docs side? This may have been something that we decided not to promote anymore but never deprecated in the code.

@thaJeztah
Copy link
Member

Hm, I'm actually not aware of the "repo:shortid" notation, so sounds like a bug to me

@thaJeztah thaJeztah added area/distribution kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. labels Apr 18, 2016
@thaJeztah
Copy link
Member

I'll label as a bug, but possibly someone knows if there's indeed some historic format that I'm not aware of

@duglin
Copy link
Contributor Author

duglin commented Apr 18, 2016

given we now support the @sha256: prefix, which is really nice because its not ambiguous as to what the rest of the chars are, I'd prefer if the non-prefix form was just a "tag" and nothing else. IOW, +1 to this being a bug.

@tonistiigi
Copy link
Member

#799

@thaJeztah
Copy link
Member

oh! docker 0.4.1 - just before I started using it. I'm +1 to deprecating it, in favor of the new notation

@bainsy88
Copy link

+1 for deprecating, the particularly nasty scenario the current behavior exposes is as follows:

User/script tries to run animage:45 which is expected to be on a host. Tag 45 isn't actually there and animage:22, by coincidence, has an image id starting 45, this will result in the run succeeding but launching a different tag of the image than the user expects.

Above may sound unlikely given the random IDs but I have seen it happen a couple of times.

@thaJeztah
Copy link
Member

@duglin want to work on this?

@duglin
Copy link
Contributor Author

duglin commented Apr 19, 2016

Perhaps, but I'd like confirmation from the other maintainers that this change in direction is what people want first.

@thaJeztah
Copy link
Member

ping @aaronlehmann @stevvooe any objections against this change?

@aaronlehmann
Copy link
Contributor

I support deprecating the repo:shortid notation.

@vdemeester
Copy link
Member

Same here, it doesn't even seems to be documented so.. 👼

@duglin
Copy link
Contributor Author

duglin commented Apr 19, 2016

to be clear... are we removing repo:id or just repo:shortid ? Both?

@aaronlehmann
Copy link
Contributor

I would prefer to remove both. I don't see the value of repo:id when the ID could be used directly. The grammar would become less ambiguous if it was removed.

@duglin
Copy link
Contributor Author

duglin commented Apr 19, 2016

@aaronlehmann I agree but wanted to make sure I wasn't alone in the thinking :-)

@thaJeztah
Copy link
Member

Same for me; SGTM

@stevvooe
Copy link
Contributor

SGTM

I was not aware of this behavior. If we did "short id" lookup, it should be on the other side of @, sans the sha256 prefix.

@duglin
Copy link
Contributor Author

duglin commented Apr 19, 2016

going to have to do this in stages since we need to deprecate it first before we remove it, but does this sound right to people:

repo      -> gets repo w/tag: latest
repo:xxx  -> gets repo w/tag: xxx
repo@xxx  -> gets repo w/sha xxx
repo@sha256:xxx -> get repo w/sha xxx
repo@xxx:yyy  -> gets repo w/sha xxx AND tag yyy

So the generic form is:

repo[@[sha256:]hash][:tag]

@stevvooe I wasn't sure if you wanted to totally ban the inclusion of the sha256 prefix or not. It feels a little odd to me that we would ban it when it does appear as part of the docker ID output/syntax.

@stevvooe
Copy link
Contributor

@duglin repo@xxx:yyy -> gets repo w/sha xxx AND tag yyy is incorrect. The : and @ are purposefully mutually exclusive. The grammar allows repo[:tag][@digest], but we typically disallow that to avoid ambiguous cases.

The current behavior is erroneous and undocumented. It should be fixed before exploring other options. I was merely commenting that if we did have repo-qualified, short id lookup, it would be something like repo@xxx where xxx would be a shortened identifier, with the algorithm dropped. I have no opinion about supporting this case, but that is where it should be given the current grammar.

@goosemo
Copy link

goosemo commented May 10, 2016

I think there is still a bug here. The inspect on an image gives an Id that doesn't seem to be valid. For example when I look to try and reference the sha256 of the redis image I can't pull the same exact image that I have locally:

$ docker inspect --format "{{.Id}}" redis:3
sha256:4f5f397d4b7ca414891bd2959ef71c83bb7010d095efb2497f0b2f407cb50f0d

$ docker pull redis@sha256:4f5f397d4b7ca414891bd2959ef71c83bb7010d095efb2497f0b2f407cb50f0d
Error response from daemon: manifest unknown: manifest unknown

$ docker version
Client:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Tue Apr 26 23:44:17 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Wed Apr 27 00:34:20 2016
 OS/Arch:      linux/amd64

Though with some further inspections it seems that I might also have the Id and Repo Digest confused? Though strangely this image in the example redis:3 has no content in the repo digests field, and as such is unable to be targeted to a specific build.

$ docker inspect --format "{{.RepoDigests}}" redis:3
[]

As an aside:

If it's unknown why one would want to have this ability, I'll try and explain a use case. Say a build system uses the docker-compose command that has images used with tags such as dev or latest that would not be attached to the same image throughout time. If there was an error in that run, one of the first steps a dev would want to take would be to recreate the error locally.

If I we able to render out the a docker-compose.${BUILD_NUMBER}.yml file at build time that is archived as an artifact. Where all the human readable, and mutable tags are replaced with the image id's themselves. The developer could just use this compose file of the build, and get an identical environment for testing that was used by the build servers.

@stevvooe
Copy link
Contributor

@goosemo

Though with some further inspections it seems that I might also have the Id and Repo Digest confused?

Yes.

The bug here is that the RepoDigests isn't being populated (really this should be called "remote digests", so I'll call it that). sha256:4f5f397d4b7ca414891bd2959ef71c83bb7010d095efb2497f0b2f407cb50f0d, from the example above, is an application specific identifier and the RepoDigests takes the simple hash of the resources as pushed by the remote.

The main issue here is that we don't save the gzipped layers between pushes. These do get fixed at push, which creates tar-split files, at which time the remote digest can be calculated.

Perhaps, the solution here is to have the ability to generate the remote digest without actually pushing. This would go through the process of compressing the layers and saving the tar-split files to fix the remote hash.

@ncdc
Copy link
Contributor

ncdc commented Sep 28, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/distribution kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants