
Take care of manifest config blob #10759

Closed

Conversation


@miminar miminar commented Sep 1, 2016

Resolves #10730

Manifest V2 schema 2 introduces a special blob called config, which used to be embedded in the earlier schema. See upstream's spec for details. The config is stored as a regular blob in the registry's storage. Thus the config needs to be treated similarly to a regular layer:

  1. the pullthrough middleware needs to be able to fetch it from the remote repository (see the sketch below)
  2. the size of the image needs to be increased by the size of its config
  3. the config's digest needs to be cached inside the registry like a regular layer's for subsequent lookups
  4. image pruning needs to prune it the same way as a regular layer

This PR addresses the first three items. I'd like to cover the pruning case in a follow-up.
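To sketch the pullthrough part (item 1): the idea is roughly the following. This is a simplified illustration using the names from the diffs discussed below, not the final code; it elides how digestToStore gets populated and all error handling.

// Assumed imports: "github.com/docker/distribution" and its context and digest packages.

// pullthroughBlobStore wraps the repository's local blob store. digestToStore
// remembers, per request, which remote store can serve a given digest.
type pullthroughBlobStore struct {
	distribution.BlobStore
	digestToStore map[string]distribution.BlobStore
}

// Get returns the blob from a remembered remote store when pullthrough has
// resolved its digest, and falls back to local storage otherwise. Overriding
// Get is what lets a schema 2 config blob be fetched from a remote repository.
func (r *pullthroughBlobStore) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) {
	if store, ok := r.digestToStore[dgst.String()]; ok {
		return store.Get(ctx, dgst)
	}
	return r.BlobStore.Get(ctx, dgst)
}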

@miminar
Author

miminar commented Sep 1, 2016

Not yet verified or tested; I'm still working on that. But it's ready for review (cc @legionus, @soltysh).

[test]

@mfojtik
Contributor

mfojtik commented Sep 1, 2016

@liggitt can you help review this?

@mfojtik mfojtik added this to the 1.3.0 milestone Sep 1, 2016
@miminar miminar force-pushed the take-care-of-manifest-config branch from 8d7be51 to 626b136 on September 1, 2016 13:20
@miminar
Author

miminar commented Sep 1, 2016

The pullthrough part appears to be working now. Next, verifying a pull of a local schema 2 image tagged into a different namespace.

@miminar miminar force-pushed the take-care-of-manifest-config branch from 626b136 to dc82904 on September 1, 2016 13:23
@miminar
Author

miminar commented Sep 1, 2016

And the second part (a pull of a local schema 2 image tagged into a different namespace) seems to be working as well. Writing tests now.

func (r *pullthroughBlobStore) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) {
	store, ok := r.digestToStore[dgst.String()]
	if !ok {
		data, getErr := r.BlobStore.Get(ctx, dgst)
Contributor

you can just use err here

Author

> you can just use err here

OK, I'll rename.

@mfojtik
Contributor

mfojtik commented Sep 1, 2016

this needs tests (but I think you're already working on them)

@liggitt liggitt self-assigned this Sep 1, 2016
@@ -130,6 +136,28 @@ func (r *pullthroughBlobStore) ServeBlob(ctx context.Context, w http.ResponseWri
	return nil
}

// Get attempts to fetch the requested blob by digest using a remote proxy store if necessary.
func (r *pullthroughBlobStore) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) {
	store, ok := r.digestToStore[dgst.String()]
Contributor

return early in the ok case and unindent the rest of the function?

if store, ok := r.digestToStore[dgst.String()]; ok {
	return store.Get(ctx, dgst)
}

Contributor

pullthroughBlobStore instances are per-request, right? just making sure we don't have to worry about locking/races on the digestToStore map, or about long-term growth/retention

Author

> return early in the ok case and unindent the rest of the function?

Good suggestion; I'll rewrite.

Author

> pullthroughBlobStore instances are per-request, right?

That's right.

@miminar miminar force-pushed the take-care-of-manifest-config branch from dc82904 to e22f4b2 on September 1, 2016 15:14
@@ -553,6 +557,10 @@ func (r *repository) rememberLayersOfImage(image *imageapi.Image, cacheName stri
	for _, layer := range image.DockerImageLayers {
		r.cachedLayers.RememberDigest(digest.Digest(layer.Name), r.blobrepositorycachettl, cacheName)
	}
	// remember reference to manifest config as well for schema 2
	if image.DockerImageManifestMediaType == schema2.MediaTypeManifest && len(image.DockerImageMetadata.ID) > 0 {
Contributor

should we only do this if len(image.DockerImageConfig) > 0?

Author

> should we only do this if len(image.DockerImageConfig) > 0?

I'll use that check instead, so it matches the other conditions; image.DockerImageConfig is only filled in for manifest schema 2.
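Roughly, the hunk above then becomes something like this (a sketch of the intended change, not the final diff; it assumes the config's digest equals image.DockerImageMetadata.ID, as the original condition implies):

// remember a reference to the manifest config as well for schema 2;
// DockerImageConfig is only filled in for schema 2 images
if len(image.DockerImageConfig) > 0 {
	r.cachedLayers.RememberDigest(digest.Digest(image.DockerImageMetadata.ID), r.blobrepositorycachettl, cacheName)
}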

@miminar miminar force-pushed the take-care-of-manifest-config branch from e22f4b2 to 5c2786a on September 2, 2016 13:01
@soltysh
Contributor

soltysh commented Sep 2, 2016

The changes LGTM, but tests are something I'd like to see before this gets merged.

Michal Minář added 2 commits September 2, 2016 18:14
Make sure to match an image having a config name equal to the wanted blob
digest. Remember the image and allow access to its layers.

Signed-off-by: Michal Minář <[email protected]>
Manifest configs are fetched using the Get() method of the blob store.
The pullthrough middleware needs to override it as well to allow pulling
manifest v2 schema 2 images from remote repositories.

Signed-off-by: Michal Minář <[email protected]>
@miminar miminar force-pushed the take-care-of-manifest-config branch from 5c2786a to f8e5a5a on September 2, 2016 16:14
@miminar miminar changed the title [WIP] Take care of manifest config blob Take care of manifest config blob Sep 2, 2016
@miminar
Author

miminar commented Sep 2, 2016

I've finally got the e2e tests working.

@miminar miminar force-pushed the take-care-of-manifest-config branch 2 times, most recently from d5dd631 to 43f6f95 on September 2, 2016 17:28
@miminar
Author

miminar commented Sep 2, 2016

Flake #9624, extended conformance flake:

with readiness probe should not be ready before initial delay and never restart [Conformance]

And

InsufficientInstanceCapacity => We currently do not have sufficient m4.large capacity in the Availability Zone you requested (us-east-1d). Our system will be working on provisioning additional capacity. You can currently get m4.large capacity by not specifying an Availability Zone in your request or choosing us-east-1b, us-east-1c.
+ echo ''\''vagrant up'\'' failed - retrying'

In https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8662/.

    exit 1
  fi
  local podnamejs='{range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}'
  wait_for_command "oc get endpoints/docker-registry -o 'jsonpath=$podnamejs' --config='${ADMIN_KUBECONFIG}' | egrep -q '(^|,)${registrypod},'" $TIME_MIN
Contributor

why is this necessary?

Author

> why is this necessary?

Not sure, I was just reluctant to remove the existing guards.

In my local test, this passes in no time:

[INFO] Success running command: 'oc get rc/docker-registry-1 --template "{{ index .metadata.annotations \"openshift.io/deployment.phase\" }}" --config='/tmp/openshift/test-end-to-end//openshift.local.config/master/admin.kubeconfig' | grep Complete' after 23 seconds
[INFO] Waiting for command to finish: 'oc get endpoints/docker-registry -o 'jsonpath={range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}' --config='/tmp/openshift/test-end-to-end//openshift.local.config/master/admin.kubeconfig' | egrep -q '(^|,)docker-registry-1-6mc18,''...
[INFO] Success running command: 'oc get endpoints/docker-registry -o 'jsonpath={range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}' --config='/tmp/openshift/test-end-to-end//openshift.local.config/master/admin.kubeconfig' | egrep -q '(^|,)docker-registry-1-6mc18,'' after 0 seconds
[INFO] Waiting for command to finish: 'oc get 'pod/docker-registry-1-6mc18' -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end//openshift.local.config/master/admin.kubeconfig' | grep -qi true'...
[INFO] Success running command: 'oc get 'pod/docker-registry-1-6mc18' -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end//openshift.local.config/master/admin.kubeconfig' | grep -qi true' after 0 seconds

Author

@mfojtik, @Kargakis do you think it's safe to remove all the wait_for_command statements except for the first?

Contributor

I meant all the additions to wait_for_registry, not just that line

Author

> I meant all the additions to wait_for_registry, not just that line

That's because I'm redeploying the registry. Until now, wait_for_registry expected just rc version 1; now it needs to deal with multiple versions of rcs, pods, etc.

Contributor

I think it is fine to make sure that we are testing against the latest version of the registry (in case the registry is re-deployed).

Author

Here's a log from one of the recent runs:

[INFO] Waiting for command to finish: 'oc get rc/docker-registry-2 --template "{{ index .metadata.annotations \"openshift.io/deployment.phase\" }}" --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep Complete'...
Complete
[INFO] Success running command: 'oc get rc/docker-registry-2 --template "{{ index .metadata.annotations \"openshift.io/deployment.phase\" }}" --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep Complete' after 16 seconds
[INFO] Waiting for command to finish: 'oc get endpoints/docker-registry -o 'jsonpath={range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | egrep -q '(^|,)docker-registry-2-n8o46,''...
[INFO] Success running command: 'oc get endpoints/docker-registry -o 'jsonpath={range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | egrep -q '(^|,)docker-registry-2-n8o46,'' after 1 seconds
[INFO] Waiting for command to finish: 'oc get pod -l deploymentconfig=docker-registry -o jsonpath='{.items[*].status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep -qi true'...
[INFO] Success running command: 'oc get pod -l deploymentconfig=docker-registry -o jsonpath='{.items[*].status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep -qi true' after 0 seconds

The oc get endpoints call needed an additional second to succeed after the deployment of dc-2 completed, which IMHO justifies the first two wait statements. I'm still not sure about the third (waiting for a pod to become ready), but I'd prefer to keep it.

Author

And, finally, here's a justification for the third wait:

[INFO] Success running command: 'oc get endpoints/docker-registry -o 'jsonpath={range .subsets[*]}{range .addresses[*]}{.targetRef.name},{end}{end}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | egrep -q '(^|,)docker-registry-1-mwn9s,'' after 0 seconds
[INFO] Waiting for command to finish: 'oc get 'pod/docker-registry-1-mwn9s' -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep -qi true'...
[INFO] Success running command: 'oc get 'pod/docker-registry-1-mwn9s' -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' --config='/tmp/openshift/test-end-to-end-docker//openshift.local.config/master/admin.kubeconfig' | grep -qi true' after 1 seconds

Contributor

Yeah, if we're waiting for the version, the waits make sense. I was referring to all the additions in the helper function.

Author

I've simplified the wait function a lot. Kudos to @liggitt for the suggestions.

@openshift-bot
Contributor

Evaluated for origin merge up to ebc85ed

@liggitt
Contributor

liggitt commented Sep 5, 2016

[testextended][extended:core(builds)]

@openshift-bot
Contributor

Evaluated for origin testextended up to ebc85ed

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/testextended Running (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin_extended/459/) (Extended Tests: core(builds), core(images))

@@ -660,7 +660,9 @@ function install_registry() {
readonly -f install_registry

function wait_for_registry() {
wait_for_command "oc get endpoints docker-registry --template='{{ len .subsets }}' --config='${ADMIN_KUBECONFIG}' | grep -q '[1-9][0-9]*'" $((5*TIME_MIN))
local generation=$(oc get dc/docker-registry -o jsonpath='{.metadata.generation}')
local onereplicajs='{.status.observedGeneration},{.status.replicas},{.status.updatedReplicas},{.status.availableReplicas}'
Contributor

Quote expansions

Contributor

Sorry, that's for the line above this.

@openshift-bot
Contributor

Evaluated for origin test up to ebc85ed

os::cmd::expect_success 'oc login -u schema2-user -p pass'
os::cmd::expect_success "oc new-project schema2tagged"
os::cmd::expect_success "oc tag --source=istag schema2/busybox:latest busybox:latest"
busybox_name=$(oc get -o 'jsonpath={.image.metadata.name}' istag busybox:latest)
Contributor

Quotes

@stevekuznetsov
Contributor

Since e2e was supposed to be a "user process" script... Is that the right place to put all these new tests?

@liggitt
Contributor

liggitt commented Sep 5, 2016

See the comment about a follow-up to split these into a registry extended test.

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/merge FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8710/)

@liggitt
Contributor

liggitt commented Sep 5, 2016

Looks like this picked up a spurious --source arg

https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_integration/5787/consoleFull#16372028715762928fe4b031e1b25ced2a

Running test/end-to-end/core.sh:523: executing 'docker tag --source=docker busybox '172.30.169.212:5000/schema2/busybox'' expecting success...
FAILURE after 0.083s: test/end-to-end/core.sh:523: executing 'docker tag --source=docker busybox '172.30.169.212:5000/schema2/busybox'' expecting success: the command returned the wrong error code
There was no output from the command.
Standard error from the command:
flag provided but not defined: --source
See '/usr/bin/docker-current tag --help'.

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8711/)

@liggitt
Contributor

liggitt commented Sep 5, 2016

Superseded by #10805.

Labels: approved, priority/P1