Skip to content
This repository has been archived by the owner on Mar 9, 2022. It is now read-only.

Limit size of additional annotation for avoiding unpack failure #1572

Merged
merged 1 commit into from
Sep 15, 2020

Conversation

ktock
Copy link
Member

@ktock ktock commented Sep 10, 2020

Related: containerd/stargz-snapshotter#144

In containerd, there is a size limit for label size (4096 chars).
If an image has many layers (> (4096-39)/72 > 56), containerd.io/snapshot/cri.image-layers will hit the limit of label size and the unpack will fail because the annotation will be passed to the snapshotter as a label.
This commit fixes this by limiting the size of the annotation.

@k8s-ci-robot
Copy link

Hi @ktock. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jianzi123
Copy link

I wonder if we need to omit some labels or if we need to change the logic of the valid method.

@ktock
Copy link
Member Author

ktock commented Sep 11, 2020

This takes the former approach (omitting the contents of the label). The Validate method is used globally so changing the logic isn't easy.
Maybe we should add validation when inheriting annotations as labels and filter invalid annotations out. But let's keep it separated from this PR now.

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after deciding what the plan is with respect to what happens if a layer goes over, should we have more labels etc., can you add a test bucket for this to match up expectations with output?

if containerdimages.IsLayerType(l.MediaType) {
ls := fmt.Sprintf("%s,", l.Digest.String())
// This avoids the label hits the size limitation.
if err := labels.Validate(targetImageLayersLabel, layers+ls); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want labels.Validate(targetImageLayersLabel, layers+l.Digest.String())

ls := fmt.Sprintf("%s,", l.Digest.String())
// This avoids the label hits the size limitation.
if err := labels.Validate(targetImageLayersLabel, layers+ls); err != nil {
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we need a debug out here indicating why the layer is missing.. but that in and of itself points out that we may need a new plan with additional labels.. either one per layer or some other idea

layers += ls
}
}
c.Annotations[targetImageLayersLabel] = strings.TrimSuffix(layers, ",")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we switch to prefix comma? iow validate labels.Validate(targetImageLayersLabel, layers+","+l.Digest.String()) and if that is fine += ","+l.Digest.String())..

@ktock ktock changed the title Limit size of additional annotation for avoiding unpack failure [WIP] Limit size of additional annotation for avoiding unpack failure Sep 14, 2020
@ktock ktock changed the title [WIP] Limit size of additional annotation for avoiding unpack failure Limit size of additional annotation for avoiding unpack failure Sep 14, 2020
@ktock
Copy link
Member Author

ktock commented Sep 14, 2020

@mikebrow Thanks for the comments.

after deciding what the plan is with respect to what happens if a layer goes over

As mentioned in the above, I think we should add validation when inheriting annotations as labels and filter invalid annotations out. How do you think about it? I'll open another PR (on containerd) if we need this.

can you add a test bucket for this to match up expectations with output?

The label containerd.io/snapshot/cri.image-layers is used by stargz snapshotter to resolve layers in an image in parallel and incomplete (even if empty) list is allowed (but it possibly slows down the snapshot preparation). So in this label's case, if the label size of a layer goes over the limit, I think that simply omitting the content of the label is enough. I added a small unit test for this. Please tell me if I'm missing something or should add more tests, etc.

@mikebrow
Copy link
Member

@mikebrow Thanks for the comments.

after deciding what the plan is with respect to what happens if a layer goes over

As mentioned in the above, I think we should add validation when inheriting annotations as labels and filter invalid annotations out. How do you think about it? I'll open another PR (on containerd) if we need this.

can you add a test bucket for this to match up expectations with output?

The label containerd.io/snapshot/cri.image-layers is used by stargz snapshotter to resolve layers in an image in parallel and incomplete (even if empty) list is allowed (but it possibly slows down the snapshot preparation). So in this label's case, if the label size of a layer goes over the limit, I think that simply omitting the content of the label is enough. I added a small unit test for this. Please tell me if I'm missing something or should add more tests, etc.

I see so perf only not critical. Just a warning or informational message should be fine then. Perhaps on a if first check.

@mikebrow
Copy link
Member

/ok-to-test

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good! Just a nit on the log level..

}
// This avoids the label hits the size limitation.
if err := validate(key, layers+item); err != nil {
log.G(ctx).WithError(err).WithField("label", key).Infof("%q is omitted in the layers list", l.Digest.String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe debug?

In containerd, there is a size limit for label size (4096 chars).
Currently if an image has many layers (> (4096-39)/72 > 56),
`containerd.io/snapshot/cri.image-layers` will hit the limit of label size and
the unpack will fail.
This commit fixes this by limiting the size of the annotation.

Signed-off-by: Kohei Tokunaga <[email protected]>
Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
on green

@@ -495,11 +487,33 @@ func appendInfoHandlerWrapper(ref string) func(f containerdimages.Handler) conta
}
c.Annotations[targetRefLabel] = ref
c.Annotations[targetDigestLabel] = c.Digest.String()
c.Annotations[targetImageLayersLabel] = layers
c.Annotations[targetImageLayersLabel] = getLayers(ctx, targetImageLayersLabel, children[i:], labels.Validate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an extra change that doesn't really fix the problem, except maybe now we store less redundant data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"This" being changing from all layers to just children.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate this? Doesn't calling getLayers on children avoids the problem of hitting the label limit (see L511) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that the limit was per key, not total?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the limit is per key. If we need to have the "total" label size limit, what should be the limit (instead of 4096 chars per key)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way we have this now the root gets all the layers anyway, so the way we are doing children[:i] does not really fix the label size problem.

It is handled by truncating the values in getLayers, however.

So, 1.4.2 changed the behavior even for values that do not hit the limit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLayers solves the label limit problem but children[:i] isn't for avoiding the label size limit. containerd.io/snapshot/cri.image-layers is currently a label for enabling snapshotter to pre-process(resolve) layers related to the target layer in parallel. So this is for optimization.

The reason why I do children[:i] here is otherwise children[0:limit] is always passed to the snapshotter and it can't get chance to pre-process children[limit:] layers.

However I agree with that we should find the better ways instead of children[:i].

@aojea
Copy link
Contributor

aojea commented Oct 30, 2021

@mikebrow can you please take a look at this kubernetes-sigs/kind#2518 and check if there was a regression?

@ktock
Copy link
Member Author

ktock commented Oct 31, 2021

@aojea Please see containerd/containerd#6187.

@aojea
Copy link
Contributor

aojea commented Oct 31, 2021

@aojea Please see containerd/containerd#6187.

❤️

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants