Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imgpkg fails to pull a bundle if a registry does not support the HEAD call. #131

Closed
ewrenn8 opened this issue Apr 19, 2021 · 5 comments
Closed
Assignees
Labels
bug This issue describes a defect or unexpected behavior

Comments

@ewrenn8
Copy link
Contributor

ewrenn8 commented Apr 19, 2021

What steps did you take:

We have discussed the steps to reproduce internally. Essentially, if a repo does not support the HEAD call on the manifests endpoint, imgpkg will error when checking for image existence, even if the image does exist.

What happened:
Bundle failed to pull

What did you expect:
Bundle to successfully be pulled since references image does exist and is publicly available.

Anything else you would like to add:

There were conversations around adding some fallback behavior which will issue a GET request in the case HEAD fails. @DennisDenuto pointed out crane did something similar in this pr

@ewrenn8 ewrenn8 added bug This issue describes a defect or unexpected behavior carvel triage This issue has not yet been reviewed for validity labels Apr 19, 2021
@DennisDenuto
Copy link
Contributor

wrt the 'fallback' behavior that crane uses, after some more thought there are concerns around performance and hitting rate limiting with registries such as dockerhub. (Dockerhub rate-limiting counts GET requests but not HEAD)

imgpkg uses HEAD /manifest to determine if a bundle has been relocated (imgpkg will short-circuit the loop upon encountering the first image in the bundle that doesn't exist, so I don't think we will run into any perf issues here), and to verify that every image in the bundle has been copied correctly (this is a concern).

We believe it won't be uncommon for bundles to contain 1000's of images in the future, hence the performance and rate-limiting concern.

I think we should attempt at 'fixing' the registries that are not conforming to the dist spec, but assuming we do that, this also leaves the question as to how imgpkg behaves with 'older' registries that don't conform.

Here is a straw-person idea:
What if we use GET requests when targeting a registry that doesn't play nice with HEAD, and for every other type of registry we use HEAD? (fyi I don't think there is a programmatic way to determine the provider or version of a registry. Perhaps a feature toggle
or new flag might be needed)

Note:
As a data-point, when HEAD in the code is changed to GET, the performance test for simulating a large "cf-for-k8s" type bundle goes from 55 seconds to 1 min 10 seconds

@cppforlife
Copy link
Contributor

I don't think there is a programmatic way to determine the provider or version of a registry. Perhaps a feature toggle or new flag might be needed

i think this becomes very complicated for tools that wrap imgpkg (like kapp-controller) -- package authors definitely do not know what registry will be used. package consumers probably shouldnt be configuring this kind of low level details. we should also probably think of this issue in context of #129.

As a data-point, when HEAD in the code is changed to GET, the performance test for simulating a large "cf-for-k8s" type bundle goes from 55 seconds to 1 min 10 seconds

love seeing this detail. it's interesting that there is such significant difference -- we are talking about just transferring text manifest. is the api call we are using for get somehow fetching more things than necessary?

@DennisDenuto
Copy link
Contributor

love seeing this detail. it's interesting that there is such significant difference -- we are talking about just transferring text manifest. is the api call we are using for get somehow fetching more things than necessary?

Thank you! and no we are just transferring manifest, I am guessing the perf hit is really on the dockerhub registry side... perhaps the manifest json resides on a slower db optimized for read/write, and the digest lives on a db optimized / cached for read? - explains why HEAD isn't rate-limited?)

package authors definitely do not know what registry will be used

True 💯

package consumers probably shouldnt be configuring this kind of low level details

They probably won't have to 9/10 of the time. But it is there when they encounter a 'strange' working registry... kinda like a 'break glass in case of emergency'. Although honestly, I think the proper solution is to fix the 'strange' working registry.

It is also unclear whether the original issue is due to a misconfigured registry (is there a caching layer, proxy, bad auth server) or whether there is a true incompatibility between imgpkg and Registry X

Throwing another idea (that I can capture in a separate issue):

Setup test github actions to run our tests across multiple key registries. (Instead of just the registry:v2 local docker image like we do today). To gain confidence that imgpkg is compatible across different providers and versions. (it kinda feels like the wild-west with how registries 'conform' to the dist spec). That would provide (to me) much needed data to help balance trade offs with different design decisions. i.e. is this problem wide-spread or just effect a single registry version and/or is this a problem with the registry or point to a mis-configured registry.

@pivotaljohn pivotaljohn removed the carvel triage This issue has not yet been reviewed for validity label Apr 26, 2021
@joaopapereira
Copy link
Member

The major issue that I see with the Get as a fallback, is the time hit that we will get for Bundles that were never copied, and only pushed. But if do not have a nicer way to approach this I think we might need to bite the bullet here.

Throwing another idea (that I can capture in a separate issue):

Setup test github actions to run our tests across multiple key registries. (Instead of just the registry:v2 local docker image like we do today). To gain confidence that imgpkg is compatible across different providers and versions. (it kinda feels like the wild-west with how registries 'conform' to the dist spec). That would provide (to me) much needed data to help balance trade offs with different design decisions. i.e. is this problem wide-spread or just effect a single registry version and/or is this a problem with the registry or point to a mis-configured registry.

I split the above into story #138

@joaopapereira
Copy link
Member

Released in version 0.8.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue describes a defect or unexpected behavior
Projects
None yet
Development

No branches or pull requests

5 participants