
K8s 1.23.14 (default) deployment fails #321

Closed
chess-knight opened this issue Jan 25, 2023 · 5 comments

chess-knight commented Jan 25, 2023

Similar problem to #303.
The deployment gets stuck during bootstrap of the first control plane because the coredns container image cannot be pulled.

Example containerd log from the control plane:

Jan 25 10:46:36 capi-testcluster-control-plane-genc01-dcck4 containerd[545]: time="2023-01-25T10:46:36.830350539Z" level=info msg="PullImage \"registry.k8s.io/coredns:v1.8.6\""
Jan 25 10:46:37 capi-testcluster-control-plane-genc01-dcck4 containerd[545]: time="2023-01-25T10:46:37.554457011Z" level=error msg="PullImage \"registry.k8s.io/coredns:v1.8.6\" failed" error="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/coredns:v1.8.6\": failed to resolve reference \"registry.k8s.io/coredns:v1.8.6\": registry.k8s.io/coredns:v1.8.6: not found"

I tested it and it seems that registry.k8s.io is only used starting with v1.23.15.
The same applies to v1.22.17 and v1.24.9.
PR #313 only checks the major and minor versions, which seems insufficient.

I found that this is fixed in cluster-api v1.2.9, so maybe we do not need this registry patching logic at all. Maybe we also don't need imageRepository in the KubeadmControlPlane spec. This issue in cluster-api is probably also related.
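For illustration, a minimal sketch of the version-dependent registry choice, assuming the cutover patch levels mentioned here (v1.22.17, v1.23.15, v1.24.9) and that v1.25+ already defaults to registry.k8s.io; the helper name and cutover table are illustrative, not the actual patching script in this repo:

```python
# Illustrative sketch only, not the patching script used in this repository.
# Assumption: kubeadm defaults to registry.k8s.io from v1.22.17, v1.23.15,
# v1.24.9 and all v1.25+ releases; older releases still use k8s.gcr.io.

OLD_REGISTRY = "k8s.gcr.io"
NEW_REGISTRY = "registry.k8s.io"

# First patch level per 1.x minor release that already defaults to registry.k8s.io.
NEW_REGISTRY_CUTOVER = {22: 17, 23: 15, 24: 9}

def default_image_registry(version: str) -> str:
    """Return the registry kubeadm defaults to for a given Kubernetes version."""
    parts = version.lstrip("v").split(".")
    major, minor = int(parts[0]), int(parts[1])
    patch = int(parts[2]) if len(parts) > 2 else 0
    if (major, minor) >= (1, 25):
        return NEW_REGISTRY
    cutover = NEW_REGISTRY_CUTOVER.get(minor)
    if major == 1 and cutover is not None and patch >= cutover:
        return NEW_REGISTRY
    return OLD_REGISTRY

# v1.23.14 (this issue) still needs k8s.gcr.io, v1.23.15 does not:
assert default_image_registry("v1.23.14") == OLD_REGISTRY
assert default_image_registry("v1.23.15") == NEW_REGISTRY
```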

chess-knight added the bug (Something isn't working) label Jan 25, 2023
garloff self-assigned this Jan 30, 2023
garloff added the Container (Issues or pull requests relevant for Team 2: Container Infra and Tooling) label Jan 30, 2023
garloff commented Jan 31, 2023

Thanks, @chess-knight!
The registry switch upstream is a PITA.
The trouble with capi >= 1.2.8 and omitting the imageRepository setting is that certain patch levels will be blocked.
While upgrading to the most recent patch level is by far the most popular choice, I feel that not supporting lower versions would be patronizing our users.
So we might need to keep adjusting the imageRepository setting, but take the patch levels into account...
Thoughts?

chess-knight commented

I agree we should support lower versions, and I think patching is the better option in this case, but if we want to patch the registry location, we should do it correctly. We could reuse e.g. the cluster-api logic for that. I also found that this PR adds "critical" Kubernetes patch versions as defaults, so at least it will not fail by default as in this case.
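For illustration, a sketch of how such a version-aware decision could be applied to the KubeadmControlPlane manifest instead of hard-coding the repository. It assumes the default_image_registry() helper from the sketch above and the spec.kubeadmConfigSpec.clusterConfiguration.imageRepository field path; it is not the patching code actually used here:

```python
# Illustrative only; assumes default_image_registry() from the earlier sketch.
def set_image_repository(kcp: dict, k8s_version: str) -> dict:
    """Point imageRepository of a KubeadmControlPlane (as a dict) at the
    registry that matches the requested Kubernetes version."""
    cluster_cfg = (
        kcp.setdefault("spec", {})
           .setdefault("kubeadmConfigSpec", {})
           .setdefault("clusterConfiguration", {})
    )
    cluster_cfg["imageRepository"] = default_image_registry(k8s_version)
    return kcp

# Example: for v1.23.14 this would yield
# spec.kubeadmConfigSpec.clusterConfiguration.imageRepository: k8s.gcr.io
```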

garloff commented Jan 31, 2023

Yes, I will replicate the logic from capi -- that is indeed the best source of truth.
#315 was not merged because v1.26 was included. (It works, but is not officially supported.)
I think we can merge #315 now, which makes this issue a bit less critical, but of course we still need to fix it.

garloff added a commit that referenced this issue Feb 1, 2023
So we copy the capi knowledge of whether kube-proxy is at the old location
(k8s.gcr.io) or at the new one (registry.k8s.io).
As the logic became somewhat involved, split it out into a separate
script that can be tested more easily.

Signed-off-by: Kurt Garloff <[email protected]>
garloff commented Feb 1, 2023

Should be fixed by #324.

garloff added a commit that referenced this issue Feb 14, 2023
* Address #321: Finegrained repo location patching.

So we copy the capi knowledge of whether kube-proxy is at the old location
(k8s.gcr.io) or at the new one (registry.k8s.io).
As the logic became somewhat involved, split it out into a separate
script that can be tested more easily.

* Fix parsing v1.xx (without a patchlevel).

* Make k8s version parsing more robust, improve comments.

* We might have v2.1.y (with single-digit minor version) some day, so
  make version parsing robust against it. Thanks @joshmue for pointing
  this out.
* More images, not just kube-proxy might be affected. Thanks
  @chess-knight.
* The registry location fixup is needed for new k8s (~Nov 2022), not
  >1.21 (thanks again @chess-knight).

Signed-off-by: Kurt Garloff <[email protected]>
garloff commented Feb 22, 2023

Addressed by the now-merged #324.
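For illustration, a rough sketch of the more robust version parsing described in that commit (a missing patch level as in v1.xx, single-digit minor versions such as a future v2.1.y); the helper name and regex are assumptions, not the actual script that was split out:

```python
import re

# Illustrative sketch of the version parsing described in #324's commit
# message; not the actual helper script from this repository.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?")

def parse_k8s_version(version: str) -> tuple:
    """Parse 'v1.23.14', 'v1.23' or '1.23' into (major, minor, patch).

    A missing patch level is treated as 0; single- and multi-digit
    major/minor components are both accepted (e.g. a future v2.1.y).
    """
    match = _VERSION_RE.match(version.strip())
    if not match:
        raise ValueError(f"cannot parse Kubernetes version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)

# Examples from the commit message:
assert parse_k8s_version("v1.26") == (1, 26, 0)   # no patch level given
assert parse_k8s_version("v2.1.3") == (2, 1, 3)   # single-digit minor version
```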

garloff closed this as completed Feb 22, 2023
github-project-automation bot moved this from Backlog to Done in Sovereign Cloud Stack Feb 22, 2023
garloff added this to the R4 (v5.0.0) milestone Mar 12, 2023
garloff added a commit that referenced this issue May 1, 2023
* Address #321: Finegrained repo location patching.

garloff added a commit that referenced this issue May 1, 2023
* Address #321: Finegrained repo location patching.