
[Bug]: On environment variable update, the new pod of zot that gets created goes into crashloopbackoff #2733

Open
kiransripada22 opened this issue Oct 21, 2024 · 7 comments
Labels
bug Something isn't working rm-external Roadmap item submitted by non-maintainers

Comments

@kiransripada22

zot version

v2.1.1

Describe the bug

Hi,

We have configured zot as a Kubernetes deployment and added a flux-cd controller to track any changes to this deployment and update the clusters accordingly.
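
For reference, this is roughly the shape of the Deployment we are talking about (the names, image tag, port, and env var below are illustrative placeholders, not our exact manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zot
  template:
    metadata:
      labels:
        app: zot
    spec:
      containers:
        - name: zot
          image: ghcr.io/project-zot/zot-minimal-linux-amd64:v2.1.1
          ports:
            - containerPort: 5000
          env:
            - name: EXAMPLE_SETTING    # changing this value triggers a rolling update
              value: "old"
          volumeMounts:
            - name: zot-storage
              mountPath: /var/lib/registry    # zot storage rootDirectory
      volumes:
        - name: zot-storage
          persistentVolumeClaim:
            claimName: zot-pvc    # the same PVC is attached to both the old and new pods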

We used to run zot v1.4.3, which never had any issue with this rolling update whenever something changed in the zot deployment.

But we started facing issues once we upgraded to zot v2.1.1.

After the upgrade, whenever we update any environment variable, instead of a new pod cleanly replacing the old running pod, the new pod goes into a crash loop and the old pod keeps running.

We have to manually delete the old pod for the crash loop to stop.

We checked the logs, and below is the error we found in the zot container:

{"level":"error","error":"timeout","goroutine":1,"caller":"zotregistry.dev/zot/pkg/cli/server/root.go:76","time":"2024-10-16T11:39:06.856027479Z","message":"failed to init controller"}

Error: timeout

To reproduce

  1. Install the zot image as a deployment in Kubernetes.
  2. Update any environment variable on the deployment (for example, as shown below).
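
For example, something like this is enough to trigger the rollout (assuming the Deployment is named zot in the current namespace):

kubectl set env deployment/zot EXAMPLE_SETTING=new
kubectl get pods -w    # the new pod goes into CrashLoopBackOff while the old pod keeps running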

Expected behavior

A new zot pod should be created that replaces the old pod.

Screenshots

No response

Additional context

No response

@kiransripada22 kiransripada22 added the bug Something isn't working label Oct 21, 2024
@rchincha
Contributor

rchincha commented Oct 21, 2024

@kiransripada22

v1.4.x -> v2.x.x is a major version upgrade path and we don't guarantee backward compatibility in this case.

That said, the best approach would be to set up a v2.x.x zot, set up sync/mirror from v1.4.3, and then do the rolling upgrades thereafter.

@rchincha rchincha added the rm-external Roadmap item submitted by non-maintainers label Oct 21, 2024
@andaaron
Contributor

andaaron commented Oct 21, 2024

@kiransripada22, do you have anything specific in the configuration which is shared between the zot instances? Maybe shared storage? Are you using zot or zot-minimal? Do you have any specific extensions enabled? Do you use authentication?

@kiransripada22
Author

@rchincha Sorry if I was not clear, but I am facing this issue with a fresh installation of zot v2.1.1. It is a completely new installation of zot v2.1.1, and in that cluster, when we do a rolling update, the new pod fails to init the controller.

@andaaron

  1. I am using zot-minimal.
  2. I have auth enabled with htpasswd.
  3. I have sync enabled with another container registry.
  4. This is a rolling update scenario, so I think the storage is the same (shared by the old and new pods).

Note: I also found that the update works when we use the Recreate deployment strategy in Kubernetes (see the snippet below), but the scenario we are using needs a rolling update.
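
For reference, this is the strategy change that works for us; Recreate terminates the old pod before starting the new one, so two zot instances never run at the same time against the same storage:

spec:
  strategy:
    type: Recreate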

@rchincha
Contributor

@kiransripada22

Wondering if you need this: #2730

@kiransripada22
Author

@rchincha I think that may not fully fix it, because if we delete the existing pod first, the controller has no issue initialising. So this could be a resource availability issue during the rolling update.

@eusebiu-constantin-petu-dbk
Collaborator

eusebiu-constantin-petu-dbk commented Oct 31, 2024

Hi, I got the same error message when I try to start two zot instances with the same "meta.db".

{"level":"error","error":"timeout","goroutine":1,"caller":"zotregistry.dev/zot/pkg/cli/server/root.go:76","time":"2024-10-31T19:37:11.158557001+02:00","message":"failed to init controller"}
petu@DESKTOP-1JC5Q85:~/zot/zot$ ls /tmp/zot
_trivy  cache.db  meta.db
petu@DESKTOP-1JC5Q85:~/zot/zot$ 
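
A minimal sketch of that reproduction (the paths and ports are just examples; both configs point at the same rootDirectory and therefore the same meta.db):

config-a.json:
{
  "storage": { "rootDirectory": "/tmp/zot" },
  "http": { "address": "127.0.0.1", "port": "8080" },
  "log": { "level": "debug" }
}

config-b.json is identical except for "port": "8081".

zot serve config-a.json &    # first instance starts and holds the lock on /tmp/zot/meta.db
zot serve config-b.json      # second instance fails with the "failed to init controller" timeout above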

@rchincha
Contributor

rchincha commented Nov 7, 2024

The issue is that, to achieve "continuous" uptime, the two instances would need to share a single db with mutually exclusive access?
We need to think about this.
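
Until that is sorted out, one possible interim workaround (just a sketch, not something verified in this thread) is to keep RollingUpdate but force the old pod to terminate before the new one starts, so the two instances never hold meta.db at the same time:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0          # never create a new pod while the old one still exists
      maxUnavailable: 1    # terminate the old pod first

With a single replica this still means a short gap in availability, the same as Recreate, so it does not answer the "continuous" uptime question above.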
