After upgrading k3s, continuous "controller.go:135] error syncing 'system-upgrade/k3s-agent'" messages #72
Comments
@hlugt is it always those same 3 errors? It looks like there might be an underlying networking issue from the perspective of the SUC.
Yes, it is the same 3 errors. Maybe the (original) install/startup parameters of K3S are relevant to note? Any advice on what and how to check for these possible networking issues? (Considering deleting the SUC pod... but then I will not be able to troubleshoot any further, I guess.)
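A minimal way to check both DNS and egress from inside the cluster is a throwaway pod in the SUC's namespace; the pod name and busybox tag below are illustrative, and 10.43.0.10 is the cluster DNS address that appears in the errors:

```sh
# Launch a disposable shell in the system-upgrade namespace
kubectl -n system-upgrade run net-debug --rm -it --restart=Never \
  --image=busybox:1.31 -- sh

# From inside the pod: can cluster DNS resolve the channel host?
nslookup update.k3s.io 10.43.0.10

# And is the channel URL reachable at all? (--spider fetches headers only)
wget -q --spider https://update.k3s.io/v1-release/channels/latest && echo reachable
```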
@hlugt these two (related?) errors look like the underlying cause to me:

"... Get https://update.k3s.io/v1-release/channels/latest: dial tcp: lookup update.k3s.io on 10.43.0.10:53: server misbehaving, requeuing"
"... Get https://update.k3s.io/v1-release/channels/latest: dial tcp 10.0.0.1:443: connect: network is unreachable, requeuing"
Shoot, I wonder if you are running into k3s-io/k3s#1719? Is the coredns pod running on a different host than the SUC?
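One quick way to check that (coredns in k3s carries the k8s-app=kube-dns label) is to compare node assignments:

```sh
# Which node is coredns on, and which node is the SUC pod on?
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
kubectl -n system-upgrade get pods -o wide
```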
No, they are both on the master.
Yes, it would have been best to get into the SUC controller pod to try the network access. From another pod (that had curl installed) I was able to access the URL... I'm afraid, though, that I have had to rebuild my cluster (and forgot to stick to 1.17.4 so as to be able to upgrade again) due to a messed-up Rancher UI install stuck waiting on the API when trying to import the cluster. For now I can imagine you want to put this on hold, or even have me close it?
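For reference, the kind of check described here can be run without attaching to the SUC pod itself; `<pod-with-curl>` stands in for whichever pod happens to have curl available:

```sh
# Fetch only the response headers of the channel URL from another pod's network namespace
kubectl exec -it <pod-with-curl> -- curl -sI https://update.k3s.io/v1-release/channels/latest
```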
We can leave it open for now.
Ok, reverted to 1.17.4 and minimized the number of pods. Installed the SUC and now the errors seem to stay away. Will check in a few hours to see whether the errors return. Regards. (BTW: I think the above-mentioned Rancher UI API error is due to self-signed certs. It gave no issues when trying the secure k3s cluster import step, but curl reports that I probably do need the insecure one...)
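The self-signed-cert behaviour described here is easy to confirm with curl; `<master-ip>` is a placeholder and the exact error text may vary:

```sh
# Verification fails against k3s's self-signed API certificate
curl https://<master-ip>:6443/
# curl: (60) SSL certificate problem: self signed certificate

# -k / --insecure skips verification, matching the "insecure" import option
curl -k https://<master-ip>:6443/
```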
Yep: all is well, no more flooding log messages. |
Version
rancher/system-upgrade-controller:v0.5.0
rancher/kubectl:v1.18.2
on
rancher/k3s: v1.18.2-k3s1
Platform/Architecture
arm64 (pine64: rock64pro master, rock64 nodes)
Describe the bug
After a successful upgrade run, the upgrade agent pods are deleted, but the upgrade controller running on the master keeps complaining with error syncing messages:
"1 controller.go:135] error syncing 'system-upgrade/k3s-agent': handler system-upgrade-controller: Operation cannot be fulfilled on "system-upgrade-controller": delaying object set, requeuing"
"1 controller.go:135] error syncing 'system-upgrade/k3s-server': handler system-upgrade-controller: Get https://update.k3s.io/v1-release/channels/latest: dial tcp: lookup update.k3s.io on 10.43.0.10:53: server misbehaving, requeuing" (NB: 10.43.0.10 is actual kube-dns pod address)
"1 controller.go:135] error syncing 'system-upgrade/k3s-agent': handler system-upgrade-controller: Get https://update.k3s.io/v1-release/channels/latest: dial tcp 10.0.0.1:443: connect: network is unreachable, requeuing"
To Reproduce
Have not tried to reproduce, as this would most likely mean downgrading the k3s cluster and redoing the upgrade.
Expected behavior
Would expect only messages polling the channel for upgrades or plan changes.
Actual behavior
The upgrade controller started the upgrade with concurrency 1 on the agents. Observed the expected behaviour: containers downloading and executing, the master/server upgrading, then switching to the agents, draining and upgrading each. All jobs restarted successfully. All seems well, except for this continuous flood of error syncing messages in the log.
Additional context
Added the controller deployment and plan YAMLs.
1.system-upgrade-controller.txt
2.k3s_upgrade-plan.txt
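The attached files are not reproduced above; as a rough sketch of the shape involved, a minimal agent plan for the controller looks like this (values adapted from the system-upgrade-controller examples, not the reporter's exact files):

```sh
kubectl apply -f - <<'EOF'
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-agent
  namespace: system-upgrade
spec:
  concurrency: 1                # one agent at a time, as described above
  channel: https://update.k3s.io/v1-release/channels/latest
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
      - {key: k3s-upgrade, operator: Exists}   # illustrative label gate
  drain:
    force: true
  upgrade:
    image: rancher/k3s-upgrade
EOF
```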