When deploying a nocloud cluster (in my case on a Harvester cluster), the network platform config controller panics repeatedly while attempting to read the network-config from a cidata partition, even though the partition contains the correct file.
Description
I have not had the opportunity to recompile Talos with more debugging enabled, but by all appearances #9351 not only added a requirement for cidata/network-config but also fails to find the file even when it is present, at least in my case. I mounted my cloud-init partition on a different Linux host and verified that the necessary files are there and valid (Ubuntu is happy to parse and use them, for example):
root@test:~# mount /dev/vdb /mnt
mount: /mnt: WARNING: source write-protected, mounted read-only.
root@test:~# cd /mnt
root@test:/mnt# ls
meta-data network-config user-data
root@test:/mnt# cat network-config
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
In truth, I have no interest in using the network-config part of cloud-init; I only want to use the user-data, which contains my network config in my machine template.
I should also note that this doesn't seem to prevent the cluster from becoming usable. After bootstrapping, everything works perfectly fine, but the logs are spammed with this failure.
Logs
user: warning: [2024-10-27T02:11:58.936377322Z]: [talos] waiting for devices to be ready...
user: warning: [2024-10-27T02:11:58.979707322Z]: [talos] found config disk (cidata) at /dev/vdb
kern: debug: [2024-10-27T02:11:58.981563322Z]: ISO 9660 Extensions: Microsoft Joliet Level 3
kern: debug: [2024-10-27T02:11:58.982009322Z]: ISO 9660 Extensions: RRIP_1991A
user: warning: [2024-10-27T02:11:58.982071322Z]: [talos] fetching meta config from: cidata/meta-data
user: warning: [2024-10-27T02:11:58.983794322Z]: [talos] fetching network config from: cidata/network-config
user: warning: [2024-10-27T02:11:58.985538322Z]: [talos] failed to read network-config
user: warning: [2024-10-27T02:11:58.986687322Z]: [talos] fetching machine config from: cidata/user-data
kern: info: [2024-10-27T02:11:58.989573322Z]: init[2139]: segfault at 0 ip 0000000000f36d8a sp 000000c000b67c18 error 4 in init[400000+2837000] likely on CPU 3 (core 3, socket 0)
kern: info: [2024-10-27T02:11:58.992675322Z]: Code: 0f 10 44 24 70 41 0f 11 40 10 48 89 f0 48 8b 8c 24 d0 00 00 00 48 8b 54 24 40 48 8b 9c 24 80 00 00 00 4c 8b 84 24 b8 00 00 00 <4d> 8b 08 49 83 f9 01 0f 84 a2 00 00 00 49 83 f9 02 75 42 4c 89 c3
user: warning: [2024-10-27T02:11:58.996799322Z]: [talos] platform panicked {"component": "controller-runtime", "controller": "network.PlatformConfigController", "stack": "github.com/siderolabs/talos/internal/app/machined/pkg/controllers/network.(*PlatformConfigController).runWithPanicHandler.func1\n\t/src/internal/app/machined/pkg/controllers/network/platform_config.go:564\nruntime.gopanic\n\t/toolchain/go/src/runtime/panic.go:770\nruntime.panicmem\n\t/toolchain/go/src/runtime/panic.go:261\nruntime.sigpanic\n\t/toolchain/go/src/runtime/signal_unix.go:881\ngithub.com/siderolabs/talos/internal/app/machined/pkg/runtime/v1alpha1/platform/nocloud(*Nocloud).ParseMetadata\n\t/src/internal/app/machined/pkg/runtime/v1alpha1/platform/nocloud/nocloud.go:54\ngithub.com/siderolabs/talos/internal/app/machined/pkg/runtime/v1alpha1/platform/nocloud(*Nocloud).NetworkConfiguration\n\t/src/internal/app/machined/pkg/runtime/v1alpha1/platform/nocloud/nocloud.go:136\ngithub.com/siderolabs/talos/internal/app/machined/pkg/contr...
user: warning: [2024-10-27T02:11:59.012959322Z]: [talos] restarting platform network config {"component": "controller-runtime", "controller": "network.PlatformConfigController", "interval": "1m0.8004681s", "error": "panic: runtime error: invalid memory address or nil pointer dereference"}
user: warning: [2024-10-27T02:11:59.912971322Z]: [talos] controller failed {"component": "controller-runtime", "controller": "k8s.KubeletStaticPodController", "error": "error refreshing pod status: error fetching pod status: an error on the server (\"Authorization error (user=apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)\") has prevented the request from succeeding"}
Environment
Talos version: v1.8.1
Kubernetes version: v1.30.5
Platform: Harvester
The bug was logical: first the check was done for one of the values to
be non-nil, and after that one of the values was assumed to be non-nil,
while it could have been nil.
While fixing that, linter figured out that raw metadata config is never
needed outside of `acquireConfig`, so this got dropped as well,
simplifying the code even more.
Fixes siderolabs#9578
Signed-off-by: Andrey Smirnov <[email protected]>
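The logic error described in the commit message can be illustrated with a minimal Go sketch. This is not the actual Talos code: the types `metadataConfig` and `networkConfig` and the functions `parseBuggy` and `parseFixed` are hypothetical names chosen for illustration. It assumes only what the message states, namely that a guard passed when at least one value was non-nil while the code after it dereferenced a value that could still be nil.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the parsed cidata/meta-data and
// cidata/network-config documents; not the real Talos types.
type metadataConfig struct{ Hostname string }
type networkConfig struct{ Version int }

// parseBuggy mirrors the flawed pattern: the guard only rejects the case
// where BOTH inputs are nil, but the body dereferences both of them.
func parseBuggy(meta *metadataConfig, netCfg *networkConfig) (string, error) {
	if meta == nil && netCfg == nil {
		return "", errors.New("no config available")
	}
	// Panics with a nil pointer dereference if exactly one input is nil.
	return fmt.Sprintf("%s/v%d", meta.Hostname, netCfg.Version), nil
}

// parseFixed checks each pointer individually before using it.
func parseFixed(meta *metadataConfig, netCfg *networkConfig) (string, error) {
	if meta == nil && netCfg == nil {
		return "", errors.New("no config available")
	}
	host := ""
	if meta != nil {
		host = meta.Hostname
	}
	version := 0
	if netCfg != nil {
		version = netCfg.Version
	}
	return fmt.Sprintf("%s/v%d", host, version), nil
}

func main() {
	// meta-data is present, network-config is missing (nil).
	out, err := parseFixed(&metadataConfig{Hostname: "node-1"}, nil)
	fmt.Println(out, err)
}
```

Calling `parseBuggy` with the same arguments would panic with "invalid memory address or nil pointer dereference", which is the error string reported in the logs above.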
smira added a commit to smira/talos that referenced this issue on Oct 28, 2024
The bug was logical: first the check was done for one of the values to
be non-nil, and after that one of the values was assumed to be non-nil,
while it could have been nil.
While fixing that, linter figured out that raw metadata config is never
needed outside of `acquireConfig`, so this got dropped as well,
simplifying the code even more.
Fixes siderolabs#9578
Signed-off-by: Andrey Smirnov <[email protected]>
(cherry picked from commit 3a0a17a)