This does not sound like a node_exporter issue, but a Kubernetes issue.
There are no "detected prerequisites" in the node_exporter. Every scrape is dynamic.
The node_exporter logs any errors that are returned; the only way for it to exit without logging an error is for the web server function to return without an error, usually because of a SIGTERM or SIGKILL.
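One way to check this from the Kubernetes side (a rough sketch, assuming kubectl access to the cluster; the pod name and namespace are placeholders) is to look at the previous container's logs and the termination state recorded after a crash:
# Logs from the previous (crashed) container instance
kubectl -n <namespace> logs <node-exporter-pod> --previous
# Exit code, signal, and reason recorded for the last termination
kubectl -n <namespace> get pod <node-exporter-pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
If the process was killed, the terminated state should show a signal or a non-zero exit code rather than a clean exit.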
IMO, I don't think it's a k8s issue, as the other pods are running without any problem. The issue is seen only when we reboot the node and NE tries to come up: NE moves into the CrashLoopBackOff state and stays there until we edit the DaemonSet/pod spec. The moment I edit the spec (even a minimal change, like reducing the probe time, just to make the pod restart), NE comes up. So I suspect that after the sandbox is created, the container just exits while coming up, possibly because it hits some condition that is not handled. I am mounting the volumes as below:
Also, I have a query in this context: when it crashes, do we get any trace in the log? Let me know if there is any information I am missing that I should provide.
Host operating system: output of uname -a
Linux my-cluster 5.14.21-150500.55.44-default #1 SMP PREEMPT_DYNAMIC Mon Jan 15 10:03:40 UTC 2024 (cc7d8b6) x86_64 x86_64 x86_64 GNU/Linux
node_exporter version: output of node_exporter --version
node_exporter, version 1.7.0 (branch: HEAD, revision: 7333465)
node_exporter command line flags
node_exporter log output
ts=2024-02-22T06:26:53.719Z caller=node_exporter.go:192 level=info msg="Starting node_exporter" version="(version=1.7.0, branch=HEAD, revision=7333465abf9efba81876303bb57e6fadb946041b)"
ts=2024-02-22T06:26:53.719Z caller=node_exporter.go:193 level=info msg="Build context" build_context="(go=go1.21.6, platform=linux/amd64, user=root@f86e9674f8f3, date=20240219-04:42:07, tags=netgo osusergo static_build)"
ts=2024-02-22T06:26:53.719Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+|var/lib/containers/storage/.+)($|/)
ts=2024-02-22T06:26:53.719Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
ts=2024-02-22T06:26:53.720Z caller=diskstats_common.go:111 level=info collector=diskstats msg="Parsed flag --collector.diskstats.device-exclude" flag=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$
ts=2024-02-22T06:26:53.720Z caller=diskstats_linux.go:265 level=error collector=diskstats msg="Failed to open directory, disabling udev device properties" path=/run/udev/data
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:110 level=info msg="Enabled collectors"
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=cpu
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=diskstats
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=filesystem
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=loadavg
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=meminfo
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=mountstats
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=netclass
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=netdev
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=textfile
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=timex
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=uname
ts=2024-02-22T06:26:53.720Z caller=node_exporter.go:117 level=info collector=xfs
ts=2024-02-22T06:26:53.721Z caller=tls_config.go:274 level=info msg="Listening on" address=:9100
ts=2024-02-22T06:26:53.721Z caller=tls_config.go:310 level=info msg="TLS is enabled." http2=true address=:9100
Are you running node_exporter in Docker?
Running as a pod in the Kubernetes cluster.
What did you do that produced an error?
Rebooted the node; when the node-exporter pod comes up, it silently exits after printing the last line included in the log above.
What did you expect to see?
I suspect that sometimes the host volumes mounted into the pod sandbox are not yet ready when node_exporter is started. However, node_exporter is expected to detect any missing prerequisites and complain about them, or at least log an error or a warning, which is not currently the case.
What did you see instead?
NE exits without leaving any trace of why it exited.