Deployment daemonset failed due to failed to create "memory_limiter" processor, permission denied #543
Comments
@wadexu007 it appears your collector doesn't have permission to see how much memory it has been allocated. I've not encountered this error before. Is there anything else in your environment you can share that would cause a container to have restricted permissions? @dmitryax have you seen this error before? I don't see any known issues with the collector not being able to access this information, but if we can find the root cause of the permission issue, and the restrictions are valid, then we may need to consider reverting to hard-coded values by default. cc @puckpuck
Please share details about the K8s environment this was deployed to.
@puckpuck
@wadexu007 does the issue happen on k8s 1.23+?
One of my colleagues mentioned a similar issue. We are running 1.23.8. I didn't dig into this further, but the resolution for now was to use limit_mib and spike_limit_mib.
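For reference, a minimal sketch of that fixed-value workaround as a Helm values override. It assumes the chart passes the `config` key through to the collector configuration. The numbers are illustrative assumptions, not recommendations, and should be sized against the pod's actual memory limit.

```yaml
# Values override (sketch): fixed memory_limiter sizes instead of percentages.
config:
  processors:
    memory_limiter:
      check_interval: 5s     # how often memory usage is checked
      limit_mib: 400         # hard limit in MiB (illustrative value)
      spike_limit_mib: 100   # expected spike headroom in MiB (illustrative value)
```

With fixed MiB values the processor does not need to look up the total memory available to the container, which is the lookup reported as failing in this thread.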
The permission issue could lie in the node's operating system configuration. GKE's host nodes run a version of Container-Optimized OS, which comes with a lot of security features by default.
I just ran into the same problem on a Gardener cluster with k8s 1.23.13. I assume that the hostmetrics preset should mount the host's root folder into a custom directory and then configure the @TylerHelmuth If you think that is going in the right direction, I could give that a try and work on a PR.
I changed the folder name where the root filesystem for the hostMetrics gets mounted to
When disabling the memoryLimiter, an extension seems to run into the same problem:
@povilasv did you run into this issue when updating the hostmetrics preset recently?
It seems the switch from chart version 0.39.2 to 0.39.3 of the opentelemetry-collector chart is breaking it. So the change by @puckpuck to use percentage-based limiting is probably what surfaces the symptoms: #513
@a-thaler the problem is definitely caused by trying to use percentages, because that tells the memory limiter to do some lookups, and it's the lookups that are failing. I am surprised that it is not working as expected though, I get no errors locally in For now the workaround is to configure the memory_limiter processor in
I'm confused why this only happens when the
Nope, we had this error prior to the hostMetrics change, and we didn't change any mounting logic in that PR, we only removed some env vars -> https://github.com/open-telemetry/opentelemetry-helm-charts/pull/549/files#diff-d3c8687b50b2f7b2ca10ff878367b16b76ead0cfdf62548091b5bcc507dc2d68 Maybe those env vars had some impact?
@TylerHelmuth I believe you are on cgroup v2 (not v1), which does not have this parsing of mounts and should have no problems. Could you check?
The workaround of configuring fixed sizes for the memory limiter and the ballast extension works fine. The location of the cgroup v2 files is recorded under /sys/fs/cgroup, and the limiter reads it from there. After the re-mount of the host's root filesystem, does the original path taken from that file now become a soft link that the limiter has no permission to read?
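The ballast half of that workaround might look like the following values override. This is a sketch that assumes the `memory_ballast` extension is enabled through the chart's `config` key. The size is an illustrative assumption. The memory_limiter half takes the same shape, using `limit_mib` and `spike_limit_mib`.

```yaml
# Values override (sketch): fixed ballast size instead of a percentage.
config:
  extensions:
    memory_ballast:
      size_mib: 160   # fixed ballast size in MiB (illustrative value)
```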
@povilasv you're right, I've got cgroup v2.
Got the same issue in our k8s deployments, not sure if it's cgroups v1 or v2.
Also used the static config of memory_limiter with limit_mib but got the error like below
we are running on
It seems that the bug is in the
I use the same config and deployed the collector as a StatefulSet, and this error doesn't happen, but it does throw this error when running as a DaemonSet.
Please assign this to me for now, there is some work / discussion going on in open-telemetry/opentelemetry-collector#6825 :)
The newest release should have the fix, so this should be solved with #655
Maybe someone can test the newest chart and check if the issue is gone? @wadexu007 or @a-thaler maybe?
@povilasv will test it today and will provide feedback
@povilasv I can confirm that the issue is solved. With chart version 0.49.1 the collector starts properly using the hostMetrics preset in combination with percentage-based memory_limiter settings.
Awesome, thanks for verifying. Closing the issue :)
Issue Summary:
The DaemonSet deployment fails because the "memory_limiter" processor cannot be created (permission denied) when hostMetrics is enabled.
On env:
Steps to reproduce:
helm install otel-collector open-telemetry/opentelemetry-collector -f test.yaml
CrashLoopBackOff
The workaround is to set a fixed value for memory_limiter:
limit_mib
Can you help take a look at this issue?
Thanks