-
Notifications
You must be signed in to change notification settings - Fork 873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add auto-detection of container information to resource if cgroupv2 is used #6694
Comments
JavaScript PR for reference: open-telemetry/opentelemetry-js-contrib#1181 |
I think this issue belongs in the instrumentation repo now |
@mateuszrzeszutek the code that needs to be changed for that is living in |
It is, but that's the deprecated copy -- we've recently decided to move all resource providers to this repo (cause they kind of are instrumentations), and deprecate the old ones in otel-java. Here's the same code from this repo: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/resources/library/src/main/java/io/opentelemetry/instrumentation/resources/ContainerResource.java |
Ah, ok, that's good to know! |
Refactor class `ContainerResource` so that its methods return `Optional`s instead of `null`. This makes for safer code that doesn't rely on `null` to signal the absence of result. Moreover, `Optional.map()` and `Optional.orElseGet()` use the same functional style as `Stream`, which is well readable. Issue-Id: open-telemetry#6694 Issue-Id: open-telemetry/opentelemetry-java#2337
Several test cases in `ContainerResourceTest` have the same outcome, so instead of duplicating the assertions inside each test method, take advantage of `@ParameterizedTest` to inject inputs and expected outputs multiple times into the same test. This improves test code readability by reducing test method length and making identical cases more obvious. Also rename test methods to better express the tested code's expected behaviour. Issue-Id: open-telemetry#6694 Issue-Id: open-telemetry/opentelemetry-java#2337
Refactor class `ContainerResource` so that its methods return `Optional`s instead of `null`. This makes for safer code that doesn't rely on `null` to signal the absence of result. Moreover, `Optional.map()` and `Optional.orElseGet()` use the same functional style as `Stream`, which is well readable. Issue-Id: open-telemetry#6694 Issue-Id: open-telemetry/opentelemetry-java#2337
Several test cases in `ContainerResourceTest` have the same outcome, so instead of duplicating the assertions inside each test method, take advantage of `@ParameterizedTest` to inject inputs and expected outputs multiple times into the same test. This improves test code readability by reducing test method length and making identical cases more obvious. Also rename test methods to better express the tested code's expected behaviour. Issue-Id: open-telemetry#6694 Issue-Id: open-telemetry/opentelemetry-java#2337
Several test cases in `ContainerResourceTest` have the same outcome, so instead of duplicating the assertions inside each test method, take advantage of `@ParameterizedTest` to inject inputs and expected outputs multiple times into the same test. This improves test code readability by reducing test method length and making identical cases more obvious. Also rename test methods to better express the tested code's expected behaviour. Issue-Id: open-telemetry#6694 Issue-Id: open-telemetry/opentelemetry-java#2337
…e` to avoid using null (#6889) While I was looking at issues #6694 and open-telemetry/opentelemetry-java#2337, I saw that the code in `io.opentelemetry.instrumentation.resources.ContainerResource` used `null` several times as return value which isn't safe. Nowadays, `Optional` is better suited to signal the absence of a result, so I refactored `ContainerResource` to use `Optional`s instead of null. On the way, I also refactored this class's unit tests into parameterised tests to reduce test code duplication. These improvements should help implementing a solution to #6694. Co-authored-by: Trask Stalnaker <[email protected]>
…e` to avoid using null (open-telemetry#6889) While I was looking at issues open-telemetry#6694 and open-telemetry/opentelemetry-java#2337, I saw that the code in `io.opentelemetry.instrumentation.resources.ContainerResource` used `null` several times as return value which isn't safe. Nowadays, `Optional` is better suited to signal the absence of a result, so I refactored `ContainerResource` to use `Optional`s instead of null. On the way, I also refactored this class's unit tests into parameterised tests to reduce test code duplication. These improvements should help implementing a solution to open-telemetry#6694. Co-authored-by: Trask Stalnaker <[email protected]>
…e` to avoid using null (open-telemetry#6889) While I was looking at issues open-telemetry#6694 and open-telemetry/opentelemetry-java#2337, I saw that the code in `io.opentelemetry.instrumentation.resources.ContainerResource` used `null` several times as return value which isn't safe. Nowadays, `Optional` is better suited to signal the absence of a result, so I refactored `ContainerResource` to use `Optional`s instead of null. On the way, I also refactored this class's unit tests into parameterised tests to reduce test code duplication. These improvements should help implementing a solution to open-telemetry#6694. Co-authored-by: Trask Stalnaker <[email protected]>
…e` to avoid using null (open-telemetry#6889) While I was looking at issues open-telemetry#6694 and open-telemetry/opentelemetry-java#2337, I saw that the code in `io.opentelemetry.instrumentation.resources.ContainerResource` used `null` several times as return value which isn't safe. Nowadays, `Optional` is better suited to signal the absence of a result, so I refactored `ContainerResource` to use `Optional`s instead of null. On the way, I also refactored this class's unit tests into parameterised tests to reduce test code duplication. These improvements should help implementing a solution to open-telemetry#6694. Co-authored-by: Trask Stalnaker <[email protected]>
This resolves #6694. We've been tracking the update to cgroup version support and want to get ahead of the widespread usage. The surface of the existing `ContainerResource` has not changed, but its internals have been factored out to two "extractor" utilities -- one that understands cgroup v1 and another for v2. v1 is attempted and, if successful, the result is used. If v1 fails, then the `ContainerResource` will fall back to v2. As mentioned in #6694, the approach taken in this PR is borrowed from [this SO post](https://stackoverflow.com/questions/68816329/how-to-get-docker-container-id-from-within-the-container-with-cgroup-v2) combined with local experimentation on docker desktop on a Mac, which already uses cgroup2 v2.
thanks @breedx-splk for fixing this :-) |
…e` to avoid using null (open-telemetry#6889) While I was looking at issues open-telemetry#6694 and open-telemetry/opentelemetry-java#2337, I saw that the code in `io.opentelemetry.instrumentation.resources.ContainerResource` used `null` several times as return value which isn't safe. Nowadays, `Optional` is better suited to signal the absence of a result, so I refactored `ContainerResource` to use `Optional`s instead of null. On the way, I also refactored this class's unit tests into parameterised tests to reduce test code duplication. These improvements should help implementing a solution to open-telemetry#6694. Co-authored-by: Trask Stalnaker <[email protected]>
@svrnm I wanted to highlight that this approach will not work for k8s as the container Id in k8s pod spec is different from the one we are extracting from proc/self/mouninfo and hence any correlation with k8s pod will break. |
@Sprakhar97 Can you link to the part of the spec that you're referring to? Are you saying that the parsing will still work, but the ID is semantically something different? Any thoughts on how to fix this? Do you know of a better mechanism for finding the current pod id in a compatible (correlatable) way? |
I assume this comment from @biswajit-nanda on #8462 is related. @XSAM reported the same issue a while back for go, this is not unique to java. Overall, there is a major issue with container id detection, since none of the approaches taken is reliable, see also containerd/containerd#8185 |
Yes, this could happen to every language's implementation, as the id in the cgroup v2 file may not be the current container's id, it could be the k8s busybox container's id. |
Follow-up to open-telemetry/opentelemetry-java#3308 & related ticket with js: open-telemetry/opentelemetry-js-contrib#1173
Is your feature request related to a problem? Please describe.
Detection of the
container.id
inContainerResource.java
currently works by reading from/proc/self/cgroup
. That does not work when cgroupv2 is turned on. However, based on these two sources there's another place to read from:In short, the
container.id
could be read from /proc/self/mountinfo as well. This also works forcontainerd
and not only docker.Eventually it would be good to have a standard for that, but unfortunately this is a long-standing non-moving OCI issue, until then a combination of both approaches seems to be reliable enough.
Describe the solution you'd like
Add code that reads the container id from
/proc/self/mountinfo
if reading from/proc/self/cgroup
failed.Describe alternatives you've considered
As stated in the description, eventually it would be great to have this standardized, but until then this is the best way to go.
cc: @lo-jason , @PeterF778
The text was updated successfully, but these errors were encountered: