Adding fix for host.name field update based on k8s.node.name #37
Conversation
@lahsivjar @AlexanderWert We first thought that this would already be handled, but the issue is that the processor only enriches traces, meaning that we cannot use it in the metrics/logs OTLP pipelines. This PR instead uses the following `resource` processor configuration:

```yaml
resource/hostname:
  attributes:
    - key: host.name
      from_attribute: k8s.node.name
      action: upsert
```
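For context, here is a minimal sketch of how such a processor could be wired into metrics and logs pipelines. This is an illustration only, assuming a plain OTLP-in setup; the receiver and exporter names are placeholders, not this repo's actual configuration:

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

processors:
  k8sattributes: {}
  resource/hostname:
    attributes:
      - key: host.name
        from_attribute: k8s.node.name
        action: upsert

exporters:
  debug: {}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      # resource/hostname runs after k8sattributes so that
      # k8s.node.name exists before it is copied into host.name
      processors: [k8sattributes, resource/hostname]
      exporters: [debug]
    logs:
      receivers: [otlp]
      processors: [k8sattributes, resource/hostname]
      exporters: [debug]
```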
Is this something the `k8sattributes` processor should do by default? If so, can we propose these changes upstream?
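If this did move upstream, it would presumably build on the node metadata the processor already extracts. For illustration only, a sketch of the (assumed) relevant extraction config:

```yaml
# k8sattributes already resolves the node; the open question is whether
# it should also upsert host.name from k8s.node.name by default.
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.node.name
        - k8s.pod.name
        - k8s.pod.uid
```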
That could be an option, similar to the `override` functionality in the resource detection processor (which, e.g., overrides the `host.name` value with the cloud instance ID). But I have two concerns:

- Upstream agreement with `host.name == k8s.node.name`: it looks like a custom workaround to fix the Infrastructure UI relationships. From the container's perspective, it has its own `host.name` value, which is actually the one used within the cluster for communication.
- OTLP data that already contains `k8s.*` resource attributes won't be processed by the `k8sattributes` processor, so it would not get this fix.
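For reference, a minimal sketch of the `override` behavior mentioned above, assuming the standard `resourcedetection` processor (the `ec2` detector is just an example):

```yaml
# With override: true (the default), detected values such as the EC2
# instance hostname replace whatever host.name the SDK reported.
processors:
  resourcedetection:
    detectors: [ec2]
    override: true
```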
> Upstream agreement with `host.name == k8s.node.name`: it looks like a custom workaround to fix the Infrastructure UI relationships. From the container's perspective, it has its own `host.name` value, which is actually the one used within the cluster for communication.

IMHO, that is a symptom of `host.name` being underspecified in SemConv. In particular, there is no clear statement about the semantics of `host.name` in containerized environments. The spec says "... On Unix systems, it may contain what the hostname command returns, or the fully qualified hostname ...". For containerized systems, that means SDKs will report the value that, from the perspective inside a container, is by default the pod name. I'm not sure whether that behavior is really intended by SemConv or just not thought through well enough. The consequence is that `host.name` is not a reliable attribute. So I don't think this is an Elastic-specific challenge.
I guess we need to understand all the scenarios of how `host.name` is populated across different k8s setups. I created an internal issue to summarise all of the above.
> The consequence is that `host.name` is not a reliable attribute. So I don't think this is an Elastic-specific challenge.

I agree. That's why I'm skeptical that we should fix it with Elastic-specific configuration. Maybe we should embrace that the data could be messy and adjust the UI accordingly, for example by excluding metrics in the hosts view that have a `kubernetes.pod.uid`.

I'm fine with this as a short-term workaround, but I feel like we're adding a lot of fixes to the configuration that aren't, or shouldn't be, Elastic-specific.
> Maybe we should embrace that the data could be messy and adjust the UI accordingly, for example by excluding metrics in the hosts view that have a `kubernetes.pod.uid`.

+1. Or, as @lahsivjar was suggesting, change the UI logic to rely on `k8s.node.name` and then fall back to `host.name`. A collector-side sketch of that precedence follows below.
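The suggestion above is about UI logic, but for illustration the same precedence could be sketched collector-side with the transform processor. This is a hypothetical sketch, not part of this PR, and the name `transform/host_fallback` is made up:

```yaml
# Prefer k8s.node.name for host.name where it exists; otherwise the
# original host.name is left untouched as the fallback.
processors:
  transform/host_fallback:
    metric_statements:
      - context: resource
        statements:
          - set(attributes["host.name"], attributes["k8s.node.name"]) where attributes["k8s.node.name"] != nil
```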
Adds the needed `resource/hostname` processor that will update the `host.name` field based on `k8s.node.name`.

Before:
After:
Relates:
Closes: https://github.com/elastic/opentelemetry-dev/issues/516
Found in: https://github.com/elastic/opentelemetry-dev/issues/559