-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receiver_creator/k8s] Add support for generating receivers based on provided annotations' hints #34427
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Pinging code owners for extension/observer/k8sobserver: @rmfitzpatrick @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself. |
That's a really interesting idea and I reckon it would be aligned with the auto instrumentation approach in OpenTelemetry. The K8s Operator is already available to inject auto-instrumentation for some SDKs, maybe we could reuse some of the annotations naming used there: annotations:
instrumentation.opentelemetry.io/inject-collector-receiver: redis/1
instrumentation.opentelemetry.io/inject-collector-receiver-config: '`endpoint`:6379' As I understood, the annotations will be on Pod spec level, if a Pod contains more than one container to be instrumented, how could we differentiate their different receiver's configurations? What do you think if we follow a similar approach as Metricbeat but the user needs to append a suffix to the annotation key? annotations:
instrumentation.opentelemetry.io/inject-collector-receiver: redis/1
instrumentation.opentelemetry.io/inject-collector-receiver-config: '`endpoint`:6379'
instrumentation.opentelemetry.io/inject-collector-receiver/2: redis/2
instrumentation.opentelemetry.io/inject-collector-receiver-config/2: '`endpoint`:1234' Sets of annotations can be provided with numeric or text prefixes. If there are annotations that don’t have a suffix, then they get grouped together into a single configuration. I think having a common prefix would ease the annotations detection and parsing. |
There is already an issue suggested this feature: #17418.
@rogercoll I would expect this to be possible if we somehow handle the hints per container/port like Update: For covering this for Logs we will need #35491 |
Hey @dmitryax , following our offline discussion I have put together the part for how the hints' API would look like. It's the I would suggest we start with the metrics use-case. |
@rogercoll @dmitryax I tried my hand at implementing the proposal for metrics: #35617 (logs can be added in a different Pr since it's anyways blocked on #35544). That's more or less how I imagine the implementation but I still want to play a bit around with some details. However if you could take a look and provide your feedback that would be great so to ensure that we are aligned on the direction. |
@ChrsMark I have taken a look at the #35617, the code looks good to me, but I have a concern regarding the extensibility of the The PR seems to be tied to specific configuration annotations, meaning that I can only "discover" the receivers that use those specific fields. The PR's supported annotations (receiver configuration) are
Do we plan to only support specific configuration annotations? Would it be feasible to just support the required ones for the k8sobserver logic's and use the other's as raw configuration? Something like: // well-known annotations
io.opentelemetry.collector.receiver-creator.metrics/receiver: nginx
io.opentelemetry.collector.receiver-creator.metrics/endpoint: "http://`endpoint`/nginx_status"
// any other annotation will be used as raw configuration field
io.opentelemetry.collector.receiver-creator.metrics/collection_interval: '20s'
io.opentelemetry.collector.receiver-creator.metrics/initial_delay: '20s'
io.opentelemetry.collector.receiver-creator.metrics/read_buffer_size: '10'
io.opentelemetry.collector.receiver-creator.metrics/xyz: 'abc' will be mapped to: nginx:
endpoint: "http://`endpoint`/nginx_status"
collection_interval: 20s
initial_delay: 20s
read_buffer_size: 10
xyz: abc |
My proposal would be to start supporting specific settings as annotations so as to have full control of what is allowed as a hint. This would serve as making it straight-forward for the users what specific annotations are supported as well as ensuring no security sensitive settings are exposed over this API. Every user with access to deploy Pods can potentially configure the Collector and take advantage of its privileges by just deploying a hinted Pod. In this, I would only allow Would love to hear @dmitryax's thoughts on this as well. |
I have some concerns about the suggested naming convention. Let's brainstorm it more.
What about something like this (I'm open to further discussion):
|
Thank's @dmitryax! Regarding the 1st point, in case we want to use the same set of hints with docker it is suggested to use reverse dns naming convention: https://docs.docker.com/engine/manage-resources/labels/#key-format-recommendations I'm fine with 2 and 3 points. However K8s does not allow multiple I think your suggestion would cover the needs of this feature and is quite future compliant with the potential proper 2 questions:
// once we have proper discovery, this would control particular pods turning it off/on
opentelemetry.io.discovery/enabled: "true"
// this one should be required as of right now, until we have the discovery
opentelemetry.io.discovery/scraper: nginx
// extra annotations to control scraper behavior
opentelemetry.io.discovery/signals: "metrics,logs"
opentelemetry.io.discovery/config.endpoint: "http://`endpoint`/nginx_status"
opentelemetry.io.discovery/config.collection_interval: '20s'
opentelemetry.io.discovery/config.initial_delay: '20s'
opentelemetry.io.discovery/config.read_buffer_size: '10'
opentelemetry.io.discovery/config.xyz: 'abc'
// nested configs
opentelemetry.io/discovery/config.nested: |
zyz: foo
bar: baz
// enable container logs collection
opentelemetry.io.discovery.container_logs/enabled: "true"
opentelemetry.io.discovery.container_logs/config.max_log_size: "2MiB"
opentelemetry.io.discovery.container_logs/config.foobar: "xyz"
opentelemetry.io.discovery.container_logs/config.operators: |
- type: regex_parser
regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$' We should also take into account the multiple containers case which would be covered by including the name of the port in the annotation for scraper: opentelemetry.io.discovery.<container_port>/scraper: nginx
opentelemetry.io.discovery.<container_port>/config.endpoint: "http://`endpoint`/nginx_status" and the name of the container for opentelemetry.io.discovery.container_logs.<container_name>/enabled: "true"
opentelemetry.io.discovery.container_logs.<container_name>/config.max_log_size: "2MiB" |
@dmitryax @rogercoll I have update the PR based on the above discussions. The approach looks legit, I'll keep testing more scenarios. Let me know if there is anything that does not make sense or missing in the above proposal and/or feel free to check the PR directly. |
Let's try not to restrict users from achieving particular configuration use cases but don't provide different ways to achieve the same results. Otherwise, it'll be much harder to maintain and introduce any changes to the interface if we ever have to do that.
We can stick with the reverse domain notation if we have the one What about starting with supporting the following:
Is there any limit on size of the annotation values in k8s so we can provide the whole config easily? |
Thank's @dmitryax!
I thought that providing support for nested configurations was a nice to have extension in addition to the base option which is to provide one config option per line/annotation. From my perspective it's more user-friendly to provide "one-liners" like However since it was also found confusing at #35617 (comment) I think we can proceed with one single approach and only support providing the whole config for now.
The validation code only restricts the total size of annotations to 256kB which equals to 262144 characters. I guess we should be fine with that fair limit.
I'm fine with the However, the As first step we can only support scraper annotations like the following:
Then we can consider the additional opinionated hint/suggestion to collect container logs through the filelog receiver in follow up PR/issue. PR tuned accordingly @dmitryax, let me know what you think. |
It's unlikely that logs (once we have it) and metrics will be collected by the same receiver/scraper. So we won't able to reuse |
So @dmitryax from your previous proposal the assumption is that (with only So in that case I believe it makes sense, yes, and we can conclude with the following: io.opentelemetry.discovery.metrics/enabled: "true"
io.opentelemetry.discovery.logs/enabled: "true"
io.opentelemetry.discovery.metrics.8888/enabled: "false" // takes priority
io.opentelemetry.discovery.log.my-container/enabled: "false" // takes priority
io.opentelemetry.discovery.logs/config: | # additional filelog receiver configuration here
max_log_size: "2MiB"
operators:
- type: regex_parser
regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
io.opentelemetry.discovery.metrics/scraper: "nginx"
io.opentelemetry.discovery.metrics/config: |
endpoint: "http://`endpoint`/nginx_status"
collection_interval: '20s'
initial_delay: '20s'
read_buffer_size: '10'
xyz: 'abc'
io.opentelemetry.discovery.metrics.8888/scraper: "prometheus" // takes priority |
@ChrsMark, looks good to me! 👍 |
Thank's @dmitryax @rogercoll for the feedback here! I have updated the PR and the logic seems to work quite smooth so far. I will keep testing (and possibly add more tests) but I would appreciate any reviews there already, mainly to validate the |
Since the metrics part has been merged I have opened a PR that adds support for the logs part based on #34427 (comment). |
Component(s)
receiver/receivercreator
Prior related proposal: #17418
Is your feature request related to a problem? Please describe.
This issue proposes to add support for a hints/suggestions based autodiscovery mechanism for the
receivercreator
. This was initially discussed at https://cloud-native.slack.com/archives/C01N5UCHTEH/p1718893771362289.Describe the solution you'd like
The goal is to provide support to the users for automatically enabling receivers by annotating their Pods properly:
Inspiration comes from Filebeat's/Metricbeat's and Fluentbit's respective features.
Describe alternatives you've considered
While such a dynamic workload monitoring approach is already supported by the
receivercreator
and thek8s_observer
, it still requires the users to have already predefined thereceivercreator
's sub-configs. This means that if we want to latertarget and discover a different workload we need to extend the Collector's config and restart the Collector's instances.
Providing the option to achieve the same result seamlessly would ease the experience for both the application developers who deploy workloads on K8s but also for the admins of the K8s cluster and/or the OTel based observability stack.
Additional context
Feature feasibility
I have already tried that out and it seems that the
receivercreator
can indeed be extended to support such a feature.There might be a need to distinguish the implementation for logs and metrics respectively, to best cover the default behaviors (i.e. we want to parse all logs from containers by default in a generic way unless we are hinted to use a technology specific "input". One thing to cover here would be to ensure that the feature would play well with a future templates provider)
A very dirty PoC can be found at 4591ecf.
This is just to illustrate the idea for now. Here is how it works:
values.yaml
for installing the Collector's Helm Chart:Target Redis Pod:
Collector's output:
Specification
Prerequisite: #35544
In the context of this feature we can focus on 2 primary goals:
For Pods with mutliple containers we need to ensure that we collect logs from each
one of them and that we can scope the metrics collection accordingly per container/port.
Collect Metrics dynamically from workloads
type: port
.Supported hints:
io.opentelemetry.collector.receiver-creator.metrics/receiver: redis
io.opentelemetry.collector.receiver-creator.metrics/endpoint: "`endpoint`:6379"
io.opentelemetry.collector.receiver-creator.metrics/username: "admin"
io.opentelemetry.collector.receiver-creator.metrics/password: "passpass"
io.opentelemetry.collector.receiver-creator.metrics/collection_interval: 10s
io.opentelemetry.collector.receiver-creator.metrics/timeout: 1m
With some more advanced options:
io.opentelemetry.collector.receiver-creator.metrics/raw: ''
To support users that want to provide more advanced configurations.
Extra
Additionally we can support defining hints per Pod's container/port i.e.:
io.opentelemetry.collector.receiver-creator.metrics.<port_name>/receiver: redis
. To support use cases where a Pod consists of multiple containers and we want to specify a specific portCollect Logs dynamically from workloads
type: pod.container
(see #35544).Supported hints:
io.opentelemetry.collector.receiver-creator.logs/enabled: true
-> This will enable a default filelog receiver for the specific pathio.opentelemetry.collector.receiver-creator.logs/template: redis
-> If/When we will have support for technology specific templates(Template provider opentelemetry-collector#8372)io.opentelemetry.collector.receiver-creator.logs/raw: ''
We could expose more settings of filelogreceiver through annotations. For example
io.opentelemetry.collector.receiver-creator.logs/max_log_size
,io.opentelemetry.collector.receiver-creator.logs/operators
Extra
Additionally we can support defining hints per Pod's container/port i.e.:
io.opentelemetry.collector.receiver-creator.logs.<container_name>/operators: redis
Next steps
I would like to get feedback from the community/SIG on this. Some pieces might be missing from this proposal, so any kind of feedback would be helpful here :).
Eventually, in case we find this a good fit for the Collector I'd be happy to contribute on its implementation (and future support of it).
We could potentially break this up to 2 different scopes, one for the metrics use-case and one for the logs use-case since the requirements are a bit different in each one of them.
The text was updated successfully, but these errors were encountered: