Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] OTel collector modules #11631

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jsoriano
Copy link
Contributor

@jsoriano jsoriano commented Nov 8, 2024

This change adds a new RFC that attempts to summarize the previous discussions around templates and the options that at the moment look more active.

The main output I would expect from the acceptance of this RFC is to reach an agreement on the overall implementation approach. There are still some open questions mentioned that we might let open by now.

So far there have been several attempts to introduce some kind of configuration templating in the OTel collector, and there seems to be some controversy about the steps forward. With this RFC we are proposing to limit the scope to the use case of sharing reusable configurations to collect signals from well-known services and applications. And with this scope in mind, consider use cases where this feature could be useful, as with autodiscovery. We also propose to call this feature "modules".

I skip in-depth technical details, but reference POCs where some of the discussed approaches are implemented.

I would like to thank @djaglowski for all his efforts driving previous discussions and experimentation. And @evan-bradley, @mx-psi, @rogercoll and others for their comments. It would be great if you could take a look to this RFC.

Related issues and PRs

cc @rogercoll @ChrsMark @andrzej-stencel @mlunadia

@jsoriano jsoriano requested a review from a team as a code owner November 8, 2024 18:07
@jsoriano jsoriano requested a review from codeboten November 8, 2024 18:07
Copy link

codecov bot commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.58%. Comparing base (d39dd7a) to head (0db8c96).
Report is 91 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #11631   +/-   ##
=======================================
  Coverage   91.58%   91.58%           
=======================================
  Files         440      440           
  Lines       23763    23763           
=======================================
  Hits        21764    21764           
  Misses       1627     1627           
  Partials      372      372           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this at KubeCon at the OpenTelemetry Observatory on 2024-11-15 (thanks @rogercoll!).

I am supportive of fixing the underlying issue of being able to simplify configuration and share it easily, and I agree that the current configuration resolution mechanism, while complex, falls short of users' expectations. IMO we need to figure out a few questions before moving forward with this proposal (some of which are partially answered by the current text but I would prefer to answer them more explicitly):

  1. Why not using Helm/Kustomize? Other projects such as Kubernetes don't have an explicit builtin templating mechanism but rather have tooling on top of the underlying configuration. We need to justify why we need to add this.
  2. How could we extend this in the future for other kinds of components? Extensions, service::telemetry, and exporters may be places where we also want to have shareable configuration snippets. How are we going to fulfill those use cases if people ask for them in the future? An initial implementation needs not add support for those, but the design needs to take them into account.
  3. Where does the current configuration resolution mechanism fall short and why? We have some functionality to reuse snippets of code via the concept of confmap.Providers. This seems not be enough, but we should justify why (I think the two main pieces missing are parameters and merging slices) and why it is not possible to extend this mechanism to fulfill new use cases (I am not convinced it is not possible). If we don't go with confmap.Providers, how will this proposal interact with them? What about confmap.Converters?
  4. Can we do a PoC that happens entirely within a receivertemplate component? This could live outside of core and could help us validate the ideas spoused here.

I am going to request changes until we answer those questions in text. I know it is a high bar, but our configuration resolution system is quite complex as it is, so we need to figure this out in a way that fits this existing system and justifies the additional maintenance burden. I am happy to help in the discussion of any of these questions :)

@mx-psi mx-psi added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Nov 19, 2024
@jsoriano
Copy link
Contributor Author

jsoriano commented Nov 21, 2024

@mx-psi thanks for your comment, and thanks too to @rogercoll for bringing the discussion to KubeCon.

  1. Why not using Helm/Kustomize? Other projects such as Kubernetes don't have an explicit builtin templating mechanism but rather have tooling on top of the underlying configuration. We need to justify why we need to add this.

Helm is limited to Kubernetes and we would like this feature to work on any environment where the collector can be deployed.
Kustomize can be less limited to kubernetes but doesn't seem so straightforward to use for users, nor provides a so clear abstraction for sharing OTel configurations.

For the Kubernetes use case, I see Helm and Kustomize as tools that could be combined with modules/templates more than an alternative. For example perhaps Helm could be used to install modules/templates or Kustomize to manage configurations that include modules/templates. In other environments modules could be also used in the context of other config management tools.

I will try to explore these possible overlaps and combinations during next week and provide more context in the RFC.

Other projects such as Kubernetes don't have an explicit builtin templating mechanism but rather have tooling on top of the underlying configuration.

Agree, this is why we'd like to frame this more as an abstraction layer for reusing complex configurations than a solution for general templating. This can be seen as a way to implement receivers or processors without writing Go code.

Imagine for example an Couchbase expert with a finely tuned configuration who is able to share it by doing some small tweaks on it for parametrization, and a Couchbase user being able to easily reuse this configuration in a way that is consistent with basic collector configurations.

How could we extend this in the future for other kinds of components? Extensions, service::telemetry, and exporters may be places where we also want to have shareable configuration snippets. How are we going to fulfill those use cases if people ask for them in the future? An initial implementation needs not add support for those, but the design needs to take them into account.

We haven't found so clear use cases for other components, but this could be definitely added in the future. For example for extensions or exporters I imagine that we could also have something like extensiontemplate or exportertemplate components that follow similar logic to the ones we are proposing by now. I will add a mention to that in the "Future possibilities" section.

Other use cases where general templating is wanted will be probably better covered with config management tools. I would let this out of the scope of this feature.

3. Where does the current configuration resolution mechanism fall short and why? We have some functionality to reuse snippets of code via the concept of confmap.Providers. This seems not be enough, but we should justify why (I think the two main pieces missing are parameters and merging slices) and why it is not possible to extend this mechanism to fulfill new use cases (I am not convinced it is not possible). If we don't go with confmap.Providers, how will this proposal interact with them? What about confmap.Converters?

I will be more explicit about this in the RFC, but as a summary:

  • I agree it would be possible to provide similar features with config resolution mechanisms, but would require users to have a good understanding of it. They need to know what files to provide, with what format and in what order. We would be missing the goals of providing a clear abstraction for sharing complex configurations and of lowering the entry barrier for common use cases.
  • Config providers would couple templating with the source of the templates, it may be complex or not possible to implement different sources for templates. It would also require non-basic understanding of how configuration works. And it might not work well with the receiver creator for autodiscovery use cases.
  • Converters would also work, actually @djaglowski has a quite nice POC, but there are some architectural problems and some limitations as discussed in comments in the POC and mentioned in the RFC.

If we don't go with confmap.Providers, how will this proposal interact with them?

As the proposed components will work as usual components, they can be included in configurations provided from any provider. I would limit very much the inclusion of references to providers in templates because this can open the gates to security problems.

This could be a point in favor of using a config resolver as the templating mechanism, or part of it, we have more control on the receivers that are allowed in templates. (We use a config resolver for variable substitution in our POC).

4. Can we do a PoC that happens entirely within a receivertemplate component? This could live outside of core and could help us validate the ideas spoused here.

There is a POC in elastic/opentelemetry-collector-components#96, we can polish it a bit more and merge it at least in our repo so it is available for further testing and validation.

A more real-world example of a template for this POC can be found in https://github.com/elastic/integrations/pull/11253/files#diff-26a3b4eab2c9b845b76591d786a09cd833ac1527288dd2e0c4e2c457fc853521
We will add more examples to the POC and to the RFC.

I am going to request changes until we answer those questions in text. I know it is a high bar, but our configuration resolution system is quite complex as it is, so we need to figure this out in a way that fits this existing system and justifies the additional maintenance burden. I am happy to help in the discussion of any of these questions :)

👍 thanks for the help with the discussion!

Copy link
Contributor

github-actions bot commented Dec 6, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants