From aa754cda2ece153041e5b26557c58953431d8408 Mon Sep 17 00:00:00 2001 From: Jaime Soriano Pastor Date: Wed, 6 Nov 2024 11:31:18 +0100 Subject: [PATCH 1/4] Add RFC for reusable configuration modules --- docs/rfcs/configuration-modules.md | 228 +++++++++++++++++++++++++++++ 1 file changed, 228 insertions(+) create mode 100644 docs/rfcs/configuration-modules.md diff --git a/docs/rfcs/configuration-modules.md b/docs/rfcs/configuration-modules.md new file mode 100644 index 00000000000..c64fae365b8 --- /dev/null +++ b/docs/rfcs/configuration-modules.md @@ -0,0 +1,228 @@ +# Shareable modular configuration + +Add support for shareable modular configuration to collect signals from specific +services or applications. + +## Motivation + +Distributing high level configurations focused on monitoring specific services +or applications is a feature commonly found in observability solutions. These +features help day to day users to reduce learning curve and maintenance burden, +and allow knowledgeable users to share opinionated configuration that can be +quickly adopted. + +Three options are discussed on this RFC, they all come from previous existing +discussions. None of these options gained enough traction to reach a final +state. The expected outcome of this RFC is to decide on the approach to follow +for the implementation. + +## Terminology + +A note on terminology. We will use the term "modules" for this reusable +configuration on this RFC. This is a term used in the industry for this kind of +feature. Other name commonly used is "integrations", but this term is already +used in the OTel ecosystem. Previously considered alternatives use the term +"templates". We are not using this term here to avoid coupling this feature with +general templating. + +## Explanation + +From the user perspective, this feature should allow to use high-level modules +by their name and parameterize them with a set of variables. 
These modules will +contain receivers and processors configured for signal collection from an +specific service. Each module can contain any number of receivers, and any +number of pipelines including at least one processor. Pipelines will only define +receivers and processors, they won't define other components such as extensions +or exporters. + +It should be possible to configure the source of the modules. For example they +could be included in the configuration itself, in external files, in some hosted +service or in config maps. + +This feature should play well with autodiscovery features such as the receiver +creator. + +See [open-telemetry/opentelemetry-collector-contrib#36116](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/36116) +for more details from the user point of view. + +### Summary of user interface + +Configurations will be defined in modules as templates, these templates will +look pretty similar to usual OTel collector configurations, but they will only +allow the definition of receivers and processors, and pipelines containing them. +Specific templating language is TBD. + +Something like this: +``` +receivers: + prometheus/someservice: + config: + scrape_configs: + - job_name: 'someservice' + static_configs: + - targets: [${var:endpoint}] + basic_auth: + username: ${var:username} + password: ${var:password} + metric_relabel_configs: + ... +processors: + filter/something: + metrics: + exclude: + match_type: strict + metric_names: ... +pipelines: + metrics/somepipeline: + receiver: prometheus/someservice + processors: [filter/something] +``` + +In a configuration file, a module could be used with something like this: +``` +receivers: + module/somemodule: + name: somemodule + parameters: + endpoint: https://localhost:1234 + username: someuser + password: somepassword +... +service: + pipelines: + metrics: + receivers: [module/somemodule] + exporters: [...] 
+``` + +Processor pipelines could also be used independently, with something like this: +``` +processors: + module/somemodule: + name: somemodule + pipeline: metrics/somepipeline +... +service: + pipelines: + metrics: + receivers: [...] + processors: [module/somemodule] + exporters: [...] +``` + +A module can contain multiple pipelines, of multiple signal types. The +implementation should be aware of this and select the pipelines to create +depending on the type of pipeline. + +## Technical options + +### Option 1: Module components + +New receiver and processor components are implemented. They can instantiate the +pipelines defined in the modules internally, by calling the subcomponent factories +and chaining them with the provided consumers. + +This approach was originally proposed in [open-telemetry/opentelemetry-collector-contrib#26312](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/26312). + +#### Internal details + +The module components create their subcomponents on `Start()`, by getting the +factories from the `component.Host`. They keep record of all subcomponents so +they can be stopped on `Shutdown()`. + +Modules can be used this way in any place where the components can be used, what +in principle provides a more natural user experience. With this they also have +synergies with any feature that accepts them. For example the receiver +creator could use the module receiver directly, supporting autodiscovery use +cases. + +Module sources are provided via extensions, that components can use to discover +modules by their name. + +As all pieces are implemented as independent components, each of them can be +optionally used in distributions. 
+ +POC for this approach is available in [elastic/opentelemetry-collector-components#96](https://github.com/elastic/opentelemetry-collector-components/pull/96) + +#### Trade-offs and mitigations + +Most trade-offs of this approach also exist on the receiver creator, that +uses a similar approach to create receivers. Mitigating them for one would +mitigate them for both. + +* Are factory getters always going to be available in the `component.Host`? They + are not in the current interface. +* Subcomponents are built on `Start()`, while components are usually created when + unmarshalling the configuration. +* Subcomponents are not available on the internal graph, so it is going to be + difficult to access the effective configuration. +* Module receiver also instantiates processors, this cannot be represented with + usual configuration without using connectors. + +Mitigating these trade-offs can be complex. They would imply making the +factories available to the factories themselves, and/or providing some internal +API for instantiating subcomponents while updating the internal graph. + +### Option 2: Module converter + +Modules can be used in the configuration as any other component. A new converter +is introduced to expand these modules while loading configuration. + +#### Internal details + +#### Trade-offs and mitigations + +### Option 3: Config processor / Recursive unmarshalling + +This is a follow-up of option 2. Modules can be used in the configuration as any +other component. A new extension point is added in the collector that allows +higher-level modification of the configuration as part of the unmarshalling +process. A new config processor is added to expand modules. 
+ +#### Internal details + +#### Trade-offs and mitigations + +### Other options not considered on this RFC: + +Template provider https://github.com/open-telemetry/opentelemetry-collector/issues/8372 + +Combining configuration files https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/26312#issuecomment-1702391029 + +####### + +## Internal details + +From a technical perspective, how do you propose accomplishing the proposal? In particular, please explain: + +* How the change would impact and interact with existing functionality +* Likely error modes (and how to handle them) +* Corner cases (and how to handle them) + +While you do not need to prescribe a particular implementation - indeed, OTEPs should be about **behaviour**, not implementation! - it may be useful to provide at least one suggestion as to how the proposal *could* be implemented. This helps reassure reviewers that implementation is at least possible, and often helps them inspire them to think more deeply about trade-offs, alternatives, etc. + +## Trade-offs and mitigations + +What are some (known!) drawbacks? What are some ways that they might be mitigated? + +Note that mitigations do not need to be complete *solutions*, and that they do not need to be accomplished directly through your proposal. A suggested mitigation may even warrant its own OTEP! + +####### + +## Prior art and alternatives + +What are some prior and/or alternative approaches? For instance, is there a corresponding feature in OpenTracing or OpenCensus? What are some ideas that you have rejected? + +## Open questions + +What are some questions that you know aren't resolved yet by the OTEP? These may be questions that could be answered through further discussion, implementation experiments, or anything else that the future may bring. + +### What option should be implemented? + +### What templating language to use? + +### Definition of versioning, dependencies and other metadata? 
+ +## Future possibilities + +What are some future changes that this proposal would enable? From ebf55891d222914be70251d52dadd04d0e0a36db Mon Sep 17 00:00:00 2001 From: Jaime Soriano Pastor Date: Wed, 6 Nov 2024 17:50:00 +0100 Subject: [PATCH 2/4] Additional options --- docs/rfcs/configuration-modules.md | 181 ++++++++++++++++++++++------- 1 file changed, 140 insertions(+), 41 deletions(-) diff --git a/docs/rfcs/configuration-modules.md b/docs/rfcs/configuration-modules.md index c64fae365b8..e6d595120b8 100644 --- a/docs/rfcs/configuration-modules.md +++ b/docs/rfcs/configuration-modules.md @@ -16,12 +16,18 @@ discussions. None of these options gained enough traction to reach a final state. The expected outcome of this RFC is to decide on the approach to follow for the implementation. +## Non-goals + +It is not a goal of this functionality to offer general templating for the OTel +collector configuration. Scenarios where general templating is needed are +probably better covered by configuration management tools. + ## Terminology A note on terminology. We will use the term "modules" for this reusable configuration on this RFC. This is a term used in the industry for this kind of feature. Other name commonly used is "integrations", but this term is already -used in the OTel ecosystem. Previously considered alternatives use the term +used in the OTel ecosystem. Previous discussions use the term "templates". We are not using this term here to avoid coupling this feature with general templating. @@ -40,7 +46,7 @@ could be included in the configuration itself, in external files, in some hosted service or in config maps. This feature should play well with autodiscovery features such as the receiver -creator. +creator, so it is possible to apply modules for autodiscovered loads. See [open-telemetry/opentelemetry-collector-contrib#36116](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/36116) for more details from the user point of view. 
@@ -110,12 +116,15 @@ service: exporters: [...] ``` -A module can contain multiple pipelines, of multiple signal types. The +A module can contain multiple pipelines of multiple signal types. The implementation should be aware of this and select the pipelines to create -depending on the type of pipeline. +depending on the type of the pipeline. ## Technical options +Some options are described here, and a general summary of pros and cons can be +found below, to help making decisions. + ### Option 1: Module components New receiver and processor components are implemented. They can instantiate the @@ -127,8 +136,8 @@ This approach was originally proposed in [open-telemetry/opentelemetry-collector #### Internal details The module components create their subcomponents on `Start()`, by getting the -factories from the `component.Host`. They keep record of all subcomponents so -they can be stopped on `Shutdown()`. +factories from the `component.Host`. They keep record of all their subcomponents +so they can be stopped on `Shutdown()`. Modules can be used this way in any place where the components can be used, what in principle provides a more natural user experience. With this they also have @@ -146,10 +155,6 @@ POC for this approach is available in [elastic/opentelemetry-collector-component #### Trade-offs and mitigations -Most trade-offs of this approach also exist on the receiver creator, that -uses a similar approach to create receivers. Mitigating them for one would -mitigate them for both. - * Are factory getters always going to be available in the `component.Host`? They are not in the current interface. * Subcomponents are built on `Start()`, while components are usually created when @@ -163,66 +168,160 @@ Mitigating these trade-offs can be complex. They would imply making the factories available to the factories themselves, and/or providing some internal API for instantiating subcomponents while updating the internal graph. 
+Most trade-offs of this approach also exist on the receiver creator, that +uses a similar approach to create receivers. Mitigating them for one would +mitigate them for both, and could also help in other features such as +configuration reload. + ### Option 2: Module converter -Modules can be used in the configuration as any other component. A new converter -is introduced to expand these modules while loading configuration. +Modules can be used in the configuration as any other component, but they don't +correspond to any actual component. A new converter is introduced to expand these +modules while loading configuration. #### Internal details -#### Trade-offs and mitigations +A new converter is introduced for templates expansion. It is executed as any other +converter when [resolving configuration](https://github.com/open-telemetry/opentelemetry-collector/blob/main/confmap/README.md#configuration-resolving). After it is resolved, it is unmarshalled as any other configuration. -### Option 3: Config processor / Recursive unmarshalling +The expansion process removes the modules and replaces them by receivers and/or +processors. New pipelines are added using the expanded receivers and processors, +using a forward connector as exporter. This connector is then used as receiver +in the pipelines defined in the configuration by the user. -This is a follow-up of option 2. Modules can be used in the configuration as any -other component. A new extension point is added in the collector that allows -higher-level modification of the configuration as part of the unmarshalling -process. A new config processor is added to expand modules. +Configuration of this feature can be done as a new top-level entry. This entry +needs to be unmarshaled and removed from the config by the converter. -#### Internal details +There is an implementation of this approach in [open-telemetry/opentelemetry-collector#8507](https://github.com/open-telemetry/opentelemetry-collector/pull/8507). 
#### Trade-offs and mitigations -### Other options not considered on this RFC: +* The forward connector is an additional dependency that distributions must include + for modules to work. This can be mitigated by documentation and runtime checks. +* To modify components and pipelines in the configuration, this converter needs + to be aware of the structure of the configuration. This is out of the scope + for a converter, that are more intended for small replacements not dependant + on the configuration format. This is mitigated by option 3, that introduces a + higher level approach for configuration processing. +* The converter needs to take care of unmarshalling its own configuration. This + would be also mitigated by option 3. +* Using modules with the autodiscovery features provided by the receiver creator + needs explicit support in the converter or in the receiver creator. -Template provider https://github.com/open-telemetry/opentelemetry-collector/issues/8372 +### Option 3: Config processor / Recursive unmarshalling -Combining configuration files https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/26312#issuecomment-1702391029 +This is a variation of option 2. Modules can be used in the configuration as any +other component. A new extension point is added in the collector that allows +higher-level modification of the configuration as part of the unmarshalling +process. A new config processor is added to expand modules. -####### +#### Internal details -## Internal details +A new extension point is added to the OTel collector, for config processors. +These processors are executed on partially parsed configuration, taking the +opportunity to modify any part of the configuration. After all the config +processors have been executed, a valid `otelcol.Config` must result. -From a technical perspective, how do you propose accomplishing the proposal? 
In particular, please explain: +This approach is described in [open-telemetry/opentelemetry-collector#8940](https://github.com/open-telemetry/opentelemetry-collector/issues/8940), and could be leveraged also in other requested features, such as the +one for [component groups](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/18509). -* How the change would impact and interact with existing functionality -* Likely error modes (and how to handle them) -* Corner cases (and how to handle them) +Once this extension point is added, a new config provider would be implemented +that would take care of expanding module subcomponents. Expansion would work in +a similar fashion to option 2, creating new pipelines and plugging them to the +pipelines in the configuration file using the forward connector. In contrast to +option 2, configuration unmarshalling would be provided by the collector. -While you do not need to prescribe a particular implementation - indeed, OTEPs should be about **behaviour**, not implementation! - it may be useful to provide at least one suggestion as to how the proposal *could* be implemented. This helps reassure reviewers that implementation is at least possible, and often helps them inspire them to think more deeply about trade-offs, alternatives, etc. +#### Trade-offs and mitigations -## Trade-offs and mitigations +* Potentially risky and complex implementation, as it is a significant change in + the unmarshalling process. The extension point will require its own + design and implementation process and in the meantime it can block progress on modules. + This could be mitigated by temporarily using the converter approach (option 2), and replace + it when the extension point is available. +* The forward connector is an additional dependency that distributions must include + for modules to work. This can be mitigated by documentation and runtime checks. 
+* Using modules with the autodiscovery features provided by the receiver creator + needs explicit support in the converter or in the receiver creator. + +### Summary of options + +Some of the decision points around which trade-offs orbit are the observability +options for the effective configuration, its integration with the receiver +creator for autodiscovery use cases or dynamic configuration in general, and +its user experience. + +Option 1, implementing modules as components, has the most straightforward +implementation and the most natural user experience, it also combines better +with existing features for receivers such as the receiver creator. On the other +hand, it has trade-offs that would be difficult to mitigate, specially about the +architecture used to instantiate subcomponents, and about observability of the +effective configuration. These trade-offs are shared though with the receiver +creator, and solving them could help in other areas as configuration reload. + +Option 2, the converter, could mimic a good user experience based also on +components, what would feel natural for final users. The internal graph would +have a representation of the effective configuration, what would help on +observability. But its architecture exceeds a bit the scope of a converter, it +requires an additional connector, and explicit implementations when combined +with other features such as the receiver creator. + +Option 3, the config processor, could also mimic a good user experience based on +components. The internal graph would also have a representation of the effective +configuration. Additionally, adding support for config processors would +introduce an interesting extension point that could help to solve other feature +requests. It has though a more complex implementation, more coupled to changes +in the core, and it also requires an additional connector and explicit +implementations to be combined with other features such as the receiver creator. 
-What are some (known!) drawbacks? What are some ways that they might be mitigated? +## Prior art and alternatives -Note that mitigations do not need to be complete *solutions*, and that they do not need to be accomplished directly through your proposal. A suggested mitigation may even warrant its own OTEP! +An alternative, proposed in [open-telemetry/opentelemetry-collector#8372](https://github.com/open-telemetry/opentelemetry-collector/issues/8372), could be to add a confmap provider that expands configuration files as templates. +This has the problem of coupling template functionality with configuration +sourcing, and it also has a less intuitive user experience than the other +options. Given these limitations, it is not discussed here. -####### +Also, without adding anything, the current config resolver is already able to +combine multiple configuration files. This approach is discussed in +[some comments](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/26312#issuecomment-1702391029) +in one of the previous issues discussing templates. Even when functional, this +approach depends on the order of configuration flags, and doesn't provide a +defined abstraction for sharing configuration, so we consider it would be better +to define this abstraction. -## Prior art and alternatives +## Open questions -What are some prior and/or alternative approaches? For instance, is there a corresponding feature in OpenTracing or OpenCensus? What are some ideas that you have rejected? +### What templating language to use? -## Open questions +On the described options we are not detailing the templating language to use. +They could work with different languages. We have to make a conscious decision on +what templating language to use. Some options that have appeared in the +different discussions and POCs are: +* Go templates are used in different POCs, and are a natural option being a Go + project. 
It introduces though a new configuration language in the ecosystem, + and the template is not valid YAML itself. +* Use a confmap resolver with "sandboxed" providers. It has the advantage of + avoiding the inclusion of other templating languages, but it is not an actual + templating language. It can be used to replace variables, but it doesn't have + conditional logic or loops. We would need to confirm if this language is + enough. +* Receiver creator uses expvar for variable expansion. This could be another + option, already used in the ecosystem, but also limited for conditional logic and + loops. +* Supporting multiple templating languages. -What are some questions that you know aren't resolved yet by the OTEP? These may be questions that could be answered through further discussion, implementation experiments, or anything else that the future may bring. +### Definition of versioning, dependencies and other metadata? -### What option should be implemented? +Ideally, it should be possible to include some metadata in the templates. At +least we have to decide if templates should be versioned and how, and if their +component dependencies should be declared. -### What templating language to use? +### Development environment? -### Definition of versioning, dependencies and other metadata? +How could the development process of modules be? Where does their code +reside? How are they tested? ## Future possibilities -What are some future changes that this proposal would enable? +Having support for modules would help to have in the future a marketplace-like +site where users can easily obtain well-tested configurations prepared by +expert users. 
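
A rough sketch of what the expansion described for option 2 could produce, assuming the `module/somemodule` receiver example from the RFC and the forward connector from opentelemetry-collector-contrib. The expanded component IDs (`forward/somemodule`, `metrics/somemodule`) and the `debug` exporter are illustrative assumptions, not names defined by the RFC or by any existing implementation:

```yaml
# Hypothetical effective configuration after the converter has expanded
# `module/somemodule`. Parameters are substituted literally from the
# `parameters` block of the user configuration.
receivers:
  prometheus/someservice:
    config:
      scrape_configs:
        - job_name: 'someservice'
          static_configs:
            - targets: ["https://localhost:1234"]  # from ${var:endpoint}
          basic_auth:
            username: someuser       # from ${var:username}
            password: somepassword   # from ${var:password}

processors:
  filter/something:
    metrics:
      exclude:
        match_type: strict
        metric_names: []  # elided in the RFC example

connectors:
  # Assumed ID: one forward connector per expanded module pipeline.
  forward/somemodule: {}

exporters:
  debug: {}  # placeholder for the exporters chosen by the user

service:
  pipelines:
    # Pipeline generated from the module definition (assumed name).
    metrics/somemodule:
      receivers: [prometheus/someservice]
      processors: [filter/something]
      exporters: [forward/somemodule]
    # Pipeline written by the user: the module reference is replaced
    # by the forward connector acting as a receiver.
    metrics:
      receivers: [forward/somemodule]
      exporters: [debug]
```

Under these assumptions the internal graph contains every expanded component, which is the observability advantage attributed to options 2 and 3, at the cost of requiring the forward connector in the distribution.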
From 60bab0119c290a3b9c08ed17781e8b31432014d9 Mon Sep 17 00:00:00 2001 From: Jaime Soriano Pastor Date: Thu, 7 Nov 2024 12:05:39 +0100 Subject: [PATCH 3/4] Add references to how configuration would work --- docs/rfcs/configuration-modules.md | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/docs/rfcs/configuration-modules.md b/docs/rfcs/configuration-modules.md index e6d595120b8..77d0f00b19d 100644 --- a/docs/rfcs/configuration-modules.md +++ b/docs/rfcs/configuration-modules.md @@ -146,7 +146,13 @@ creator could use the module receiver directly, supporting autodiscovery use cases. Module sources are provided via extensions, that components can use to discover -modules by their name. +modules by their name. For example the following extension could be used to +provide templates from a local directory: +``` +extensions: + file_modules: + path: "./modules" +``` As all pieces are implemented as independent components, each of them can be optionally used in distributions. @@ -191,6 +197,11 @@ in the pipelines defined in the configuration by the user. Configuration of this feature can be done as a new top-level entry. This entry needs to be unmarshaled and removed from the config by the converter. +``` +modules: + path: "./modules" +``` +Different source implementations would need to be part of the converter itself. There is an implementation of this approach in [open-telemetry/opentelemetry-collector#8507](https://github.com/open-telemetry/opentelemetry-collector/pull/8507). @@ -231,6 +242,8 @@ a similar fashion to option 2, creating new pipelines and plugging them to the pipelines in the configuration file using the forward connector. In contrast to option 2, configuration unmarshalling would be provided by the collector. +Configuration of module sources would be done also the same way as in option 2. 
+ #### Trade-offs and mitigations * Potentially risky and complex implementation, as it is a significant change in @@ -252,7 +265,8 @@ its user experience. Option 1, implementing modules as components, has the most straightforward implementation and the most natural user experience, it also combines better -with existing features for receivers such as the receiver creator. On the other +with existing features for receivers such as the receiver creator, and its +module sources can be implemented as independent components. On the other hand, it has trade-offs that would be difficult to mitigate, specially about the architecture used to instantiate subcomponents, and about observability of the effective configuration. These trade-offs are shared though with the receiver From 0db8c96aee641d8d6b676897a01a4126b7e1dcc2 Mon Sep 17 00:00:00 2001 From: Jaime Soriano Pastor Date: Fri, 8 Nov 2024 18:39:12 +0100 Subject: [PATCH 4/4] Apply feedback after internal review --- docs/rfcs/configuration-modules.md | 79 +++++++++++++++--------------- 1 file changed, 40 insertions(+), 39 deletions(-) diff --git a/docs/rfcs/configuration-modules.md b/docs/rfcs/configuration-modules.md index 77d0f00b19d..a5a90e22d98 100644 --- a/docs/rfcs/configuration-modules.md +++ b/docs/rfcs/configuration-modules.md @@ -20,16 +20,17 @@ for the implementation. It is not a goal of this functionality to offer general templating for the OTel collector configuration. Scenarios where general templating is needed are -probably better covered by configuration management tools. +probably better covered by configuration management tools. General templating +based on other tools could still be combined with configurations using modules. ## Terminology A note on terminology. We will use the term "modules" for this reusable configuration on this RFC. This is a term used in the industry for this kind of -feature. Other name commonly used is "integrations", but this term is already -used in the OTel ecosystem. 
Previous discussions use the term -"templates". We are not using this term here to avoid coupling this feature with -general templating. +feature. Other name commonly used is "integrations", but this term is [already +used](https://opentelemetry.io/ecosystem/integrations/) in the OTel ecosystem. +Previous discussions use the term "templates". We are not using this term here +to avoid assumptions related with general templating of configuration. ## Explanation @@ -38,15 +39,16 @@ by their name and parameterize them with a set of variables. These modules will contain receivers and processors configured for signal collection from an specific service. Each module can contain any number of receivers, and any number of pipelines including at least one processor. Pipelines will only define -receivers and processors, they won't define other components such as extensions +receivers and processors. Modules won't include other components such as extensions or exporters. It should be possible to configure the source of the modules. For example they could be included in the configuration itself, in external files, in some hosted -service or in config maps. +service or in K8s config maps. This feature should play well with autodiscovery features such as the receiver -creator, so it is possible to apply modules for autodiscovered loads. +creator, so it is possible to apply modules for autodiscovered loads. It can be +very handy when [defining configurations in annotations](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427). See [open-telemetry/opentelemetry-collector-contrib#36116](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/36116) for more details from the user point of view. @@ -125,7 +127,7 @@ depending on the type of the pipeline. Some options are described here, and a general summary of pros and cons can be found below, to help making decisions. 
-### Option 1: Module components +### Option 1 (Recommended): Module components New receiver and processor components are implemented. They can instantiate the pipelines defined in the modules internally, by calling the subcomponent factories @@ -145,7 +147,7 @@ synergies with any feature that accepts them. For example the receiver creator could use the module receiver directly, supporting autodiscovery use cases. -Module sources are provided via extensions, that components can use to discover +Module sources are provided via extensions that components can use to discover modules by their name. For example the following extension could be used to provide templates from a local directory: ``` @@ -162,7 +164,8 @@ POC for this approach is available in [elastic/opentelemetry-collector-component #### Trade-offs and mitigations * Are factory getters always going to be available in the `component.Host`? They - are not in the current interface. + are not in the current interface and could be a blocker if there are plans to + remove them. * Subcomponents are built on `Start()`, while components are usually created when unmarshalling the configuration. * Subcomponents are not available on the internal graph, so it is going to be @@ -208,7 +211,7 @@ There is an implementation of this approach in [open-telemetry/opentelemetry-col #### Trade-offs and mitigations * The forward connector is an additional dependency that distributions must include - for modules to work. This can be mitigated by documentation and runtime checks. + for modules to work. This can be mitigated by documentation and/or runtime checks. * To modify components and pipelines in the configuration, this converter needs to be aware of the structure of the configuration. This is out of the scope for a converter, that are more intended for small replacements not dependant @@ -218,6 +221,9 @@ There is an implementation of this approach in [open-telemetry/opentelemetry-col would be also mitigated by option 3. 
* Using modules with the autodiscovery features provided by the receiver creator needs explicit support in the converter or in the receiver creator. +* Configuring modules [from k8s annotations](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427#issuecomment-2413401158) may not be possible. + As this configuration would not be known when unmarshalling the collector + configuration. This can be difficult to mitigate with this option. ### Option 3: Config processor / Recursive unmarshalling @@ -233,16 +239,18 @@ These processors are executed on partially parsed configuration, taking the opportunity to modify any part of the configuration. After all the config processors have been executed, a valid `otelcol.Config` must result. -This approach is described in [open-telemetry/opentelemetry-collector#8940](https://github.com/open-telemetry/opentelemetry-collector/issues/8940), and could be leveraged also in other requested features, such as the +This approach is described in [open-telemetry/opentelemetry-collector#8940](https://github.com/open-telemetry/opentelemetry-collector/issues/8940), +and could be leveraged also in other requested features, such as the one for [component groups](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/18509). Once this extension point is added, a new config provider would be implemented that would take care of expanding module subcomponents. Expansion would work in a similar fashion to option 2, creating new pipelines and plugging them to the pipelines in the configuration file using the forward connector. In contrast to -option 2, configuration unmarshalling would be provided by the collector. +option 2, configuration unmarshalling would be provided by the collector config +provider. -Configuration of module sources would be done also the same way as in option 2. +Configuration of module sources would be done the same way as in option 2. 
#### Trade-offs and mitigations @@ -255,37 +263,30 @@ Configuration of module sources would be done also the same way as in option 2. for modules to work. This can be mitigated by documentation and runtime checks. * Using modules with the autodiscovery features provided by the receiver creator needs explicit support in the converter or in the receiver creator. +* Configuring modules [from k8s annotations](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427#issuecomment-2413401158) may not be possible. + As this configuration would not be known when unmarshalling the collector + configuration. This can be difficult to mitigate with this option. ### Summary of options -Some of the decision points around which trade-offs orbit are the observability +Some of the decision points which trade-offs orbit around are the observability options for the effective configuration, its integration with the receiver creator for autodiscovery use cases or dynamic configuration in general, and its user experience. -Option 1, implementing modules as components, has the most straightforward -implementation and the most natural user experience, it also combines better -with existing features for receivers such as the receiver creator, and its -module sources can be implemented as independent components. On the other -hand, it has trade-offs that would be difficult to mitigate, specially about the -architecture used to instantiate subcomponents, and about observability of the -effective configuration. These trade-offs are shared though with the receiver -creator, and solving them could help in other areas as configuration reload. - -Option 2, the converter, could mimic a good user experience based also on -components, what would feel natural for final users. The internal graph would -have a representation of the effective configuration, what would help on -observability. 
But its architecture exceeds a bit the scope of a converter, it -requires an additional connector, and explicit implementations when combined -with other features such as the receiver creator. - -Option 3, the config processor, could also mimic a good user experience based on -components. The internal graph would also have a representation of the effective -configuration. Additionally, adding support for config processors would -introduce an interesting extension point that could help to solve other feature -requests. It has though a more complex implementation, more coupled to changes -in the core, and it also requires an additional connector and explicit -implementations to be combined with other features such as the receiver creator. +We recommend Option 1 because it provides a more natural user experience, based +on optional components, and it works better when combined with the receiver +creator, which can be important in autodiscovery use cases. Using Options 2 and 3 +with the receiver creator will require additional development, and supporting +configurations [based on annotations](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427#issuecomment-2413401158) may be challenging, as these annotations +cannot be known when unmarshalling the collector configuration. + +The main reasons to avoid Option 1 are that it instantiates components on `Start()`, +which could break architectural boundaries, and that it would be difficult to +give visibility into the "effective configuration". However, these trade-offs are also +present to some extent in the receiver creator, which is already used in the OTel +collector ecosystem, and we consider them less relevant than the usability +trade-offs of Options 2 and 3. ## Prior art and alternatives