Add feature flag for target allocator config addition #1688

Merged
merged 6 commits from otelallocator-flag into open-telemetry:main on May 5, 2023

Conversation

swiatekm
Contributor

@swiatekm swiatekm commented Apr 28, 2023

These are changes from #1557, rebased on main and using the feature flag machinery from #1619.

One addition I've made is clearing the scrape configs in the prometheus receiver itself. This doesn't actually change the semantics, but it makes it clearer what is actually going on. The change is in a separate commit here: 15ac0ce.

I've done some manual E2E testing on these changes and everything worked as expected.

Fixes #1581

@swiatekm swiatekm force-pushed the otelallocator-flag branch from 15ac0ce to b3a56ab on April 28, 2023 09:42
@swiatekm swiatekm marked this pull request as ready for review April 28, 2023 10:09
@swiatekm swiatekm requested a review from a team April 28, 2023 10:09
// EnableTargetAllocatorRewrite is the feature gate that controls whether the collector's configuration should
// automatically be rewritten when the target allocator is enabled.
EnableTargetAllocatorRewrite = featuregate.GlobalRegistry().MustRegister(
"operator.enableTargetAllocatorRewrite",
Member

Sticking with the hierarchical pattern, how do you feel about operator.targetallocator.rewrite?

Contributor Author

That makes it sound like this changes target allocator behaviour, but in reality we're changing how the operator rewrites prometheusreceiver config. So maybe something like operator.prometheusreceiver.rewritetargetallocator? Kind of a mouthful, but more descriptive imo.

Member

Since the flag ultimately affects the collector, how about operator.collector.rewritetargetallocator? I think that would scale well with any other flags that mess with collector configuration/capabilities.

Contributor Author

Yeah, that sounds better.

Contributor Author

Renamed.
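
For context, a minimal sketch of what the registration might look like after the rename, using the collector's `featuregate` package. The stage, description, and package name here are illustrative assumptions, not necessarily the PR's exact values:

```go
package featuregates // hypothetical package name

import "go.opentelemetry.io/collector/featuregate"

// EnableTargetAllocatorRewrite controls whether the operator rewrites the
// collector's prometheus receiver configuration when the target allocator is enabled.
var EnableTargetAllocatorRewrite = featuregate.GlobalRegistry().MustRegister(
	"operator.collector.rewritetargetallocator",
	featuregate.StageAlpha, // assumed: disabled by default while the feature matures
	featuregate.WithRegisterDescription("automatically configure the collector's prometheus receiver to use the target allocator"),
)
```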

scrape_interval: 1m
scrape_timeout: 10s
evaluation_interval: 1m
target_allocator:
Contributor

do we have an explicit case for when the flag is disabled?

Contributor Author

It's disabled in previous tests, but only implicitly. Think it's worth disabling it explicitly there?

Contributor

Hmmm... I think it's alright then; when the flag is moved to beta we'll just do the reverse.
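
If an explicit disabled case ever becomes useful, a test could flip the gate for its duration. A rough sketch, assuming the gate has already been registered under the renamed ID and using hypothetical test and package names:

```go
package config_test // hypothetical test package

import (
	"testing"

	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/featuregate"
)

func TestConfigRenderingWithRewriteDisabled(t *testing.T) {
	// Alpha gates are disabled by default, so this Set mainly documents intent;
	// it also keeps the case correct if the gate later graduates to beta.
	// Assumes the gate was registered by importing the operator's feature gate package.
	require.NoError(t, featuregate.GlobalRegistry().Set("operator.collector.rewritetargetallocator", false))

	// ... render the collector configuration here and assert that the original
	// scrape_configs are preserved and no target_allocator section is added.
}
```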


// EnableTargetAllocatorRewrite is the feature gate that controls whether the collector's configuration should
// automatically be rewritten when the target allocator is enabled.
EnableTargetAllocatorRewrite = featuregate.GlobalRegistry().MustRegister(
Contributor

Can you also update the README with this?

Contributor Author

Done.

@@ -28,7 +28,7 @@ func errorNotAMap(component string) error {
return fmt.Errorf("%s property in the configuration doesn't contain valid %s", component, component)
}

-// ConfigToPromConfig converts the incoming configuration object into a the Prometheus receiver config.
+// ConfigToPromConfig converts the incoming configuration object into the Prometheus receiver config.
func ConfigToPromConfig(cfg string) (map[interface{}]interface{}, error) {
Contributor

This is technically a breaking change, could we mark that in the chloggen?

Contributor Author

I think I'd rather properly fix this check, as it's currently incorrect. Right now, it requires that the config property exists, but in reality we should check if one of the following is true:

  • config exists
  • target_allocator exists and target allocator is enabled on the resource
  • target allocator is enabled on the resource and the feature flag is enabled

Then this is really a fix, not a breaking change, and the actual breaking change will happen when the flag is enabled by default. How's that sound?
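
Roughly, the relaxed check described above could look something like this. A sketch only; the function name, arguments, and error text are hypothetical, assuming the Prometheus config arrives as the plain map that ConfigToPromConfig returns:

```go
package ta // hypothetical package

import "errors"

// validatePromConfig accepts the receiver configuration if any one of the three
// conditions listed above holds; this is a sketch, not the PR's actual code.
func validatePromConfig(promConfig map[interface{}]interface{}, allocatorEnabled, rewriteFlagEnabled bool) error {
	_, hasConfig := promConfig["config"]
	_, hasTargetAllocator := promConfig["target_allocator"]
	switch {
	case hasConfig:
		// An explicit Prometheus config block is present.
		return nil
	case allocatorEnabled && hasTargetAllocator:
		// The user supplied a target_allocator section themselves.
		return nil
	case allocatorEnabled && rewriteFlagEnabled:
		// The operator will inject the target_allocator section during config rewriting.
		return nil
	default:
		return errors.New("the Prometheus receiver configuration contains neither a config nor a target_allocator property, and automatic rewriting is not enabled")
	}
}
```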

Contributor

Yes, technically the comment on the function doesn't match what the code is actually doing, but regardless, anyone potentially using this function and expecting it to return the prometheus config blob would be broken by this change. I'm okay with calling this a 'fix', as long as we call that out in the changelog's subtext; if someone is using this for some reason, they should be able to see the difference in the release notes.

Contributor Author

For now I just added a note to the changelog about the function's semantics changing. For the validation, I took a stab at adding it, and it's a large enough change that I think it should go into a separate PR. Do you reckon we can merge this as is and do the validation afterwards? We can also go the other way around; the validation changes are mostly independent of this PR.

Contributor

@jaronoff97 jaronoff97 left a comment

One more Q: were you able to test this with a real cluster and CRD to be sure it works as expected E2E?

CollectorID: "${POD_NAME}",
}
// we don't need the scrape configs here anymore with target allocator enabled
cfg.PromConfig.ScrapeConfigs = []*promconfig.ScrapeConfig{}
Contributor

I actually think the collector may fail to start up if it doesn't have any scrape configs set, at least that was the case a few months ago...

Contributor

Oh ha, never mind, I fixed this a few months ago (link).

Contributor Author

Yeah, I actually tested this E2E, no issues.

@swiatekm swiatekm requested a review from jaronoff97 May 5, 2023 16:56
@swiatekm swiatekm force-pushed the otelallocator-flag branch from 5727d1d to 80173ff on May 5, 2023 16:57
@swiatekm
Contributor Author

swiatekm commented May 5, 2023

> One more Q: were you able to test this with a real cluster and CRD to be sure it works as expected E2E?

Yeah. For reference, I'm adding experimental support for replacing Prometheus with otel to our Helm Chart here: SumoLogic/sumologic-kubernetes-collection#2988, and I ran our integration tests against the changes in this PR.

#### Target Allocator config rewriting

The Prometheus receiver now has explicit support for acquiring scrape targets from the target allocator. As such, it is now possible to have the Operator add the necessary target allocator configuration automatically. This feature currently requires the `operator.collector.rewritetargetallocator` feature flag to be enabled. With the flag enabled, the configuration from the previous section would be rendered as:
Contributor

Note for @TylerHelmuth: in a follow-up to this we should move the feature flag section to its own thing to include this.
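
For illustration, the rendered prometheus receiver section that the README excerpt above refers to might end up looking roughly like the following. This is a sketch only: the endpoint is derived from the OpenTelemetryCollector name (assumed here to be `my-collector`), and the exact keys, values, and nesting depend on the operator and receiver versions in use; only `collector_id: ${POD_NAME}` and the emptied scrape configs are taken from this PR's code.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs: []   # cleared by the operator when rewriting is enabled
    target_allocator:
      endpoint: http://my-collector-targetallocator   # assumed service name for the CR "my-collector"
      interval: 30s                                   # illustrative value
      collector_id: ${POD_NAME}
```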

@jaronoff97 jaronoff97 merged commit a74dc8e into open-telemetry:main May 5, 2023
@swiatekm swiatekm deleted the otelallocator-flag branch May 5, 2023 20:04
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
…#1688)

* add feature flag for allocator config rewrite

* clear Prometheus scrape configs if allocator rewrite enabled

* add changelog entry

* rename flag to operator.collector.rewritetargetallocator

* fix changelog entry

* document the rewritetargetallocator flag

---------

Co-authored-by: Jacob Aronoff <[email protected]>
Linked issue (#1581): Automatically add target_allocator section to prometheus receiver config if allocator is enabled