[target-allocator] fix updating scrape configs #1620
Conversation
Force-pushed from c4c0d82 to 45996a9.
@@ -67,31 +67,33 @@ func NewDiscoverer(log logr.Logger, manager *discovery.Manager, hook discoveryHo

func (m *Discoverer) ApplyConfig(source allocatorWatcher.EventSource, cfg *config.Config) error {
	m.configsMap[source] = cfg
	newJobToScrapeConfig := make(map[string]*config.ScrapeConfig)
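For context, a minimal sketch of the shape the fix takes: rebuild the job-to-scrape-config mapping from scratch on each apply, and only commit it once the discovery manager accepts the new config. Everything past the first two lines is an illustration of that idea rather than the exact merged code, and it assumes the Discoverer holds a jobToScrapeConfig map and a Prometheus discovery manager.

func (m *Discoverer) ApplyConfig(source allocatorWatcher.EventSource, cfg *config.Config) error {
	m.configsMap[source] = cfg

	// Rebuild the mapping from scratch so jobs that disappeared from the
	// incoming config no longer linger in the old map.
	newJobToScrapeConfig := make(map[string]*config.ScrapeConfig)
	discoveryCfg := make(map[string]discovery.Configs)
	for _, scrapeConfig := range cfg.ScrapeConfigs {
		newJobToScrapeConfig[scrapeConfig.JobName] = scrapeConfig
		discoveryCfg[scrapeConfig.JobName] = scrapeConfig.ServiceDiscoveryConfigs
	}

	// Only commit the new mapping after the discovery manager accepts the
	// config; on failure, keep the last known-good state.
	if err := m.manager.ApplyConfig(discoveryCfg); err != nil {
		return err
	}
	m.jobToScrapeConfig = newJobToScrapeConfig
	return nil
}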
I'm worried about the memory implications of doing this. Rather than doing it this way, I think we should listen to the delete/update events we receive from the Kubernetes API and use them to make the desired change here. Right now we treat everything as a generic event and then re-grab the whole config, when it would probably be more efficient to just use what Kubernetes gives us.
That sounds like a much larger refactor that should have its own issue, which I'd rather not mix with this (fairly straightforward) bug fix. I can change this to pre-allocate based on the length of the existing map and run some kind of benchmark, but intuitively it doesn't seem like a deal breaker - populating a map from (small) strings to pointers should be fast even at thousands of elements.
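For illustration, the pre-allocation mentioned above would look roughly like the hypothetical helper below; rebuildJobMap is a made-up name for this sketch, not something in the repo.

// rebuildJobMap builds the replacement mapping, using the size of the previous
// mapping as a capacity hint so the map rarely needs to grow mid-rebuild.
func rebuildJobMap(prev map[string]*config.ScrapeConfig, cfg *config.Config) map[string]*config.ScrapeConfig {
	next := make(map[string]*config.ScrapeConfig, len(prev))
	for _, sc := range cfg.ScrapeConfigs {
		next[sc.JobName] = sc
	}
	return next
}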
A benchmark would definitely be helpful here – we have some existing benchmarks in the repo that you should be able to run, IIRC. cc @kristinapathak, what are your thoughts on this? The worry I have is churning a map that contains some pretty beefy configs at unknown intervals.
Correct me if I'm misunderstanding your point, but we don't actually allocate anything for the configs themselves here, just pointers to them. At the point where this update happens, those have already been allocated. What this change adds is additional deallocations for configs we don't need anymore, which would've normally stayed in this map forever.
To be clear, I think your point that doing this may be excessive is valid, but this bugfix doesn't change this behaviour much.
I think this map is pretty small given its value objects are pointers, and the bug of a map growing unbounded seems worse than this fix.
Force-pushed from 27bdc2a to 611e5e7:
Only take scrape configs from the most recently applied config, and only save them if we've successfully updated.
Force-pushed from 611e5e7 to 5dc5e72.
@jaronoff97 I added a simple benchmark for the affected function. Performance is slightly lower, but still less than 100ns per call on a 1,000-element map. Before the change:
After the change:
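The sketch below shows the general shape of such a benchmark with Go's testing package; the package name, function name, and job count here are illustrative rather than the actual benchmark added in the PR.

package target

import (
	"fmt"
	"testing"

	"github.com/prometheus/prometheus/config"
)

func BenchmarkRebuildJobToScrapeConfig(b *testing.B) {
	// Build a config with 1000 scrape jobs, matching the map size discussed above.
	cfg := &config.Config{}
	for i := 0; i < 1000; i++ {
		cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, &config.ScrapeConfig{
			JobName: fmt.Sprintf("job-%d", i),
		})
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// The operation under test: repopulating the job -> scrape config map.
		jobToScrapeConfig := make(map[string]*config.ScrapeConfig)
		for _, sc := range cfg.ScrapeConfigs {
			jobToScrapeConfig[sc.JobName] = sc
		}
	}
}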
Thanks so much for running the benchmark. Going to let the tests run, then review again :)
* [target-allocator] fix updating scrape configs: only take scrape configs from the most recently applied config, and only save them if we've successfully updated.
* [target-allocator] drop unnecessary job to scrape config map
* [target-allocator] add discoverer benchmark
Only take scrape configs from the most recently applied config, and only save them if we've successfully updated. As a result, we now correctly get rid of targets after a ServiceMonitor or PodMonitor is deleted, fixing #1415.
I've also fixed an issue where we would overwrite the job-to-scrape-config mapping even if the actual update failed. The test for this was incorrect: it didn't deep-copy the mapping, and so effectively asserted that two references to the same map were equal.
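To illustrate that pitfall, here is a generic, self-contained example of comparing a map against a mere reference to itself versus a real copy; it is not the actual test code.

package main

import (
	"fmt"
	"reflect"
)

func main() {
	jobToScrapeConfig := map[string]string{"job-a": "cfg-a"}

	// Pitfall: "before" is just another reference to the same map, so any later
	// mutation shows up through both variables and the comparison always passes.
	before := jobToScrapeConfig
	jobToScrapeConfig["job-b"] = "cfg-b"
	fmt.Println(reflect.DeepEqual(before, jobToScrapeConfig)) // true, even though the map changed

	// Fix: copy the entries first, so the later comparison is meaningful.
	copied := make(map[string]string, len(jobToScrapeConfig))
	for k, v := range jobToScrapeConfig {
		copied[k] = v
	}
	jobToScrapeConfig["job-c"] = "cfg-c"
	fmt.Println(reflect.DeepEqual(copied, jobToScrapeConfig)) // false, as expected
}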
I've modified the existing discovery test to look at scrape config updates, and added another job to the test data to make sure it gets deleted after an update.
I've also done a proper manual e2e test on this, and it worked as expected.