
[processor/deltatocumulative]: expire stale series #31337

Merged: 23 commits merged into main on Mar 5, 2024
Conversation

@sh0rez (Member) commented Feb 20, 2024

Description: Removes stale series from tracking (and thus frees their memory) using staleness logic from #31089

Link to tracking Issue: #30705, #31016

Testing: TestExpiry
Documentation: README updated

Uses the staleness tracking map.

For now this does not actually expire entries, until lifecycle management is in place.

Adds mutex-sync to the staleness struct. This is especially important for item expiration, which modifies the pq and map serially; intermittent changes by other routines would lead to data inconsistencies here.
@sh0rez (Member, Author) commented Feb 20, 2024

@RichieSams please take a look at 7745b7e; I had to add mutex-sync to your staleness code.
I think it's worth picking that into your bring-up PR already.

Converts from Start() methods to context-based cancellation
@RichieSams (Contributor) commented:

> @RichieSams please take a look at 7745b7e; I had to add mutex-sync to your staleness code. I think it's worth picking that into your bring-up PR already.

IMO the mutex should be outside the structure. Taking a lock for each and every operation will get expensive fast, given the volume of data points the processor will handle. That is, the user should take the lock once, do all the operations, and then release it.
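For illustration, a minimal sketch of the coarse-grained locking being suggested: the mutex lives on the processor and is taken once per batch, rather than inside the staleness structure for every operation. The package, field names, and state map here are assumptions, not this PR's actual code.

package deltatocumulative // illustrative package name

import (
	"context"
	"sync"

	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/pmetric"
)

// Processor sketch: the mutex sits at the top level and is taken once per
// batch instead of once per Store/Delete inside the staleness map.
type Processor struct {
	mtx   sync.Mutex
	state map[string]pmetric.NumberDataPoint // stand-in for the staleness-tracked map
	next  consumer.Metrics
}

func (p *Processor) ConsumeMetrics(ctx context.Context, md pmetric.Metrics) error {
	p.mtx.Lock()         // one lock for the whole batch...
	defer p.mtx.Unlock() // ...released after every datapoint has been handled

	// ... walk md and update p.state per datapoint, no further locking ...
	return p.next.ConsumeMetrics(ctx, md)
}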

@RichieSams (Contributor) commented:
I do like your decision to pull the Map instance into a private variable and not expose it. That should prevent users from accidentally using it directly.

@sh0rez (Member, Author) commented Feb 20, 2024

@RichieSams I agree. The reason for the extensive locking here is to synchronize with the out-of-band "pruning" routine, which needs exclusive ownership while it runs. I do have a new idea for how to separate those, however; will try it out 👍

@RichieSams (Contributor) commented:
I need to update my PR with the new Staleness structs. But in my initial implementation, I just had a mutex in the Processor; ConsumeMetrics() takes it at the start. https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/30827/files#diff-677a13ed5d6a215261ff4e6a8a715738563e28501eb80ac0691352e1f8f316ebR31

You could either call ExpireOldEntries() at the beginning of ConsumeMetrics(), or use a time.Ticker and a goroutine to call it periodically (taking the lock before the call).
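A sketch of that ticker variant, continuing the hypothetical Processor sketch above (add "time" to its imports; expireOldEntries and interval are illustrative names, not this PR's API):

// startExpiry runs a periodic sweep via time.Ticker, taking the same
// processor lock around each call so it never races with ConsumeMetrics.
func (p *Processor) startExpiry(ctx context.Context, interval time.Duration) {
	go func() {
		tick := time.NewTicker(interval)
		defer tick.Stop()
		for {
			select {
			case <-tick.C:
				p.mtx.Lock()
				p.expireOldEntries() // hypothetical helper: drop entries stale for longer than max
				p.mtx.Unlock()
			case <-ctx.Done():
				return
			}
		}
	}()
}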

@sh0rez (Member, Author) commented Feb 20, 2024

@RichieSams separated things here. Some notes on my design:

  • essentially reverted my changes to your code; staleness is as it was before
  • moved locking + expiry into this processor's own internal package, as it's clearly too experimental to be shared yet
  • took the opportunity to not use a "dumb ticker", but instead sleep precisely until the earliest time a series may go stale (a rough sketch follows this comment)
  • calling expiry only in ConsumeMetrics would prevent series from going stale when no new samples are ingested, so we also need the ticker

In regard to locking, my current architecture takes global locks anyway. My approach is to first come up with a safe, correct, but possibly slow implementation and optimize later if required.
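For illustration, one possible shape of that "sleep exactly until the next expiry" loop, continuing the hypothetical Processor above. nextExpiry, expireOldEntries and the wake channel are assumptions modelled on this discussion, not the PR's exact code:

// expiryLoop sleeps until the earliest moment any tracked series can go
// stale, or until a Store signals that a new series arrived, then sweeps.
func (p *Processor) expiryLoop(ctx context.Context) {
	for {
		p.mtx.Lock()
		next, ok := p.nextExpiry() // hypothetical: earliest last-seen + max; ok=false if nothing is tracked
		p.mtx.Unlock()

		var due <-chan time.Time
		if ok {
			due = time.After(time.Until(next))
		} // if nothing is tracked, due stays nil and blocks forever

		select {
		case <-due: // the earliest series may now be stale
		case <-p.wake: // a Store added a series, re-evaluate the deadline
		case <-ctx.Done():
			return
		}

		p.mtx.Lock()
		p.expireOldEntries()
		p.mtx.Unlock()
	}
}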

@github-actions github-actions bot requested a review from RichieSams February 28, 2024 14:42
@sh0rez sh0rez marked this pull request as ready for review February 28, 2024 14:46
@sh0rez sh0rez requested review from a team and andrzej-stencel February 28, 2024 14:46
@sh0rez (Member, Author) commented Feb 28, 2024

@RichieSams @djaglowski @jpkrohling please take a look :)

Moves all locking to the upper level, resulting in a much clearer overall picture.

Refactors delta.Accumulator to the streams.Map interface, allowing
higher composability
@sh0rez (Member, Author) commented Mar 1, 2024

@RichieSams that was really good feedback!

I moved locking centrally into the highest level processor. Stale expiry is also called from there, leading to a very clear locking flow without hidden complexity.

Also refactored the delta.Accumulator to be a streams.Map itself (with the Store operation doing the accumulation), making it possible to wrap the Accumulator in Staleness and enabling further composability in the future (sketched below).
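A condensed sketch of how that composition can fit together; the method set of the Map interface and the field names are assumptions for illustration, not the exact interfaces introduced by this PR:

package streamsketch // illustrative only

import "time"

// Map: anything keyed by a stream identity that can store, delete and count
// entries. delta.Accumulator satisfies this by making Store() add the
// incoming delta onto the running cumulative value.
type Map[T any] interface {
	Store(id string, v T) error
	Delete(id string)
	Len() int
}

// Staleness wraps any Map and records a last-seen timestamp per stream on
// every Store, so stale streams can later be removed from the inner map.
type Staleness[T any] struct {
	Map[T]
	max      time.Duration
	lastSeen map[string]time.Time
}

func (s *Staleness[T]) Store(id string, v T) error {
	s.lastSeen[id] = time.Now()
	return s.Map.Store(id, v)
}

// ExpireOldEntries drops every stream not seen within max.
func (s *Staleness[T]) ExpireOldEntries() {
	for id, seen := range s.lastSeen {
		if time.Since(seen) > s.max {
			delete(s.lastSeen, id)
			s.Delete(id)
		}
	}
}

Because the Accumulator is itself a Map, wrapping it is just Staleness{Map: acc, max: ...}: callers keep using Store(), and expiry reaches through to the accumulated state.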

sh0rez added 4 commits March 1, 2024 17:14
Tests expiry functionality under real conditions, using a faked clock
@sh0rez (Member, Author) commented Mar 4, 2024

@RichieSams @djaglowski I think this is ready for a final review. Added docs and a thorough test case.

@djaglowski (Member) commented:
Code LGTM

@jpkrohling (Member) commented:
I guess the new release caused conflicts with this one. Once the conflicts are resolved, I'll merge this.

@jpkrohling jpkrohling merged commit f003e0e into open-telemetry:main Mar 5, 2024
142 checks passed
@github-actions github-actions bot added this to the next release milestone Mar 5, 2024
for {
if s.Len() == 0 {
Review comment (Contributor):
good catch

case n == 0:
<-e.sig
case n > 0:
expires := e.Staleness.Next().Add(e.Max)
Review comment (Contributor):
This has the potential of not being protected by the lock, since you release the lock when this function returns, but this runs in a goroutine.

IMO this whole struct could go away, and the Processor could keep Staleness structs at the top level instead of maps. While "elegant", I'm not sure the elegance of sleeping "exactly" until the next expiry outweighs potential bugs like this race, plus the extra logic you had to add to Store() to make sure you can preemptively wake things up.

The handling of n and the calculation of expires need to happen outside the goroutine, so they're protected by the lock.

Review comment (Contributor):
My vote would be to replace this logic with a simple Ticker every, say, 5 minutes (we could make that a config value if we want).

Review comment (Member, Author):
That's a really good catch, and it says quite something about the complexity that no one noticed it before.

The issue I see with a timer is that streams could live up to 2*max-1 seconds if the timer fires at a bad moment.

Review comment (Contributor):
I believe it would be max+tickerInterval seconds. IMO in most real-world scenarios max will be >> the ticker interval, e.g. max is 1 hour and the ticker is 5 minutes, so a series lasting 1h 4.999min isn't the end of the world.

There is precedent for this kind of expiry schema in other metrics tools, like mtail

We could allow the tickerInterval to be configurable, and add config validation that max > tickerInterval, with a big doc warning that tickerInterval should be much smaller than max to ensure a bad lineup of the tick doesn't allow series to last too long.
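A sketch of what that validation could look like (assuming "fmt" and "time" are imported); the Config fields and names here are hypothetical, not this processor's actual configuration:

// Hypothetical config: series are dropped after max_stale without updates,
// checked every expiry_interval.
type Config struct {
	MaxStale       time.Duration `mapstructure:"max_stale"`
	ExpiryInterval time.Duration `mapstructure:"expiry_interval"`
}

func (c *Config) Validate() error {
	if c.MaxStale <= 0 {
		return fmt.Errorf("max_stale must be positive, got %s", c.MaxStale)
	}
	if c.ExpiryInterval <= 0 || c.ExpiryInterval >= c.MaxStale {
		// keep the sweep interval much smaller than max_stale so an unlucky
		// tick cannot extend a stale series far beyond max_stale
		return fmt.Errorf("expiry_interval (%s) must be positive and smaller than max_stale (%s)", c.ExpiryInterval, c.MaxStale)
	}
	return nil
}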


// "try-send" to notify possibly sleeping expiry routine
select {
case e.sig <- struct{}{}:
Review comment (Contributor):
This means that for every store, we're potentially starting a new "sleep" goroutine for expiry. Lots of stores in quick succession could build up many goroutines that are all sleeping, since each iteration of the loop creates a new one.

Review comment (Member, Author):
Are you sure?

AFAICT there is only one routine (started in Processor.Start). The linked code here is a try-send operation that either sends an empty struct into the channel (if the other routine happens to be listening) or does nothing at all.
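For reference, the non-blocking "try-send" idiom being described, shown standalone (a generic Go pattern rather than this PR's exact code):

// wake notifies a possibly sleeping routine without ever blocking the caller:
// if nobody is currently receiving on sig, the default case fires and the
// signal is simply dropped.
func wake(sig chan struct{}) {
	select {
	case sig <- struct{}{}: // the expiry routine was waiting and is now woken
	default: // nobody listening (or it is already awake); do nothing
	}
}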

Review comment (Contributor):
In processor.Start() we launch a goroutine that does this:

for {
	p.mtx.Lock()
	next := p.exp.ExpireOldEntries()
	p.mtx.Unlock()

	select {
	case <-next:
	case <-p.ctx.Done():
		return
	}
}

ExpireOldEntries() is overridden above. It starts a goroutine to do the sleep. That's the goroutine I'm talking about.

By sending a struct to the channel, you cause next to return early, i.e. it doesn't wait for the sleep to finish / the channel to be closed. That is the behaviour you want, but it also abandons that goroutine. The code will immediately loop and call ExpireOldEntries() again, which spawns a new goroutine, and so on.

@RichieSams (Contributor) commented:
Sorry for the delayed final review; we got a puppy this weekend that's been eating all my time 😁

I added some small comments about the expiry and a race condition. Otherwise, LGTM

DougManton pushed a commit to DougManton/opentelemetry-collector-contrib that referenced this pull request Mar 13, 2024
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this pull request Mar 13, 2024