The distributor ingestion rate limit consumes tokens from the rate limiter as soon as a request is received, before writing to ingesters:

cortex/pkg/distributor/distributor.go
Line 581 in 527f9b5

In the event of an ingester outage (e.g. 2+ ingesters are unavailable), this means that each tenant remote write request will consume tokens from its rate limiter even if samples have not been successfully ingested. The client (e.g. Prometheus) will retry writes, which will further consume tokens from the rate limiter until it eventually hits the rate limit, regardless of whether any samples have actually been ingested.

The burst should protect against this, but in the event of a relatively long outage we would end up consuming the burst too (e.g. we set the burst to 10x the rate limit).

I'm wondering if a better approach would be to check whether enough tokens are still available in the rate limiter once the request is received, but to actually consume them from the rate limiter only after samples have been successfully written to ingesters. Due to concurrency, the actual accepted rate could be higher than the limit, but we would err in favour of the customer instead of rate limiting writes we haven't actually ingested.
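For context, here is a minimal sketch of the current behaviour, assuming Go's golang.org/x/time/rate package. The real distributor uses a per-tenant limiter; the names, rate, and helpers below are illustrative, not the actual Cortex code:

```go
package distributor

import (
	"errors"
	"time"

	"golang.org/x/time/rate"
)

var errRateLimited = errors.New("ingestion rate limit exceeded") // would map to an HTTP 429

// Hypothetical per-tenant limiter: 1000 samples/sec with the burst set
// to 10x the rate, mirroring the configuration described above.
var ingestionRateLimiter = rate.NewLimiter(rate.Limit(1000), 10*1000)

// writeToIngesters stands in for the real replication logic.
func writeToIngesters(samples []int) error { return nil }

// push sketches the current behaviour: tokens are consumed up front,
// before any attempt to write to ingesters, so a failed write (e.g.
// 2+ ingesters down) still drains the limiter, and client retries
// drain it further.
func push(samples []int) error {
	if !ingestionRateLimiter.AllowN(time.Now(), len(samples)) {
		return errRateLimited
	}
	return writeToIngesters(samples)
}
```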
I think the rate-limiter package has a solution for this.
Instead of calling AllowN(), call ReserveN() and check .OK() to see whether the request is within the rate limit. Then, if the operation fails before ingestion, call Cancel() on the Reservation.
It appears that ReserveN+OK is not directly equivalent to AllowN, but it is close enough that we can use it. AllowN also checks that the rate limit is complied with immediately, whereas ReserveN will return OK even if the tokens only become available after some delay, so we just have to check that this delay is zero if we want the same behaviour.
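A sketch of that pattern, reusing the hypothetical limiter and helpers from the sketch above (ReserveN, Reservation.OK, Reservation.DelayFrom, and Reservation.Cancel are real golang.org/x/time/rate APIs; the rest is illustrative):

```go
// pushWithReservation consumes tokens tentatively: the reservation is
// cancelled (the tokens are returned) if the write to ingesters fails.
func pushWithReservation(samples []int) error {
	now := time.Now()
	r := ingestionRateLimiter.ReserveN(now, len(samples))

	// OK() is false only when len(samples) exceeds the limiter's burst.
	// To match AllowN's semantics we must also require that the tokens
	// are available immediately, i.e. that the delay is zero.
	if !r.OK() || r.DelayFrom(now) > 0 {
		r.CancelAt(now) // no-op when !OK(); otherwise returns the tokens
		return errRateLimited
	}

	if err := writeToIngesters(samples); err != nil {
		// Nothing was ingested: give the tokens back so that client
		// retries are not rate limited for work that never happened.
		r.Cancel()
		return err
	}
	return nil
}
```

Note that, unlike the check-then-consume idea above, ReserveN takes the tokens up front and Cancel returns them on failure, so concurrent requests cannot all pass a check against the same pool of tokens.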