
Implement rate limiting #256

Open
Teneroy opened this issue Nov 13, 2024 · 1 comment

Comments

Teneroy commented Nov 13, 2024

Reasoning and Description:
To prevent high spending and to avoid hitting AI Core rate limits, we need to introduce our own rate limiting per cluster that uses our companion.

Tasks:

  1. Make sure our traces in Langfuse carry identifiers of the clusters that use our companion.

  Whenever we receive a request, we need to:

  2. Pull the number of tokens used by the cluster from 00:00 UTC to 23:59 UTC (possible through the Langfuse API) or within a 24-hour window (whichever is easier).
  3. If the number of tokens is higher than a constant we set, return a message to the user saying that they have exceeded their token allowance and should come back after 23:59 UTC, or after X minutes (if we go with the 24-hour approach). See the sketch after the acceptance criteria below.

Acceptance criteria:

  • Traces are identified per cluster
  • Total token usage is pulled within the agreed time range and compared against the constant
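A minimal sketch of how the per-request check could look on the companion side. Assumptions: the companion serves requests over HTTP with FastAPI, a cluster ID is available per request, and `enforce_token_limit`, `usage_lookup`, and `TOKEN_LIMIT_PER_CLUSTER_PER_DAY` are hypothetical names; the actual limit constant still has to be agreed on.

```python
from datetime import datetime, timezone
from typing import Callable

from fastapi import HTTPException  # assumption: the companion exposes its API via FastAPI

# Placeholder daily budget per cluster; the real constant still needs to be agreed on.
TOKEN_LIMIT_PER_CLUSTER_PER_DAY = 1_000_000


def enforce_token_limit(
    cluster_id: str,
    usage_lookup: Callable[[str, datetime, datetime], int],
) -> None:
    """Reject the request if the cluster has already used up today's token budget.

    `usage_lookup` is expected to return the total tokens consumed by the cluster
    between the two timestamps, e.g. by querying the Langfuse API (see the sketch
    in the comment below).
    """
    now = datetime.now(timezone.utc)
    window_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    window_end = window_start.replace(hour=23, minute=59)

    used_tokens = usage_lookup(cluster_id, window_start, now)
    if used_tokens > TOKEN_LIMIT_PER_CLUSTER_PER_DAY:
        minutes_left = max(int((window_end - now).total_seconds()) // 60, 0)
        raise HTTPException(
            status_code=429,
            detail=(
                "Daily token usage for this cluster has been exceeded. "
                f"Please come back after 23:59 UTC (in about {minutes_left} minutes)."
            ),
        )
```

If we go with the rolling 24-hour window instead, only `window_start`/`window_end` change (now minus 24 hours, and "try again in X minutes" derived from the oldest counted observation).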
Teneroy commented Nov 13, 2024

https://api.reference.langfuse.com/#get-/api/public/observations
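A rough sketch of how the day's token usage could be summed through this endpoint, assuming the cluster identifier from task 1 is attached to traces as the `userId` (it could equally be a tag or metadata field). The exact query parameters and the shape of the `usage` object should be verified against the reference above; `get_cluster_token_usage` is just an illustrative name.

```python
import os
from datetime import datetime

import requests

LANGFUSE_HOST = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
AUTH = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])


def get_cluster_token_usage(cluster_id: str, start: datetime, end: datetime) -> int:
    """Sum the token usage reported by Langfuse observations for one cluster."""
    total, page = 0, 1
    while True:
        # Assumption: the cluster ID is stored as the trace/observation userId.
        # Parameter and field names should be double-checked against the API reference.
        resp = requests.get(
            f"{LANGFUSE_HOST}/api/public/observations",
            auth=AUTH,
            params={
                "userId": cluster_id,
                "fromStartTime": start.isoformat(),
                "toStartTime": end.isoformat(),
                "page": page,
                "limit": 100,
            },
            timeout=30,
        )
        resp.raise_for_status()
        body = resp.json()
        for obs in body.get("data", []):
            usage = obs.get("usage") or {}
            total += usage.get("total") or usage.get("totalTokens") or 0
        if page >= body.get("meta", {}).get("totalPages", 1):
            return total
        page += 1
```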
