Prometheus push gateway support #20

Open
CHarnel opened this issue Nov 14, 2021 · 5 comments

Comments

@CHarnel
Member

CHarnel commented Nov 14, 2021

Let's consider adding support for the Prometheus Pushgateway (see the prom-client docs).

@gioragutt
Contributor

What are our expected use cases for the Pushgateway?

https://prometheus.io/docs/practices/pushing/

The document above describes when the Pushgateway is recommended (which, most of the time, it is not).

So I think that before this issue is addressed, we need to understand under which circumstances such a plugin would be used, so that we know how to handle lifecycle management.

For reference, here's a paragraph from the document explaining some of the fundamental differences between a service monitored by Prometheus in the usual pull style and one that uses the Pushgateway:

(*) The Pushgateway never forgets series pushed to it and will expose them to Prometheus forever unless those series are manually deleted via the Pushgateway's API.

The latter point(*) is especially relevant when multiple instances of a job differentiate their metrics in the Pushgateway via an instance label or similar. Metrics for an instance will then remain in the Pushgateway even if the originating instance is renamed or removed. This is because the lifecycle of the Pushgateway as a metrics cache is fundamentally separate from the lifecycle of the processes that push metrics to it. Contrast this to Prometheus's usual pull-style monitoring: when an instance disappears (intentional or not), its metrics will automatically disappear along with it. When using the Pushgateway, this is not the case, and you would now have to delete any stale metrics manually or automate this lifecycle synchronization yourself.
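To make the manual deletion mentioned in the quote concrete, here is a minimal sketch of the request it implies. The `/metrics/job/<job>/<label>/<value>` path layout and the `DELETE` method are the Pushgateway's own API; the function names, the gateway URL, and the label values are assumptions made up for illustration, and the sketch deliberately rejects label values that would need the Pushgateway's base64 encoding.

```typescript
// Hypothetical helper (not part of any library): builds the path that
// identifies one Pushgateway group from its job name and grouping labels.
function groupPath(job: string, labels: Record<string, string> = {}): string {
  const parts = ["metrics", "job", encodeURIComponent(job)];
  for (const name of Object.keys(labels)) {
    const value = labels[name];
    // Empty values and values containing "/" need the Pushgateway's
    // base64 form, which this sketch does not implement.
    if (value === "" || value.indexOf("/") !== -1) {
      throw new Error("label value for '" + name + "' needs base64 encoding");
    }
    parts.push(encodeURIComponent(name), encodeURIComponent(value));
  }
  return "/" + parts.join("/");
}

// Describes the HTTP request that deletes every series pushed under the
// given grouping key. Sending it is left to whatever HTTP client is in use.
function deleteGroupRequest(
  gatewayUrl: string,
  job: string,
  labels: Record<string, string> = {},
): { method: string; url: string } {
  const base = gatewayUrl.replace(/\/+$/, ""); // drop trailing slashes
  return { method: "DELETE", url: base + groupPath(job, labels) };
}
```

A caller would have to issue one such request per instance that has gone away, which is exactly the lifecycle synchronization the quoted paragraph warns about.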

@korengal
Contributor

In general, we want to be able to use Monitored with Prometheus on the web, in mobile apps, and in lambdas, like we did with StatsD.

@gioragutt
Contributor

gioragutt commented Jun 28, 2022

@korengal I figured. However, unlike the Pushgateway, StatsD has no lifecycle: you fire and forget your metrics and don't have to think about them again.

As stated above, when working with the Pushgateway you have to consider the lifecycle, since it affects how you (the metrics sender) interact with it (the gateway).

I think that the next step towards implementation is mapping each of our desired use cases (web, mobile, lambdas) to understand how each would interact with the gateway.

For example, web and mobile sessions are partitioned by something like a userId or deviceId and can be short-lived or long-lived, so you need to make sure that old metrics are discarded over time.

Lambdas (and cronjobs and friends) have a different lifecycle: they come and go, and no execution is related to another in terms of metrics grouping, so they require different behavior when working with the Pushgateway.
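As an illustration of "discarded over time" for session-style groups, here is a minimal sketch that selects groups whose last push is older than a time-to-live. The `PushedGroup` shape, the function name, and the TTL are all hypothetical; the only Pushgateway fact relied on is that it exposes a `push_time_seconds` value per group, which is where `lastPushSeconds` is assumed to come from.

```typescript
// Hypothetical description of one Pushgateway group.
interface PushedGroup {
  job: string;
  labels: Record<string, string>;
  // Unix time of the last push, e.g. the group's push_time_seconds value.
  lastPushSeconds: number;
}

// Returns the groups that have not been pushed to for longer than
// ttlSeconds. These are the candidates for deletion from the gateway.
function findStaleGroups(
  groups: PushedGroup[],
  nowSeconds: number,
  ttlSeconds: number,
): PushedGroup[] {
  return groups.filter(function (group) {
    return nowSeconds - group.lastPushSeconds > ttlSeconds;
  });
}
```

For web and mobile sessions, something would have to run this kind of check periodically; for lambdas grouped only by job name, each new push simply replaces the previous one, so no such sweep is needed.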

Once we map those out, we can understand how to implement this, which configuration options the plugin should expose (and what its scope should be in the first place), and so on.

@tomeresk
Member

For any case where pull-style collection is possible, it should be preferred (e.g. standard web servers).

For lambdas and cronjobs, as @gioragutt mentioned, the executions are unrelated and shouldn't require any lifecycle management, provided the metrics are reported without ephemeral information such as a machine or instance label.

For anything with a session id or some other uniquely identifying label, some lifecycle management would need to be implemented on the Pushgateway.
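The distinction drawn above can be sketched as a small decision about the grouping key. Everything here is hypothetical: the list of ephemeral label names, the `GroupingPlan` shape, and the function name are assumptions for illustration, not part of Monitored, prom-client, or the Pushgateway.

```typescript
// Assumed set of labels that identify a single short-lived execution or session.
const EPHEMERAL_LABELS = ["instance", "machine", "sessionId", "deviceId", "userId"];

interface GroupingPlan {
  grouping: Record<string, string>; // labels that stay in the grouping key
  needsCleanup: boolean; // true when an ephemeral label stays in the key
}

// Lambdas and cronjobs: call with keepEphemeral = false, so each run
// overwrites the same group and nothing accumulates on the gateway.
// Sessions: call with keepEphemeral = true, which keeps per-session groups
// and flags that stale groups will have to be deleted later.
function planGrouping(labels: Record<string, string>, keepEphemeral: boolean): GroupingPlan {
  const grouping: Record<string, string> = {};
  let needsCleanup = false;
  for (const name of Object.keys(labels)) {
    const ephemeral = EPHEMERAL_LABELS.indexOf(name) !== -1;
    if (ephemeral && !keepEphemeral) {
      continue; // drop the label so the group stays stable across runs
    }
    if (ephemeral) {
      needsCleanup = true;
    }
    grouping[name] = labels[name];
  }
  return { grouping: grouping, needsCleanup: needsCleanup };
}
```

A plugin built along these lines could expose `keepEphemeral` (or an equivalent option) per use case, which is one possible answer to the configuration question raised earlier in the thread.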

@mishaled

For AWS Lambdas, maybe a CloudWatch plugin would solve the issue.


5 participants