Thanos, Prometheus and Golang version used
v0.3.2-rc.0
What happened
I'm using the --store.sd-files flag of thanos-query for thanos-store discovery.
When one of the stores dies, as far as I know the only way to notice is through the grpc_client_handled_total metric.
Otherwise I have to write a separate alert like absent(up{monitor="store-external-label"}) == 1 for each store.
What you expected to happen
It would be great to have a dynamic up metric, the same as in Prometheus, for each store in the discovery list.
Then a single ordinary alert, up != 1, would be enough.
How to reproduce it (as minimally and precisely as possible):
Add --store.sd-files=sd.yaml with a fake list of stores.
Try to tell from the output of thanos-query:10902/metrics that they are not available.
Full logs to relevant components
Only events like this appear in the logs:
level=warn ts=2019-03-02T17:48:11.883731563Z caller=storeset.go:308 component=storeset msg="update of store node failed" err="initial store client info fetch: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=fake.store.local:30901
Anything else we need to know
The only worry is making sure we don't clash with the Prometheus up metric. I think we could add this metric but name it a bit differently, like thanos_up. Thoughts?
Correct, and then you need to keep your alerts in sync with your store list (cluster="eu1", cluster="us1", etc.). The proposal is to have a volatile list of stores and a static alert that needs no changes when the store list changes.
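To make the proposal concrete, here is a minimal sketch of what a per-store gauge could look like in the Prometheus text exposition format. It is not Thanos code: the `address` label, the `probe` and `renderUp` helpers, and the plain TCP dial are all assumptions made for illustration, and only the `thanos_up` name comes from the suggestion above. A real implementation would more likely reuse the result of the store info fetch that already produces the warning in the logs.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"strings"
	"time"
)

// probe reports whether a TCP connection to addr succeeds within timeout.
// Hypothetical stand-in for the store health check; not how Thanos does it.
func probe(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// renderUp formats one gauge sample per store address, sorted by address,
// in Prometheus text exposition format. The "address" label is an assumption.
func renderUp(status map[string]bool) string {
	addrs := make([]string, 0, len(status))
	for a := range status {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)

	var b strings.Builder
	b.WriteString("# TYPE thanos_up gauge\n")
	for _, a := range addrs {
		v := 0
		if status[a] {
			v = 1
		}
		fmt.Fprintf(&b, "thanos_up{address=%q} %d\n", a, v)
	}
	return b.String()
}

func main() {
	// Store list as it might be resolved from the file given to --store.sd-files.
	stores := []string{"fake.store.local:30901"}

	status := make(map[string]bool, len(stores))
	for _, s := range stores {
		status[s] = probe(s, 200*time.Millisecond)
	}
	fmt.Print(renderUp(status))
}
```

With a series like this exposed for every discovered store, a single static rule such as thanos_up != 1 would cover the whole list, and adding or removing stores would not require touching the alerts.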
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.