[UI] Hide unhealthy stores after some time #748

Closed
metalmatze opened this issue Jan 17, 2019 · 6 comments
Comments

@metalmatze
Contributor

Thanos, Prometheus and Golang version used
This was on Kubernetes running Thanos v0.2.1.

What happened
I was looking at the Stores UI and saw a lot of unhealthy stores even though some were gone for 192+ hours.

What you expected to happen
I don't really care about stores that have been reported as unhealthy for more than, I would say, 24 hours.
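
A minimal sketch of one possible filtering semantics, assuming each store records when it was last seen healthy. The `StoreStatus` type, the `visibleStores` function, and the 24-hour cutoff are hypothetical names and values chosen for illustration; they are not taken from the Thanos code base.

```go
package main

import (
	"fmt"
	"time"
)

// StoreStatus is a hypothetical row of the Stores page (not the real Thanos type).
type StoreStatus struct {
	Name        string
	Healthy     bool
	LastHealthy time.Time // last time a health check succeeded
}

// visibleStores keeps every healthy store, plus unhealthy stores that were
// still healthy within the last maxUnhealthy duration.
func visibleStores(stores []StoreStatus, now time.Time, maxUnhealthy time.Duration) []StoreStatus {
	visible := make([]StoreStatus, 0, len(stores))
	for _, s := range stores {
		if s.Healthy || now.Sub(s.LastHealthy) <= maxUnhealthy {
			visible = append(visible, s)
		}
	}
	return visible
}

// demo runs the filter on made-up data and returns the names that remain.
func demo() []string {
	now := time.Date(2019, time.January, 17, 14, 0, 0, 0, time.UTC)
	stores := []StoreStatus{
		{Name: "store-a", Healthy: true, LastHealthy: now},
		{Name: "store-b", Healthy: false, LastHealthy: now.Add(-2 * time.Hour)},
		{Name: "store-c", Healthy: false, LastHealthy: now.Add(-192 * time.Hour)},
	}
	var names []string
	for _, s := range visibleStores(stores, now, 24*time.Hour) {
		names = append(names, s.Name)
	}
	return names
}

func main() {
	fmt.Println(demo()) // [store-a store-b]
}
```

With this rule, a store that has been gone for 192 hours is dropped from the list, while a store that only became unhealthy two hours ago is still shown.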

How to reproduce it (as minimally and precisely as possible):
Not actually sure how to reproduce this. Probably deploy a new store every now and then, but keep at least one store around to persist the gossip state?!

Full logs to relevant components

Anything else we need to know
I would be happy to take a look at this and try to fix it, if you could give me some guidance on what semantics the filtering should follow.

screenshot from 2019-01-17 14-49-53

/cc @bwplotka

@adrien-f
Member

Mmm, I'm not sure about this one. I tried to mimic Prometheus' Targets page when doing this. Does it remove unhealthy targets after a while?

If those stores are supposed to be gone, shouldn't the Query config be updated with a fresher list of stores?

@bwplotka
Member

bwplotka commented Jan 21, 2019

It does remove them (hopefully), but it seems like they are still on the page.

The main problem is that we identify those StoreAPIs by IP, and that's quite wrong. When a pod restarts, the IP is different, but it's actually the same StoreAPI (: maybe that's a bug?
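
To illustrate why keying by IP leaves stale entries behind, here is a minimal sketch. The `storeRef` type, its `Identity` field, and the addresses are hypothetical; `Identity` stands in for any stable name that survives a pod restart, and nothing here reflects how Thanos actually tracks stores.

```go
package main

import "fmt"

// storeRef is a hypothetical observation of a store at one point in time.
type storeRef struct {
	Addr     string // changes whenever the pod restarts
	Identity string // assumed stable across restarts
	Healthy  bool
}

// countEntries tracks observations in a map under the given key and
// returns how many distinct entries a UI built on that map would list.
func countEntries(observations []storeRef, key func(storeRef) string) int {
	seen := map[string]storeRef{}
	for _, o := range observations {
		seen[key(o)] = o // a later observation overwrites an earlier one with the same key
	}
	return len(seen)
}

// demoCounts simulates one store restarting twice with a new IP each time.
func demoCounts() (byAddr, byIdentity int) {
	obs := []storeRef{
		{Addr: "10.0.0.1:10901", Identity: "store-0", Healthy: false},
		{Addr: "10.0.0.7:10901", Identity: "store-0", Healthy: false},
		{Addr: "10.0.0.9:10901", Identity: "store-0", Healthy: true},
	}
	byAddr = countEntries(obs, func(s storeRef) string { return s.Addr })
	byIdentity = countEntries(obs, func(s storeRef) string { return s.Identity })
	return byAddr, byIdentity
}

func main() {
	byAddr, byIdentity := demoCounts()
	fmt.Println(byAddr, byIdentity) // 3 1
}
```

Keyed by address, the same store shows up three times, two of them unhealthy forever; keyed by a stable identity, it shows up once with its latest state.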

@ipstatic
Contributor

@bwplotka Did another PR resolve this?

@bwplotka
Member

Sorry, I should have linked the fix (thought it was linked before): #910

@bwplotka
Member

It's fixed on master, yes, thanks to @adrien-f

@ipstatic
Contributor

Great!
