gossip, server, ui: Node liveness records from dead nodes never time out #15609

Closed
a-robinson opened this issue May 2, 2017 · 8 comments

Comments

@a-robinson
Contributor

@tamird mentioned this recently but doesn't appear to have filed an issue for it. There's no reason for us to put such a long TTL on node liveness records and/or not remove them for dead nodes. Not removing them is causing node 7 to show up on blue's debug/problemranges page even though node 7 was removed from the cluster days ago:

[Image: screen shot 2017-05-02 at 2 50 00 pm]

@a-robinson a-robinson added this to the 1.1 milestone May 2, 2017
@tamird tamird self-assigned this May 4, 2017
@knz knz added the in progress label May 4, 2017
@a-robinson
Contributor Author

I haven't confirmed, but this might also be aggravating #15342

@knz knz removed the in progress label May 6, 2017
@a-robinson
Contributor Author

@tamird I assume you aren't actually working on this? It's definitely annoying, but we'll probably have to push it to 1.2.

@tamird
Contributor

tamird commented Aug 2, 2017 via email

@benesch
Contributor

benesch commented Aug 11, 2017

This also impacts decommissioned nodes, which display in the admin UI permanently after decommission as of #17553:

[Image: screenshot 2017-08-09 11 23 24]

@tbg tbg changed the title gossip: Node liveness records from dead nodes never time out gossip, server, ui: Node liveness records from dead nodes never time out Dec 12, 2017
@tbg
Member

tbg commented Dec 12, 2017

How do we feel about making it such that dead && decommissioned excludes the node in all listings?

If the node actually still exists, restarting it will make it non-dead, and then it can be recommissioned.

A node that just dies for a few days usually won't be decommissioned (unless the owner intends it to not come back, in which case we do actually want to hide it).

Am I missing caveats here?
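
For illustration, the proposed rule could be sketched in Go roughly as follows. This is a minimal sketch, not CockroachDB's actual code: `NodeStatus` and `visibleNodes` are hypothetical names, and the two booleans stand in for whatever the real liveness record exposes.

```go
package main

// NodeStatus is a hypothetical, simplified view of a node's liveness record.
// It is an assumption for this sketch, not a real CockroachDB type.
type NodeStatus struct {
	NodeID         int
	Live           bool // false once the node is considered dead
	Decommissioned bool // true once the operator has decommissioned the node
}

// visibleNodes returns the nodes that should appear in listings. A node is
// hidden only when it is both dead and decommissioned; a node that is merely
// dead, or decommissioned but still running, stays visible.
func visibleNodes(nodes []NodeStatus) []NodeStatus {
	var out []NodeStatus
	for _, n := range nodes {
		if !n.Live && n.Decommissioned {
			continue // dead && decommissioned: exclude from all listings
		}
		out = append(out, n)
	}
	return out
}
```

Under this rule, restarting a hidden node flips `Live` back to true, so it reappears in listings and can be recommissioned.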

@a-robinson
Contributor Author

That seems to be what most people want. The only caveat I'm aware of is that (like cockroaches) getting rid of them has proven to be hard -- they just keep reappearing after we think we've killed the last of them.

@dianasaur323
Contributor

@vivekmenezes could we assign someone to take on this decommissioning node issue? We won't have time to tackle it on the admin UI team.

@tbg
Member

tbg commented Mar 14, 2018

I think this is mostly done, @m-schneider to confirm and close.
