gossip, server, ui: Node liveness records from dead nodes never time out #15609
Comments
I haven't confirmed, but this might also be aggravating #15342 |
@tamird I assume you aren't actually working on this? It's definitely annoying, but we'll probably have to push it to 1.2. |
Correct.
|
This also affects decommissioned nodes, which as of #17553 stay displayed in the admin UI permanently after being decommissioned. |
How do we feel about making it such that If the node actually still exists, restarting it will make it non-dead, and then it can be recommissioned. A node that just dies for a few days usually won't be decommissioned (unless the owner intends it to not come back, in which case we do actually want to hide it). Am I missing caveats here? |
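For illustration, here is a rough sketch of one way to read the proposal above. It assumes the idea is to hide only nodes that are both dead and decommissioned; the exact condition is not spelled out in the comment, so treat that as a guess. `NodeStatus` and `visible_nodes` are hypothetical names, not anything from the CockroachDB codebase.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical summary of a node as the admin UI might see it."""
    node_id: int
    dead: bool
    decommissioned: bool


def visible_nodes(nodes):
    """Keep every node except those that are both dead and decommissioned.

    Assumption: this is the hiding rule being proposed in the thread.
    """
    return [n for n in nodes if not (n.dead and n.decommissioned)]


nodes = [
    NodeStatus(node_id=1, dead=False, decommissioned=False),  # healthy
    NodeStatus(node_id=2, dead=True, decommissioned=False),   # down for a few days
    NodeStatus(node_id=3, dead=False, decommissioned=True),   # restarted, can be recommissioned
    NodeStatus(node_id=7, dead=True, decommissioned=True),    # gone for good
]
shown = [n.node_id for n in visible_nodes(nodes)]
```

Under this rule a node that merely died stays visible, and a decommissioned node that is restarted becomes visible again because it is no longer dead.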
That seems to be what most people want. The only caveat I'm aware of is that (like cockroaches) getting rid of them has proven to be hard -- they just keep reappearing after we think we've killed the last of them. |
@vivekmenezes could we assign someone to take on this decommissioning node issue? We won't have time to tackle it on the admin UI team. |
I think this is mostly done, @m-schneider to confirm and close. |
@tamird mentioned this recently but doesn't appear to have filed an issue for it. There's no reason for us to put such a long TTL on node liveness records and/or not remove them for dead nodes. Not removing it is causing node 7 to show up on blue's debug/problemranges page even though node 7 was removed from the cluster days ago.
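To make the TTL idea concrete, a minimal sketch of expiring stale liveness records might look like the following. It is not how gossip or node liveness is actually implemented; `LivenessRecord`, `prune_expired`, and the 60-second TTL are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class LivenessRecord:
    """Hypothetical stand-in for a gossiped node liveness record."""
    node_id: int
    last_heartbeat: float  # seconds on whatever clock the caller uses


def prune_expired(records, now, ttl):
    """Return only the records whose last heartbeat is within `ttl` of `now`."""
    return {
        node_id: rec
        for node_id, rec in records.items()
        if now - rec.last_heartbeat <= ttl
    }


records = {
    1: LivenessRecord(node_id=1, last_heartbeat=995.0),
    7: LivenessRecord(node_id=7, last_heartbeat=100.0),  # removed from the cluster long ago
}
live = prune_expired(records, now=1000.0, ttl=60.0)
print(sorted(live))  # [1]
```

With a short enough TTL, the record for a node like node 7 would drop out on its own instead of lingering on pages such as debug/problemranges.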