release-21.2: gossip: provide online method to clear leaked gossip infos #85776
Backport 1/1 commits from #85505.
/cc @cockroachdb/release
Fixes #85013.
Needed (in v21.2.X) for cockroachlabs/support#1709.
This commit introduces a new `crdb_internal.unsafe_clear_gossip_info` builtin
function which allows admin users to manually clear info objects from the
cluster's gossip network. The function does so by re-gossiping an identical
value for the specified key but with a TTL that is long enough to reasonably
ensure full propagation to all nodes in the cluster but short enough to expire
quickly once propagated.
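
The mechanism can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the `info`, `store`, and `clearInfo` names are hypothetical, the store models a single node only, and the 2-minute TTL is an illustrative value rather than the one the builtin uses.

```go
package main

import (
	"fmt"
	"time"
)

// info is a simplified, hypothetical stand-in for a gossip info object.
type info struct {
	value     string
	expiresAt time.Time // zero value means "no TTL"
}

// store models one node's view of the gossip network.
type store map[string]info

// add records value under key; a ttl of 0 means the info never expires.
func (s store) add(key, value string, ttl time.Duration, now time.Time) {
	i := info{value: value}
	if ttl > 0 {
		i.expiresAt = now.Add(ttl)
	}
	s[key] = i
}

// get returns the value for key unless it is missing or has expired.
func (s store) get(key string, now time.Time) (string, bool) {
	i, ok := s[key]
	if !ok || (!i.expiresAt.IsZero() && !now.Before(i.expiresAt)) {
		return "", false
	}
	return i.value, true
}

// clearInfo re-adds the identical value under key with a short TTL, so the
// entry expires on its own. It reports whether the key was present.
func clearInfo(s store, key string, ttl time.Duration, now time.Time) bool {
	v, ok := s.get(key, now)
	if !ok {
		return false
	}
	s.add(key, v, ttl, now)
	return true
}

// demo clears a leaked key and checks its visibility before and after the TTL.
func demo() (cleared, visibleBeforeTTL, visibleAfterTTL bool) {
	now := time.Unix(0, 0)
	s := store{}
	s.add("leaked-key", "stale", 0, now) // no TTL: would otherwise never expire
	cleared = clearInfo(s, "leaked-key", 2*time.Minute, now) // TTL is an assumed value
	_, visibleBeforeTTL = s.get("leaked-key", now.Add(time.Minute))
	_, visibleAfterTTL = s.get("leaked-key", now.Add(3*time.Minute))
	return
}

func main() {
	fmt.Println(demo()) // true true false
}
```

The value is left untouched and only the expiration changes, which is why the TTL has to outlast propagation: every node must receive the short-lived copy before it expires.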
The function is best-effort. It is possible for the info object with the low
TTL to fail to reach full propagation before reaching its TTL. For instance,
this is possible during a transient network partition. The effect of this is
that the existing gossip info object with a higher (or no) TTL would remain
in the gossip network on some nodes and may eventually propagate back out to
other nodes once the partition heals.
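
The failure mode above can be sketched with a toy two-node model. This is a simplified illustration under stated assumptions, not the real gossip protocol: the `node` type and its `live`, `purge`, and `gossipTo` methods are hypothetical, and real gossip also compares info timestamps, which this sketch omits.

```go
package main

import (
	"fmt"
	"time"
)

// info is a simplified, hypothetical stand-in for a gossip info object.
type info struct {
	value     string
	expiresAt time.Time // zero value means "no TTL"
}

// node models one node's local copy of the gossip infos.
type node map[string]info

// live reports whether key is present and unexpired on this node at time now.
func (n node) live(key string, now time.Time) bool {
	i, ok := n[key]
	return ok && (i.expiresAt.IsZero() || now.Before(i.expiresAt))
}

// purge drops expired infos, as a node would once a TTL elapses.
func (n node) purge(now time.Time) {
	for k, i := range n {
		if !i.expiresAt.IsZero() && !now.Before(i.expiresAt) {
			delete(n, k)
		}
	}
}

// gossipTo copies live infos that the peer does not currently hold to the peer.
func (n node) gossipTo(peer node, now time.Time) {
	for k, i := range n {
		if n.live(k, now) && !peer.live(k, now) {
			peer[k] = i
		}
	}
}

// demo runs a clear on node A while node B is partitioned, then heals.
func demo() (onADuringPartition, onAAfterHeal bool) {
	now := time.Unix(0, 0)
	a := node{"leaked-key": {value: "stale"}} // no TTL on either node
	b := node{"leaked-key": {value: "stale"}}

	// The clear reaches A only; B is partitioned and keeps the no-TTL copy.
	a["leaked-key"] = info{value: "stale", expiresAt: now.Add(2 * time.Minute)}

	later := now.Add(3 * time.Minute) // the short TTL has elapsed
	a.purge(later)
	onADuringPartition = a.live("leaked-key", later)

	// The partition heals and B gossips its surviving copy back to A.
	b.gossipTo(a, later)
	onAAfterHeal = a.live("leaked-key", later)
	return
}

func main() {
	fmt.Println(demo()) // false true
}
```

In this model the key disappears from node A while the partition lasts and reappears once node B can gossip again, so an operator may need to run the clear a second time after the partition heals.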
@knz: I'm assigning this to you for a review both because you're as good a
person as any to look at gossip-related changes, and because limited SQL
access to the cluster's gossip network is a nuanced subject in the
context of multi-tenancy.
Release note: None
Release justification: Useful tool to clean up leaked gossip information.