release-21.2: rangefeed: fix panic due to rangefeed stopper race #76827
Backport 1/1 commits from #76825.
/cc @cockroachdb/release
Release justification: possible node crash.
This patch fixes a race condition that could cause an
`unexpected Stopped processor` panic if a rangefeed registration was
attempted while a store was stopping.
Registering a rangefeed panics if a newly created rangefeed processor is
unexpectedly stopped and the store's stopper is not quiescing. However,
the stopper has two distinct states that it transitions through:
stopping and quiescing. It's possible for the processor to fail to start
because the stopper is stopping, but before the stopper has transitioned
to quiescing, which would trigger this panic.
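The window described above can be illustrated with a minimal sketch. The `stopper` type, its two boolean fields, and `wouldPanic` are hypothetical simplifications invented for this example; they are not CockroachDB's actual stopper or rangefeed code, and only model the ordering of the two states.

```go
package main

import (
	"errors"
	"fmt"
)

// stopper is a hypothetical, simplified stand-in for a store's stopper.
// It models only the two states named above, set in this order:
// first stopping, then quiescing.
type stopper struct {
	stopping  bool
	quiescing bool
}

var errStopping = errors.New("stopper is stopping")

// runAsyncTask refuses new work as soon as the stopper is stopping,
// which is why a newly created processor can fail to start.
func (s *stopper) runAsyncTask() error {
	if s.stopping {
		return errStopping
	}
	return nil
}

// wouldPanic mirrors the shape of the pre-patch check: a processor that
// is stopped while the stopper is not yet quiescing is treated as a bug.
func wouldPanic(s *stopper) bool {
	processorStopped := s.runAsyncTask() != nil
	return processorStopped && !s.quiescing
}

func main() {
	s := &stopper{}
	fmt.Println(wouldPanic(s)) // false: store is running normally

	s.stopping = true
	fmt.Println(wouldPanic(s)) // true: stopping, but not yet quiescing

	s.quiescing = true
	fmt.Println(wouldPanic(s)) // false: shutdown is recognized
}
```

Only the middle state trips the check, which matches the narrow timing needed to hit the panic.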
This patch propagates the processor startup error to the rangefeed
registration and through to the caller, returning before attempting
the registration at all and avoiding the panic. This was confirmed with
50000 stress runs of `TestPGTest/pgjdbc`, all of which succeeded.
Resolves #76811.
Resolves #76767.
Resolves #76724.
Resolves #76655.
Resolves #76649.
Resolves #75129.
Resolves #64262.
Release note (bug fix): Fixed a race condition that in rare
circumstances could cause a node to panic with
`unexpected Stopped processor` during shutdown.
For details, see #76649 (comment).