-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rangefeed: Ensure Close is safe even if Start failed #110942
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
The change is verified via stress test on the unit test linked in the issue. |
@@ -228,6 +224,7 @@ func (f *RangeFeed) Start(ctx context.Context, spans []roachpb.Span) error { | |||
defer pprof.SetGoroutineLabels(ctx) | |||
ctx = pprof.WithLabels(ctx, pprof.Labels(append(f.extraPProfLabels, "rangefeed", f.name)...)) | |||
pprof.SetGoroutineLabels(ctx) | |||
f.running.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a small race here where the caller calls Close()
before the goroutine is spawned (or the counter is incremented). This is probably fine, since we'll reap the goroutine shortly anyway due to the cancelled context, but we could avoid it by incrementing this on the main goroutine and decrementing it both in the RunAsyncTask error case and when the goroutine terminates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we may have to do this, because the API contract says that it's invalid to call Add()
on a zero-valued counter after calling Wait()
.
Rangefeed Start may fail if the attempt to start async task (the rangefeed) fails due to server shutdown. If that happens, Close call would block indefinitely, waiting for the rangefeed tasks that was never started, to terminate. Fixes cockroachdb#110350 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911 and @erikgrinaker)
pkg/kv/kvclient/rangefeed/rangefeed.go
line 227 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Actually, we may have to do this, because the API contract says that it's invalid to call
Add()
on a zero-valued counter after callingWait()
.
Yeah; on the one hand, you shouldn't call Close if Start errored out; on the other hand, better safe than sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911 and @miretskiy)
pkg/kv/kvclient/rangefeed/rangefeed.go
line 227 at r1 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Yeah; on the one hand, you shouldn't call Close if Start errored out; on the other hand, better safe than sorry.
This can race even if Start()
succeeds: the caller may call Close()
before the goroutine gets scheduled.
Bors r+ |
Build succeeded: |
Rangefeed Start may fail if the attempt to start async task (the rangefeed) fails due to server shutdown. If that happens, Close call would block indefinitely, waiting for the rangefeed tasks that was never started, to terminate.
Fixes #110350
Release note: None