Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/client: Make the lease manager thread-safe #28223

Merged
merged 1 commit into from
Aug 3, 2018

Conversation

a-robinson
Copy link
Contributor

This race was exceptionally rare due to how the lease manager is
typically used by the sqlmigrations package, but it is indeed a race.

Holding a mutex while making a remote RPC is usually a terrible idea,
but in the context of how it's used it's actually more dangerous to let
ExtendLease and ReleaseLease interleave, since if ReleaseLease's CPut
fails then the sqlmigrations package will log.Fatal, and the only
potential for lock contention is between one goroutine using ExtendLease
and one running ReleaseLseas. Perhaps this is tuning the package too
tightly to the needs of its client, but as of now it's its only client.

Fixes #28222

Release note: None

@a-robinson a-robinson requested review from andreimatei and a team August 2, 2018 22:14
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 61 at r1 (raw file):

	key roachpb.Key
	val struct {
		syncutil.Mutex

This would be better with a single-element channel semaphore instead of a mutex. That way the lock acquisition can be composed with a select on ctx.Done().

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as far as I'm concerned

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 57 at r1 (raw file):

// Lease contains the state of a lease on a particular key.
// Should only be passed by pointer, not by value.

you can't pass it by value since it now has the mutex. I'd get rid of the comment.


pkg/internal/client/lease.go, line 61 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This would be better with a single-element channel semaphore instead of a mutex. That way the lock acquisition can be composed with a select on ctx.Done().

But where would that composition be useful? This lock is never supposed to be held for long. IOW, why did you chose this particular lock to make this implementation point?
In fact we could make the *LeaseVal an atomic.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 61 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

But where would that composition be useful? This lock is never supposed to be held for long. IOW, why did you chose this particular lock to make this implementation point?
In fact we could make the *LeaseVal an atomic.

oh nvm, I missed where this lock was really held. Perhaps we should do something else indeed.

This race was exceptionally rare due to how the lease manager is
typically used by the sqlmigrations package, but it is indeed a race.

Holding a semaphore while making a remote RPC is usually a terrible
idea, but in the context of how it's used it's actually more dangerous
to let ExtendLease and ReleaseLease interleave, since if ReleaseLease's
CPut fails then the sqlmigrations package will log.Fatal, and the only
potential for lock contention is between one goroutine using ExtendLease
and one running ReleaseLease. Perhaps this is tuning the package too
tightly to the needs of its client, but as of now it's its only client.

Fixes cockroachdb#28222

Release note: None
Copy link
Contributor Author

@a-robinson a-robinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 57 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

you can't pass it by value since it now has the mutex. I'd get rid of the comment.

I always forget that about go. Done.


pkg/internal/client/lease.go, line 61 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

oh nvm, I missed where this lock was really held. Perhaps we should do something else indeed.

I had been relying on the calls to m.db.CPut respecting the ctx and bailing out early when it's cancelled, but you're right that this is a bit more explicit. Changed.

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 57 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I always forget that about go. Done.

The lint warning about copying a mutex no longer applies since it is now a channel.

Copy link
Contributor Author

@a-robinson a-robinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/internal/client/lease.go, line 57 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The lint warning about copying a mutex no longer applies since it is now a channel.

Right, but it's also not dangerous to pass a channel by value.

@a-robinson
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Aug 3, 2018
28174: changefeedccl: test that the initial scan only emits the latest value r=nvanbenschoten a=danhhz

Release note: None

28223: internal/client: Make the lease manager thread-safe r=a-robinson a=a-robinson

This race was exceptionally rare due to how the lease manager is
typically used by the sqlmigrations package, but it is indeed a race.

Holding a mutex while making a remote RPC is usually a terrible idea,
but in the context of how it's used it's actually more dangerous to let
ExtendLease and ReleaseLease interleave, since if ReleaseLease's CPut
fails then the sqlmigrations package will log.Fatal, and the only
potential for lock contention is between one goroutine using ExtendLease
and one running ReleaseLseas. Perhaps this is tuning the package too
tightly to the needs of its client, but as of now it's its only client.

Fixes #28222

Release note: None

Co-authored-by: Daniel Harrison <[email protected]>
Co-authored-by: Alex Robinson <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 3, 2018

Build succeeded

@craig craig bot merged commit c9cfe12 into cockroachdb:master Aug 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants