Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.1: kv/concurrency: drop uncontended replicated lock on unreplicated upgrade #50119

Merged

Conversation

nvanbenschoten
Copy link
Member

Backport 1/1 commits from #49980.

/cc @cockroachdb/release


Fixes #49658.
Informs #9521.
Informs #49973.
Related to #49684.

This commit tweaks the lockTable's handling of lock acquisition to drop write-uncontended locks when upgraded from the Unreplicated to Replicated durability in much the same way we drop Replicated locks when first acquired. This is possible because a Replicated lock is also stored as an MVCC intent, so it does not need to also be stored in the lockTable if writers are not queuing on it. This is beneficial because it serves as a mitigation for #49973 and avoids the 99th percentile latency regression observed in #49658. Since we aren't currently great at avoiding excessive contention on limited scans when locks are in the lockTable, it's better the keep locks out of the lockTable when possible.

If any of the readers do truly contend with this lock even after their limit has been applied, they will notice during their MVCC scan and re-enter the queue (possibly recreating the lock through AddDiscoveredLock). Still, in practice this seems to work well in avoiding most of the artificial concurrency discussed in #49973. It's a bit of a hack and I am very interested in fixing this fully in the future (through an approach like #33373 or by incrementally consulting the lockTable in a lockAwareIterator), but for now, I don't see a downside to make this change.

I intend to backport this change to v20.1, as it's causing issues in one of the demos we like to run: #49658.

Release note (performance improvement): limited SELECT statements now do a better job avoiding unnecessary contention with UPDATE and SELECT FOR UPDATE statements.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Fixes cockroachdb#49658.
Informs cockroachdb#9521.
Informs cockroachdb#49973.
Related to cockroachdb#49684.

This commit tweaks the `lockTable`'s handling of lock acquisition to
drop write-uncontended locks when upgraded from the Unreplicated to
Replicated durability in much the same way we drop Replicated locks when
first acquired. This is possible because a Replicated lock is also
stored as an MVCC intent, so it does not need to also be stored in the
lockTable if writers are not queuing on it. This is beneficial because
it serves as a mitigation for cockroachdb#49973 and avoids the 99th percentile
latency regression observed in cockroachdb#49658. Since we aren't currently great
at avoiding excessive contention on limited scans when locks are in the
lockTable, it's better the keep locks out of the lockTable when
possible.

If any of the readers do truly contend with this lock even after their
limit has been applied, they will notice during their MVCC scan and
re-enter the queue (possibly recreating the lock through
AddDiscoveredLock). Still, in practice this seems to work well in
avoiding most of the artificial concurrency discussed in cockroachdb#49973. It's a
bit of a hack and I am very interested in fixing this fully in the
future (through an approach like cockroachdb#33373 or by incrementally consulting
the lockTable in a `lockAwareIterator`), but for now, I don't see a
downside to make this change.

I intend to backport this change to v20.1, as it's causing issues in one
of the demos we like to run.

Release note (performance improvement): limited SELECT statements
now do a better job avoiding unnecessary contention with UPDATE and
SELECT FOR UPDATE statements.
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@nvanbenschoten nvanbenschoten merged commit 0f0ecda into cockroachdb:release-20.1 Jun 12, 2020
@nvanbenschoten
Copy link
Member Author

@sumeerbhola I just tracked down a pretty bad bug with this that could cause txns to stall while running YCSB. I'll send out a patch, which we'll need to make sure lands in the same 20.1 patch release as this PR.

@nvanbenschoten nvanbenschoten deleted the backport20.1-49980 branch June 15, 2020 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants