Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: fix conflict resolution for high-priority, non-txn'al requests #83366

Merged

Conversation

nvanbenschoten
Copy link
Member

Fixes #83342.

This commit reworks the behavior of non-transactional requests in lock
wait-queues when the lock holder or the lock queue waiter has an extreme
priority (min or max priority). In such cases, we allow the lock queue
waiter to immediately push the lock holder out of its way, either by
moving its timestamp to resolve a read-write conflict or aborting it to
resolve a write-write conflict.

This handling was broken in two ways for non-transactional requests.

  1. these requests' priorities were not consulted when deciding whether
    to immediately push instead of temporarily delaying while waiting in the
    lock wait-queue. This meant that a high-priority, non-txn request might
    still wait for 50ms (kv.lock_table.coordinator_liveness_push_delay)
    before pushing a lower priority lock holder out of its way.
  2. worse, it was possible that if these requests were not in the front
    of a lock wait-queue, they might never push. This was because we had
    logic that disabled a push if it was not needed for the purposes of
    checking liveness, detecting deadlocks, or enforcing timeouts.

This commit resolves both of these issues. It also improves the testing
of transaction priorities in the kv/kvserver/concurrency package.
Finally, it consolidates the determination of when a pusher should be
able to push/abort a pushee into a single location.

Release note (bug fix): a bug in transaction conflict resolution which
could allow backups to wait on long-running transactions has been
resolved.

@nvanbenschoten nvanbenschoten requested a review from a team as a code owner June 25, 2022 01:11
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@adityamaru
Copy link
Contributor

Thanks for the prompt fix and explanation, I'll work on a backup level test with this fix. I've given the diff a look but I'll leave the review to @AlexTalks who is more familiar with this code.

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/exportPriority branch from 584f78e to 7bd4477 Compare July 4, 2022 21:30
Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

consider printing the push reason(s) like we were discussing

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @AlexTalks)

Fixes cockroachdb#83342.

This commit reworks the behavior of non-transactional requests in lock
wait-queues when the lock holder or the lock queue waiter has an extreme
priority (min or max priority). In such cases, we allow the lock queue
waiter to immediately push the lock holder out of its way, either by
moving its timestamp to resolve a read-write conflict or aborting it to
resolve a write-write conflict.

This handling was broken in two ways for non-transactional requests.
1. these requests' priorities were not consulted when deciding whether
   to immediately push instead of temporarily delaying while waiting in the
   lock wait-queue. This meant that a high-priority, non-txn request might
   still wait for 50ms (kv.lock_table.coordinator_liveness_push_delay)
   before pushing a lower priority lock holder out of its way.
2. worse, it was possible that if these requests were not in the front
   of a lock wait-queue, they might never push. This was because we had
   logic that disabled a push if it was not needed for the purposes of
   checking liveness, detecting deadlocks, or enforcing timeouts.

This commit resolves both of these issues. It also improves the testing
of transaction priorities in the `kv/kvserver/concurrency` package.
Finally, it consolidates the determination of when a pusher should be
able to push/abort a pushee into a single location.

Release note (bug fix): a bug in transaction conflict resolution which
could allow backups to wait on long-running transactions has been
resolved.
This commit adds the reason(s) for pushing a lock holder while in a lock
wait-queue to trace events.
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/exportPriority branch from b43f7dc to f6a7a2c Compare July 5, 2022 19:24
@nvanbenschoten
Copy link
Member Author

consider printing the push reason(s) like we were discussing

Done.

TFTR!

bors r=andreimatei

@craig
Copy link
Contributor

craig bot commented Jul 5, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 5, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 5, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 6, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jul 6, 2022

Build succeeded:

@craig craig bot merged commit d239a2f into cockroachdb:master Jul 6, 2022
@blathers-crl
Copy link

blathers-crl bot commented Jul 6, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 83ac099 to blathers/backport-release-21.2-83366: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


error creating merge commit from 83ac099 to blathers/backport-release-22.1-83366: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

backupccl: ExportRequest with MaxUserPriority times out when it encounters an intent
4 participants