Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: compatible requests should be released when a request drops out of lock wait queue #111144

Closed
arulajmani opened this issue Sep 22, 2023 · 1 comment · Fixed by #113636
Closed
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented Sep 22, 2023

Describe the problem

Consider the following construction:

key A: 
- [holder: (req1: Shared, txn1)]
- wait queue: [(req2: Exclusive, txn2), (req3: Shared, txn3), (req4: Shared txn4)] 

If the exclusive locking request, req2, drops out of the wait queue (say it's session gets killed or whatever), the shared locking requests (req3 and req4) no longer need to wait -- they're compatible with the lock holder, and the request they were waiting on (req2) has dropped out. They should be allowed to proceed. This doesn't happen currently.

Jira issue: CRDB-31788

@arulajmani arulajmani added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team A-read-committed Related to the introduction of Read Committed labels Sep 22, 2023
@nvanbenschoten nvanbenschoten self-assigned this Sep 26, 2023
@nvanbenschoten
Copy link
Member

The two pieces of logic that may be related to this are:

if !kl.isLocked() && doneRemoval {
// The first request in the queuedLockingRequests should always be an
// inactive, transactional locking request if the lock isn't held. That may
// no longer be true if the guy we removed above was serving this purpose;
// the call to maybeReleaseCompatibleLockingRequests should fix that. And if
// it wasn't serving that purpose, it'll be a no-op.
kl.maybeReleaseCompatibleLockingRequests()
}

// If the request is conflicting with a held lock then it pushes its
// holder synchronously - there is no way it will be able to proceed
// until the lock's transaction undergoes a state transition (either
// completing or being pushed) and then updates the lock's state
// through intent resolution. The request has a dependency on the
// entire conflicting transaction.
//
// However, if the request is conflicting with another request (that has
// claimed the lock, but not yet acquired it) then it pushes the
// claimant transaction asynchronously while continuing to listen to
// state transition in the lockTable. This allows the request to cancel
// its push if the conflicting claimant transaction exits the lock
// wait-queue without leaving behind a lock. In this case, the request
// has a dependency on the conflicting request but not necessarily the
// entire conflicting transaction.
if timerWaitingState.held {
return w.pushLockTxn(ctx, req, timerWaitingState)
}

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 28, 2023
…tRequests

Informs cockroachdb#91545.
Informs cockroachdb#100193.

This commit expands TestLockTableConcurrentRequests to support Shared,
Exclusive, and Intent locking strength, in addition to both Replicated
and Unreplicated locking durabilities. This provides randomized coverage
of the lock table with these combinations.

The commit then temporarily disables Shared locks in the test, which
occasionally fail with the panic: `tryMakeNewDistinguished called with new claimant txn`.
This is related to cockroachdb#111144, so we can re-enable Shared locks when that issue
is resolved.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Sep 29, 2023
…tRequests

Informs cockroachdb#91545.
Informs cockroachdb#100193.

This commit expands TestLockTableConcurrentRequests to support Shared,
Exclusive, and Intent locking strength, in addition to both Replicated
and Unreplicated locking durabilities. This provides randomized coverage
of the lock table with these combinations.

The commit then temporarily disables Shared locks in the test, which
occasionally fail with the panic: `tryMakeNewDistinguished called with new claimant txn`.
This is related to cockroachdb#111144, so we can re-enable Shared locks when that issue
is resolved.

Release note: None
craig bot pushed a commit that referenced this issue Sep 29, 2023
111465: kv: add all lock strengths and durabilities to TestLockTableConcurrentRequests r=nvanbenschoten a=nvanbenschoten

Informs #91545.
Informs #100193.

This commit expands `TestLockTableConcurrentRequests` to support Shared, Exclusive, and Intent locking strength, in addition to both Replicated and Unreplicated locking durabilities. This provides randomized coverage of the lock table with these combinations.

The commit then temporarily disables Shared locks in the test, which occasionally fail with the panic: `tryMakeNewDistinguished called with new claimant txn`. This is related to #111144, so we can re-enable Shared locks when that issue is resolved.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Oct 6, 2023
…tRequests

Informs cockroachdb#91545.
Informs cockroachdb#100193.

This commit expands TestLockTableConcurrentRequests to support Shared,
Exclusive, and Intent locking strength, in addition to both Replicated
and Unreplicated locking durabilities. This provides randomized coverage
of the lock table with these combinations.

The commit then temporarily disables Shared locks in the test, which
occasionally fail with the panic: `tryMakeNewDistinguished called with new claimant txn`.
This is related to cockroachdb#111144, so we can re-enable Shared locks when that issue
is resolved.

Release note: None
aliher1911 pushed a commit to aliher1911/cockroach that referenced this issue Oct 9, 2023
…tRequests

Informs cockroachdb#91545.
Informs cockroachdb#100193.

This commit expands TestLockTableConcurrentRequests to support Shared,
Exclusive, and Intent locking strength, in addition to both Replicated
and Unreplicated locking durabilities. This provides randomized coverage
of the lock table with these combinations.

The commit then temporarily disables Shared locks in the test, which
occasionally fail with the panic: `tryMakeNewDistinguished called with new claimant txn`.
This is related to cockroachdb#111144, so we can re-enable Shared locks when that issue
is resolved.

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue Nov 1, 2023
A locking request must actively wait in a lock's wait queues if:
- it conflicts with any of the lock holders.
- or it conflicts with a lower sequence numbered request already in
the lock's wait queue.

As a result, if a locking request exits a lock's wait queue without
actually acquiring the lock, it may allow other locking requests to
proceed. This patch recomputes wait queues whenever a locking request
exits a lock's wait queues to detect such scenarios and unblock requests
which were actively waiting previously not no longer need to.

Fixes cockroachdb#111144

Release note: None
craig bot pushed a commit that referenced this issue Nov 6, 2023
112078: roachtest: clean up command-line flags r=RaduBerinde a=RaduBerinde

**This PR is only for the last commit. The rest are #111811**

---

#### roachtest: clean up command-line flags

The code around command-line flags is pretty messy: they're defined in
many places; the name and description of a flag are far away from
the variable; the variable names look like local variables and in many
cases it's not obvious we're accessing a global.

This commit moves all flags to a separate subpackage,
`roachtestflags`, making all uses of global flags obvious. We also add
a bit of infrastructure to allow defining all information about a flag
right next to where the variable is declared.

We also provide a `Changed()` function that determines if a flag value
was changed (without the caller having to use the Command or even the
flag name).

There should be no functional changes (just some cosmetic improvements
to the flag usage texts).

Epic: none
Release note: None


113245: kvserver: add metric and log when raft.Storage returns an error r=erikgrinaker a=sumeerbhola

The raft.storage.error metric is incremented on an error, and the error is logged every 30s (across all replicas).

This was motivated by a test cluster that slowed to a crawl because of deliberate data loss, but was hard to diagnose. The metric could be used for alerting, since we don't expect to see transient errors.

Informs #113053

Epic: none

Release note: None

113335: kvpb: delete ErrorDetail message r=nvanbenschoten a=nvanbenschoten

This was unused, so delete it.

The message has been unused since 0c12f6c.

Epic: None
Release note: None

113636: concurrency: recompute wait queues when locking requests drop out  r=nvanbenschoten a=arulajmani

First commit from #112732

----

A locking request must actively wait in a lock's wait queues if:
- it conflicts with any of the lock holders.
- or it conflicts with a lower sequence numbered request already in
the lock's wait queue.

As a result, if a locking request exits a lock's wait queue without
actually acquiring the lock, it may allow other locking requests to
proceed. This patch recomputes wait queues whenever a locking request
exits a lock's wait queues to detect such scenarios and unblock requests
which were actively waiting previously not no longer need to.

Fixes #111144

Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
@craig craig bot closed this as completed in 515d068 Nov 6, 2023
blathers-crl bot pushed a commit that referenced this issue Nov 6, 2023
A locking request must actively wait in a lock's wait queues if:
- it conflicts with any of the lock holders.
- or it conflicts with a lower sequence numbered request already in
the lock's wait queue.

As a result, if a locking request exits a lock's wait queue without
actually acquiring the lock, it may allow other locking requests to
proceed. This patch recomputes wait queues whenever a locking request
exits a lock's wait queues to detect such scenarios and unblock requests
which were actively waiting previously not no longer need to.

Fixes #111144

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue Nov 8, 2023
A locking request must actively wait in a lock's wait queues if:
- it conflicts with any of the lock holders.
- or it conflicts with a lower sequence numbered request already in
the lock's wait queue.

As a result, if a locking request exits a lock's wait queue without
actually acquiring the lock, it may allow other locking requests to
proceed. This patch recomputes wait queues whenever a locking request
exits a lock's wait queues to detect such scenarios and unblock requests
which were actively waiting previously not no longer need to.

Fixes cockroachdb#111144

Release note: None
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants