kv: return error on locking request in LeafTxn #97817

Closed
nvanbenschoten opened this issue Feb 28, 2023 · 0 comments · Fixed by #99126
Labels
A-kv-client: Relating to the KV client and the KV interface.
A-kv-transactions: Relating to MVCC and the transactional model.
C-cleanup: Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
T-kv: KV Team

Comments


nvanbenschoten commented Feb 28, 2023

In #94290, we saw that locking requests could be issued through a LeafTxn and the lock spans would be silently dropped, leading to delayed unlocking once the locking transaction was committed. We don't support locking requests passed through a LeafTxn, so we should assert against this and return an error to clients that issue such requests.
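As a rough, self-contained Go sketch of the requested behavior (not the actual CockroachDB implementation), the check could look something like the following; every type and function name below is a simplified stand-in invented for illustration.

```
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the real kv types; the names here are illustrative
// and do not match the CockroachDB source.
type TxnType int

const (
	RootTxn TxnType = iota
	LeafTxn
)

// Request models a single KV request; IsLocking reports whether it acquires
// locks (e.g. a Get or Scan issued on behalf of SELECT FOR UPDATE).
type Request struct {
	Method    string
	IsLocking bool
}

// BatchRequest is a set of requests sent through a transaction handle.
type BatchRequest struct {
	Requests []Request
}

func (ba *BatchRequest) isLocking() bool {
	for _, r := range ba.Requests {
		if r.IsLocking {
			return true
		}
	}
	return false
}

// send sketches the assertion this issue asks for: a LeafTxn that receives a
// locking request should fail loudly instead of silently dropping lock spans.
func send(typ TxnType, ba *BatchRequest) error {
	if typ == LeafTxn && ba.isLocking() {
		return errors.New("LeafTxn cannot issue locking requests")
	}
	// ... normal request routing would continue here ...
	return nil
}

func main() {
	ba := &BatchRequest{Requests: []Request{{Method: "Get", IsLocking: true}}}
	fmt.Println(send(LeafTxn, ba)) // LeafTxn cannot issue locking requests
	fmt.Println(send(RootTxn, ba)) // <nil>
}
```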

Jira issue: CRDB-25806

nvanbenschoten added the C-cleanup, A-kv-client, A-kv-transactions, and T-kv labels Feb 28, 2023
miraradeva added a commit to miraradeva/cockroach that referenced this issue Mar 21, 2023
Previously, as noted in cockroachdb#94290, it was possible for a LeafTxn to issue
locking requests as part of SELECT FOR UPDATE. This behavior was
unexpected and the RootTxn wasn't properly cleaning up the locks,
resulting in others waiting for those locks to be released. The issue
was resolved, in cockroachdb#94399, by ensuring non-default locking strength
transactions don't use the streamer API and always run as RootTxn.

This patch adds an assertion on the kv side to prevent any other existing or
future attempts by a LeafTxn to issue locking requests. We don't expect any
such cases to exist, so we don't expect this assertion to fail, but we will
keep an eye on the nightly tests to make sure.

Fixes: cockroachdb#97817
Release note: None
craig bot pushed a commit that referenced this issue Mar 22, 2023
98741: ci: update bazel builder image r=rickystewart a=cockroach-teamcity

Release note: None
Epic: None


98878: backupccl: fix occasional TestRestoreErrorPropagates flake r=stevendanna a=adityamaru

Very rarely, under stress race, another automatic job would race with the restore and increment the error count. This would result in the count being greater than our expected value of 1. This patch disables all automatic jobs, eliminating the chance of this race.
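To make the race concrete, here is a small self-contained Go sketch of the failure mode; the cluster, knob, and job names are invented for illustration and are not the real test harness or testing knobs.

```
package main

import "fmt"

// A toy model of the flake: the test expects exactly one error to be counted,
// but background "automatic jobs" share the same error counter and can race
// with the restore under test.
type clusterKnobs struct {
	disableAutomaticJobs bool
}

type cluster struct {
	knobs      clusterKnobs
	errorCount int
}

// runRestoreThatFails models the restore under test, which is expected to
// record exactly one error.
func (c *cluster) runRestoreThatFails() { c.errorCount++ }

// maybeRunAutomaticJob models a background job that occasionally fails and
// bumps the same counter; with the knob set, it never runs.
func (c *cluster) maybeRunAutomaticJob(fails bool) {
	if c.knobs.disableAutomaticJobs {
		return
	}
	if fails {
		c.errorCount++
	}
}

func main() {
	// Flaky setup: a racing automatic job can push the count past 1.
	flaky := &cluster{}
	flaky.runRestoreThatFails()
	flaky.maybeRunAutomaticJob(true)
	fmt.Println(flaky.errorCount == 1) // false: the race fired

	// Deflaked setup: with automatic jobs disabled, only the restore's
	// error is counted.
	fixed := &cluster{knobs: clusterKnobs{disableAutomaticJobs: true}}
	fixed.runRestoreThatFails()
	fixed.maybeRunAutomaticJob(true)
	fmt.Println(fixed.errorCount == 1) // true
}
```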

Fixes: #98037

Release note: None

99099: kvserver: deflake TestReplicaTombstone r=andrewbaptist a=tbg

Like many other tests, this test could flake because we'd sometimes
catch a "cannot remove learner while snapshot is in flight" error.

I think the root cause is that sometimes there are errant Raft snapshots
in the system[^1] and these get mistaken for LEARNERs that are still
being caught up by the replicate queue. I tried to address this general
class of issues by making the check for in-flight learner snapshots not
care about *raft* snapshots.
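For illustration, here is a minimal, self-contained Go sketch of that idea, assuming a hypothetical distinction between learner and Raft snapshots; none of these names come from the CockroachDB source.

```
package main

import "fmt"

// snapshotKind distinguishes snapshots sent to catch up a LEARNER replica
// from errant Raft-queue snapshots; the names are illustrative only.
type snapshotKind int

const (
	learnerSnapshot snapshotKind = iota // sent by the replicate queue to catch up a learner
	raftSnapshot                        // sent by the Raft snapshot queue
)

type inFlightSnapshot struct {
	toReplicaID int
	kind        snapshotKind
}

// hasInFlightLearnerSnapshot reports whether the given replica still has a
// learner snapshot in flight. Ignoring raftSnapshot entries reflects the
// change described above: an errant Raft snapshot should not make a learner
// look like it is still being caught up.
func hasInFlightLearnerSnapshot(snaps []inFlightSnapshot, replicaID int) bool {
	for _, s := range snaps {
		if s.toReplicaID == replicaID && s.kind == learnerSnapshot {
			return true
		}
	}
	return false
}

func main() {
	snaps := []inFlightSnapshot{{toReplicaID: 3, kind: raftSnapshot}}
	// An errant Raft snapshot alone no longer blocks removal of the learner.
	fmt.Println(hasInFlightLearnerSnapshot(snaps, 3)) // false
}
```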

I was able to stress TestReplicaTombstone for 30+ minutes without a
failure using that approach, whereas previously it usually failed within
a few minutes.

```
./dev test --stress pkg/kv/kvserver/ --filter TestReplicaTombstone 2>&1 | tee stress.log
[...]
2461 runs so far, 0 failures, over 35m45s
```

[^1]: #87553

Fixes #98883.

Epic: none
Release note: None


99126: kv: return error on locking request in LeafTxn r=nvanbenschoten a=miraradeva

Previously, as noted in #94290, it was possible for a LeafTxn to issue locking requests as part of SELECT FOR UPDATE. This behavior was unexpected and the RootTxn wasn't properly cleaning up the locks, resulting in others waiting for those locks to be released. The issue was resolved, in #94399, by ensuring non-default locking strength transactions don't use the streamer API and always run as RootTxn.

This patch adds an assertion on the kv side to prevent any other existing or future attempts by a LeafTxn to issue locking requests. We don't expect any such cases to exist, so we don't expect this assertion to fail, but we will keep an eye on the nightly tests to make sure.
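As a rough illustration of what the new assertion guards, here is a self-contained, test-style Go sketch; the sendBatch helper and the txnType values are simplified stand-ins invented for this example, not the actual CockroachDB code.

```
package kvsketch

import (
	"errors"
	"testing"
)

// Simplified stand-ins mirroring the sketch in the issue description above;
// none of these names match the real CockroachDB source.
type txnType int

const (
	rootTxn txnType = iota
	leafTxn
)

// sendBatch returns an error when a locking batch is routed through a leaf
// transaction, which is the behavior this PR describes.
func sendBatch(typ txnType, batchIsLocking bool) error {
	if typ == leafTxn && batchIsLocking {
		return errors.New("LeafTxn cannot issue locking requests")
	}
	return nil
}

// TestLeafTxnRejectsLockingBatch checks both sides of the assertion.
func TestLeafTxnRejectsLockingBatch(t *testing.T) {
	if err := sendBatch(rootTxn, true); err != nil {
		t.Fatalf("RootTxn should accept locking batches, got %v", err)
	}
	if err := sendBatch(leafTxn, true); err == nil {
		t.Fatal("LeafTxn accepted a locking batch; expected an error")
	}
	if err := sendBatch(leafTxn, false); err != nil {
		t.Fatalf("non-locking batch through LeafTxn should succeed, got %v", err)
	}
}
```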

Fixes: #97817
Release note: None

99150: backupccl: stop logging unsanitized backup stmt in schedule executor r=stevendanna a=msbutler

Informs #99145

Release note: None

Co-authored-by: cockroach-teamcity <[email protected]>
Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Mira Radeva <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
craig bot closed this as completed in 4985599 Mar 22, 2023
blathers-crl bot pushed a commit that referenced this issue Mar 23, 2023
Previously, as noted in #94290, it was possible for a LeafTxn to issue
locking requests as part of SELECT FOR UPDATE. This behavior was
unexpected and the RootTxn wasn't properly cleaning up the locks,
resulting in others waiting for those locks to be released. The issue
was resolved, in #94399, by ensuring non-default locking strength
transactions don't use the streamer API and always run as RootTxn.

This patch adds an assertion on the kv side to prevent any other existing or
future attempts by a LeafTxn to issue locking requests. We don't expect any
such cases to exist, so we don't expect this assertion to fail, but we will
keep an eye on the nightly tests to make sure.

Fixes: #97817
Release note: None