Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concurrency: explore removing the concept of reservations from the lock table #103361

Closed
arulajmani opened this issue May 16, 2023 · 0 comments · Fixed by #103478
Closed

concurrency: explore removing the concept of reservations from the lock table #103361

arulajmani opened this issue May 16, 2023 · 0 comments · Fixed by #103478
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented May 16, 2023

Describe the problem

In a conversation with @nvanbenschoten and @AlexTalks, we concluded that the concept of reservations could likely be removed from the lock table; instead, we could just treat the first transactional queued writer as the reservation holder (without actually naming the concept as such). Doing so would allow us to get rid of some complex state transitions that happen when trying to acquiring reservations/reservation breaking -- these involve moving requests between the queuedWriters list and the reservation, potentially back and forth. This issue serves as a placeholder to explore that idea (maybe we missed something), and if viable, to actually do so.

Additional Context

Related to Nathan's comments on #102973.

Jira issue: CRDB-27962

@arulajmani arulajmani added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team A-read-committed Related to the introduction of Read Committed labels May 16, 2023
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 16, 2023
This patch pulls out logic to compute which transaction requests in a
lock's wait queue are waiting on. This is forward looking to a time
where we'll support multiple lock holders on a different key. More near
term, once we get rid of reservations and replace them with the first
non-transactional inactive writer, this will serve as a single point to
make this change.

Informs: cockroachdb#103361

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 16, 2023
This patch removes the notion of reservations from the lock table.
Reservations served as a claim that prevented multiple requests from
racing when a lock was released. Typically, when a lock was released,
only the first transactional writer was released from the list of
queued writers. It would do so by claiming a "reservation" on the
lock.

All requests that are sequenced through the lock table are associated
with a sequence number based on arrival order. These sequence numbers
are used to uphold ~fairness as requests are sequenced. They also serve
a correctness purpose -- because all locks are not known upfront (as
uncontended replicated locks may be discovered during evaluation),
sequence numbers are used to break potential deadlocks that arise from
out of order locking. This motivated the concept of reservation
breaking, which could happen if a lower sequence number request
encountered a reservation by a request with a higher sequence number.
This would lead to somewhat complex state management, where requests
could  move from being reservations to inactive waiters multiple times
during their lifetime. A lot of this can be simplified if we make no
distinction between a reservation and an inactive waiter.

This patch gets rid of reservations entirely. Instead, it offers a new
invariant:

The head of the list of waiting writers should always be an inactive,
transactional writer if the lock isn't held.

In practice, this works out functionally the same as how reservations
operated, albeit with less state transitions. This should be visible
in the datadriven test churn as part of this patch -- what was
previously a reservation is now the first inactive queued writerl at the
lock.

Closes cockroachdb#103361

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 17, 2023
This patch pulls out logic to compute which transaction requests in a
lock's wait queue are waiting on. This is forward looking to a time
where we'll support multiple lock holders on a different key. More near
term, once we get rid of reservations and replace them with the first
transactional inactive writer, this will serve as a single point to
make this change.

Informs: cockroachdb#103361

Release note: None
@arulajmani arulajmani self-assigned this May 17, 2023
craig bot pushed a commit that referenced this issue May 18, 2023
103265: kv: resolve conflicting intents immediately for latency-sensitive requests r=nvanbenschoten a=nvanbenschoten

Fixes #50390.
Related to #60585.
Related to #103126.

Closes #64500, which was an earlier attempt to solve this issue using a similar approach. This commit addresses the comments on that PR, which focused on the pagination of intent resolution when bypassing the request batcher. We still don't try to solve this issue, and instead limit the cases where intent resolution bypasses the request batcher to those where pagination is not necessary.

This commit adds a new `sendImmediately` option to the `IntentResolver` `ResolveOptions`, which instructs the `IntentResolver` to send the provided intent resolution requests immediately, instead of adding them to a batch and waiting up to 10ms (`defaultIntentResolutionBatchWait`) for that batch to fill up with other intent resolution requests. This can be used to avoid any batching-induced latency and should be used only by foreground traffic that is willing to trade off some throughput for lower latency.

The commit then sets this flag for single-key intent resolution requests initiated by the `lockTableWaiter`. Unlike most calls into the `IntentResolver`, which are performed by background tasks that are more than happy to trade increased latency for increased throughput, the `lockTableWaiter` calls into the `IntentResolver` on behalf of a foreground operation. It wants intent resolution to complete as soon as possible, so it is willing to trade reduced throughput for reduced latency.

I tested this out by writing 10,000 different intents in a normal-priority transaction and then scanning over the table using a high-priority transaction. The test was performed on a 3-node roachprod cluster to demonstrate the effect with realistic network and disk latencies.
```sql
-- session 1
CREATE TABLE keys (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys SELECT generate_series(1, 10000);

-- session 2
BEGIN PRIORITY HIGH; SELECT count(*) FROM keys;
```

Without this change, the scan took **70.078s**. With this change, the scan took **15.958s**. This **78%** speed-up checks out. Each encountered intent is resolved serially (see #103126), so the **per-intent latency** drops from **7ms** to **1.6ms.** This improvement by about 5ms agrees with the `defaultIntentResolutionBatchIdle`, which delays each resolution request that passes through a RequestBatcher. With this change, these resolve intent requests are issued immediately and this delay is not experienced.

While this is a significant improvement and will be important for Read Committed transactions after #102014, this is still not quite enough to resolve #103126. For that, we need to batch the resolve intent requests themselves using a form of deferred intent resolution like we added in #49218 (for finalized transactions).

A similar improvement is seen for scans that encounter many abandoned intents from many different transactions. This scenario bypasses the deferred intent resolution path added in #49218, because the intents are each written by different transactions. The speedup for this scenario was presented in #64500.

Release note (performance improvement): SQL statements that must clean up intents from many different previously abandoned transactions now do so moderately more efficiently.


103362: sql: validate primary / secondary region localities at end of txn r=fqazi a=fqazi

Previously, if a database was restored with skip_localities, there was no way to modify this database to set the primary region since validation would kick in too early during the statement. This meant fixing the regions in a restored database was impossible if the primary region was no longer valid. To address this, this patch, delays locality validation till the end of the transaction.

Fixes: #103290

Release note (bug fix): SET PRIMARY REGION and SET SECONDARY REGION did not validate transactionally, which could prevent cleaning up removed regions after a restore.

103373: concurrency: small refactors in preparation for reservation removal r=nvanbenschoten a=arulajmani

See individual commits for details. 

Informs: #103361

103538: go.mod: bump Pebble to 6f2788660198, rework shared storage wrapper r=RaduBerinde a=RaduBerinde

6f278866 shared: improve interface for more efficient reading
9eb2c407 db: log events to testing.T in unit tests
f32e7dc6 db: add reserved Pebblev4 sstable format
5a6b91b8 objstorage: improve test and add read ahead test
2bc4319e objstorage: remove genericFileReadable
8143ffb9 objstorage: fix readaheadState initialization
06d08888 db: add Compact.Duration metric
e7213de0 db: add Uptime metric
e9005aed db: don't delete files during ingest application
222b43ec internal/arenaskl: fix Skiplist doc typo

Release note: None
Epic: none

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 19, 2023
This patch removes the notion of reservations from the lock table.
Reservations served as a claim that prevented multiple requests from
racing when a lock was released. Typically, when a lock was released,
only the first transactional writer was released from the list of
queued writers. It would do so by claiming a "reservation" on the
lock.

All requests that are sequenced through the lock table are associated
with a sequence number based on arrival order. These sequence numbers
are used to uphold ~fairness as requests are sequenced. They also serve
a correctness purpose -- because all locks are not known upfront (as
uncontended replicated locks may be discovered during evaluation),
sequence numbers are used to break potential deadlocks that arise from
out of order locking. This motivated the concept of reservation
breaking, which could happen if a lower sequence number request
encountered a reservation by a request with a higher sequence number.
This would lead to somewhat complex state management, where requests
could  move from being reservations to inactive waiters multiple times
during their lifetime. A lot of this can be simplified if we make no
distinction between a reservation and an inactive waiter.

This patch gets rid of reservations entirely. Instead, it offers a new
invariant:

The head of the list of waiting writers should always be an inactive,
transactional writer if the lock isn't held.

In practice, this works out functionally the same as how reservations
operated, albeit with less state transitions. Being an inactive waiter
at the head of the lock's wait-queue serves as the request's claim on
the key. As such, verbiage that referenced "reservations" previously is
now updated to talk about claims and claimant transactions. There's a
bit of comment churn as a result. There's also some datadriven test
churn as part of this patch -- but it should be helpful in convincing
ourselves that this just changes concepts, and not functionality. In
particular, what was previously a reservation holder, is now the first
inactive queued qwriter at the lock.

Closes cockroachdb#103361

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 24, 2023
This patch removes the notion of reservations from the lock table.
Reservations served as a claim that prevented multiple requests from
racing when a lock was released. Typically, when a lock was released,
only the first transactional writer was released from the list of
queued writers. It would do so by claiming a "reservation" on the
lock.

All requests that are sequenced through the lock table are associated
with a sequence number based on arrival order. These sequence numbers
are used to uphold ~fairness as requests are sequenced. They also serve
a correctness purpose -- because all locks are not known upfront (as
uncontended replicated locks may be discovered during evaluation),
sequence numbers are used to break potential deadlocks that arise from
out of order locking. This motivated the concept of reservation
breaking, which could happen if a lower sequence number request
encountered a reservation by a request with a higher sequence number.
This would lead to somewhat complex state management, where requests
could  move from being reservations to inactive waiters multiple times
during their lifetime. A lot of this can be simplified if we make no
distinction between a reservation and an inactive waiter.

This patch gets rid of reservations entirely. Instead, it offers a new
invariant:

The head of the list of waiting writers should always be an inactive,
transactional writer if the lock isn't held.

In practice, this works out functionally the same as how reservations
operated, albeit with fewer state transitions. Being an inactive waiter
at the head of the lock's wait-queue serves as the request's claim on
the key. As such, verbiage that referenced "reservations" previously is
now updated to talk about claims and claimant transactions. There's a
bit of comment churn as a result. There's also some datadriven test
churn as part of this patch -- but it should be helpful in convincing
ourselves that this just changes concepts, and not functionality. In
particular, what was previously a reservation holder, is now the first
inactive queued writer at the lock.

Closes cockroachdb#103361

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue May 24, 2023
This patch removes the notion of reservations from the lock table.
Reservations served as a claim that prevented multiple requests from
racing when a lock was released. Typically, when a lock was released,
only the first transactional writer was released from the list of
queued writers. It would do so by claiming a "reservation" on the
lock.

All requests that are sequenced through the lock table are associated
with a sequence number based on arrival order. These sequence numbers
are used to uphold ~fairness as requests are sequenced. They also serve
a correctness purpose -- because all locks are not known upfront (as
uncontended replicated locks may be discovered during evaluation),
sequence numbers are used to break potential deadlocks that arise from
out of order locking. This motivated the concept of reservation
breaking, which could happen if a lower sequence number request
encountered a reservation by a request with a higher sequence number.
This would lead to somewhat complex state management, where requests
could  move from being reservations to inactive waiters multiple times
during their lifetime. A lot of this can be simplified if we make no
distinction between a reservation and an inactive waiter.

This patch gets rid of reservations entirely. Instead, it offers a new
invariant:

The head of the list of waiting writers should always be an inactive,
transactional writer if the lock isn't held.

In practice, this works out functionally the same as how reservations
operated, albeit with fewer state transitions. Being an inactive waiter
at the head of the lock's wait-queue serves as the request's claim on
the key. As such, verbiage that referenced "reservations" previously is
now updated to talk about claims and claimant transactions. There's a
bit of comment churn as a result. There's also some datadriven test
churn as part of this patch -- but it should be helpful in convincing
ourselves that this just changes concepts, and not functionality. In
particular, what was previously a reservation holder, is now the first
inactive queued writer at the lock.

Closes cockroachdb#103361

Release note: None
craig bot pushed a commit that referenced this issue May 31, 2023
103478: concurrency: get rid of reservations in the lock table r=nvanbenschoten a=arulajmani

First 3 commits from #103373

This patch removes the notion of reservations from the lock table.
Reservations served as a claim that prevented multiple requests from
racing when a lock was released. Typically, when a lock was released,
only the first transactional writer was released from the list of
queued writers. It would do so by claiming a "reservation" on the
lock.

All requests that are sequenced through the lock table are associated
with a sequence number based on arrival order. These sequence numbers
are used to uphold ~fairness as requests are sequenced. They also serve
a correctness purpose -- because all locks are not known upfront (as
uncontended replicated locks may be discovered during evaluation),
sequence numbers are used to break potential deadlocks that arise from
out of order locking. This motivated the concept of reservation
breaking, which could happen if a lower sequence number request
encountered a reservation by a request with a higher sequence number.
This would lead to somewhat complex state management, where requests
could  move from being reservations to inactive waiters multiple times
during their lifetime. A lot of this can be simplified if we make no
distinction between a reservation and an inactive waiter.

This patch gets rid of reservations entirely. Instead, it offers a new
invariant:

The head of the list of waiting writers should always be an inactive,
transactional writer if the lock isn't held.

In practice, this works out functionally the same as how reservations
operated, albeit with less state transitions. Being an inactive waiter
at the head of the lock's wait-queue serves as the request's claim on
the key. As such, verbiage that referenced "reservations" previously is
now updated to talk about claims and claimant transactions. There's a
bit of comment churn as a result. There's also some datadriven test
churn as part of this patch -- but it should be helpful in convincing
ourselves that this just changes concepts, and not functionality. In
particular, what was previously a reservation holder, is now the first
inactive queued qwriter at the lock.

Closes #103361

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
@craig craig bot closed this as completed in ff9c357 May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-read-committed Related to the introduction of Read Committed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant