
kv: pessimistic-mode, replicated read locks to enable large, long running transactions #52768

Closed
ajwerner opened this issue Aug 13, 2020 · 2 comments
Labels
A-kv-transactions Relating to MVCC and the transactional model. A-schema-transactional C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) no-issue-activity T-kv KV Team X-stale

Comments

@ajwerner
Contributor

ajwerner commented Aug 13, 2020

Is your feature request related to a problem? Please describe.

We've increasingly seen issues with unlimited retries of large, long-running transactions (see #51294, #44645). In some cases where there is no contention whatsoever, the solution to those problems has been to increase the kv.transaction.max_refresh_spans_bytes cluster setting. The work in 20.2 (#46275) to compress those spans should be helpful in these zero-contention cases; however, it may lead to false dependencies.

We've also noted that operations which have laid intents down over reads do not need to refresh those reads. This fact, however, has not been actionable (and thus there is no code to subtract write keys from the refresh spans) because SQL reads have always been performed as scans rather than gets. This is changing soon as @helenmhe works on #46758, with a WIP at #52511.
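For illustration, the subtraction mentioned above could look something like the following Go sketch (a hypothetical helper, not CockroachDB's actual refresh-span code, and simplified to point reads rather than spans): a key the transaction has already written is protected by its intent, so the corresponding read need not be refreshed.

```go
package main

import "fmt"

// subtractWrites removes from readKeys any key that the transaction has
// already written (and therefore holds an intent on). Such reads do not
// need to be refreshed when the transaction's timestamp is pushed.
func subtractWrites(readKeys, writtenKeys []string) []string {
	written := make(map[string]bool, len(writtenKeys))
	for _, k := range writtenKeys {
		written[k] = true
	}
	var remaining []string
	for _, k := range readKeys {
		if !written[k] {
			remaining = append(remaining, k)
		}
	}
	return remaining
}

func main() {
	reads := []string{"/t/1", "/t/2", "/t/3"}
	writes := []string{"/t/2"}
	fmt.Println(subtractWrites(reads, writes)) // [/t/1 /t/3]
}
```

In practice refresh spans are key ranges rather than point keys, so the real subtraction would be an interval operation, but the principle is the same.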

I suspect that this subtraction will be rather helpful for unbounded DELETEs that read off of a secondary index, where there are no writes in the range being deleted but writes are interspersed in the primary index. In that case, the compression introduced in #46275 will prove problematic.

As we move towards an implementation of transactional schema changes, we are going to be introducing operations which will, by their very nature, have their timestamps pushed. Furthermore, these transactions are likely to be extremely expensive to retry. However, they are unlikely to be latency sensitive and thus might pair nicely with a mode that allows reads to push them but blocks contended writes.

Describe the solution you'd like

The solution I'd like to see is a transaction mode whereby all reads acquire a replicated, durable read lock over every span that was read. These read locks would remove the need to refresh reads when the transaction's timestamp is pushed. It might make sense to automatically switch to this mode when a transaction enters its second epoch.
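A rough sketch of how that automatic switch might look from a client-side retry loop (all names here are hypothetical; this is not the kv API): the first epoch runs optimistically, and any retry promotes the transaction to durable read locking so its reads can no longer be invalidated.

```go
package main

import (
	"errors"
	"fmt"
)

// LockMode controls how a transaction's reads behave. Hypothetical.
type LockMode int

const (
	// Optimistic: plain reads, refreshed when the timestamp is pushed.
	Optimistic LockMode = iota
	// PessimisticShared: reads acquire replicated, durable read locks,
	// so no refresh is needed when the timestamp is pushed.
	PessimisticShared
)

var errRetryable = errors.New("retryable txn error")

// runTxn runs op, switching to pessimistic read locking after the first
// retry, i.e. when the transaction enters its second epoch.
func runTxn(op func(mode LockMode) error) error {
	mode := Optimistic
	for epoch := 0; ; epoch++ {
		if epoch > 0 {
			mode = PessimisticShared
		}
		err := op(mode)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errRetryable) {
			return err
		}
	}
}

func main() {
	attempts := 0
	_ = runTxn(func(mode LockMode) error {
		attempts++
		if mode == Optimistic {
			return errRetryable // first epoch: pushed, refresh fails
		}
		return nil // second epoch: read locks prevent invalidation
	})
	fmt.Println(attempts) // 2
}
```

The point of the sketch is only the mode transition; where the locks live and how they are acquired is the hard part, and depends on the separated lock table discussed below.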

Describe alternatives you've considered

The primary alternative in the context of transactional schema changes is to simply accept that retries may happen in the face of contention and that callers need to handle them.

Additional context
The concept of ranged read locks has long been blocked on the existence of a separated lock table. This seems to be possible in the 21.1 timeframe.

Jira issue: CRDB-3922

@ajwerner ajwerner added A-kv-transactions Relating to MVCC and the transactional model. A-schema-transactional labels Aug 13, 2020
@blathers-crl

blathers-crl bot commented Aug 13, 2020

Hi @ajwerner, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Aug 13, 2020
@ajwerner ajwerner changed the title kv: pessimistic-mode, replicated read locks to enable large, longing running transactions kv: pessimistic-mode, replicated read locks to enable large, longrunning transactions Sep 9, 2020
@ajwerner ajwerner changed the title kv: pessimistic-mode, replicated read locks to enable large, longrunning transactions kv: pessimistic-mode, replicated read locks to enable large, long running transactions Sep 9, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 16, 2021
This commit adds an exponential backoff to the transaction retry loop
when it detects that a transaction has been aborted. This was observed
to prevent thrashing under heavy read-write contention on `global_read`
ranges, which are added to kvnemesis in cockroachdb#63747. These ranges have an
added propensity to cause thrashing because every write to these ranges
gets bumped to a higher timestamp, so it is currently imperative that a
transaction be able to refresh its reads after writing to a global_read
range. If other transactions continue to invalidate a read-write
transaction's reads, it may never complete and will repeatedly abort
conflicting txns after detecting deadlocks. This commit prevents this
from stalling kvnemesis indefinitely.

I see two ways that we can improve this situation in the future.
1. The first option is that we could introduce some form of pessimistic
   read-locking for long running read-write transactions, so that they can
   eventually prevent other transactions from invalidating their reads as
   they proceed to write to a global_reads range and get their write
   timestamp bumped. This ensures that when the long-running transaction
   returns to refresh (if it even needs to, depending on the durability of
   the read locks) its reads, the refresh will have a high likelihood of
   succeeding. This is discussed in cockroachdb#52768.
2. The second option is to allow a transaction to re-write its existing
   intents in new epochs without being bumped by the closed timestamp. If a
   transaction only got bumped by the closed timestamp when writing new
   intents, then after a transaction was forced to retry, it would have a
   high likelihood of succeeding on its second epoch as long as it didn't
   write to a new set of keys. This is discussed in cockroachdb#63796.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 23, 2021
craig bot pushed a commit that referenced this issue Apr 23, 2021
63799: kvnemesis: add backoff to retry loop on txn aborts r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <[email protected]>
@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
@github-actions

github-actions bot commented Sep 7, 2023

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 18, 2023
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024