Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: txnPushAttempt should only resolve intents in current range #66741

Closed
nvanbenschoten opened this issue Jun 22, 2021 · 0 comments · Fixed by #66746
Closed

kv: txnPushAttempt should only resolve intents in current range #66741

nvanbenschoten opened this issue Jun 22, 2021 · 0 comments · Fixed by #66746
Assignees
Labels
A-kv-closed-timestamps Relating to closed timestamps C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-performance Perf of queries or internals. Solution not expected to change functional behavior. N-followup Needs followup. O-postmortem Originated from a Postmortem action item. T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Jun 22, 2021

Currently, (*txnPushAttempt).pushOldTxns will resolve all of a committed or aborted transaction's intents, instead of just those in the current range. This is unnecessary and can also result in quadratic behavior where many rangefeed processors can all try to resolve intents across many ranges that have been touched by a single large transaction.

This should be replaced by a more targetted intent resolution pass. Instead of resolving all of an old transaction's intents, we should only resolve those on the current range by intersecting its lock spans with the current range's boundaries.

Epic: CRDB-8282

@nvanbenschoten nvanbenschoten added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-kv-closed-timestamps Relating to closed timestamps T-kv KV Team labels Jun 22, 2021
@nvanbenschoten nvanbenschoten self-assigned this Jun 22, 2021
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 23, 2021
Fixes cockroachdb#66741.

Before this change, `(*txnPushAttempt).pushOldTxns` would resolve all of
a committed or aborted transaction's intents that it discovered during a
push, instead of just those in the current range.

This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.

This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.

Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 23, 2021
Fixes cockroachdb#66741.

Before this change, `(*txnPushAttempt).pushOldTxns` would resolve all of
a committed or aborted transaction's intents that it discovered during a
push, instead of just those in the current range.

This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.

This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.

Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.
craig bot pushed a commit that referenced this issue Jun 23, 2021
66733: vendor: bump Pebble to 4fcf40933159 r=itsbilal a=jbowens

311cdb51 vfs: refactor GetFreeSpace into GetDiskUsage
523e9e46 Export TeeEventListener

Release note: None

66746: kv: resolve only intents in rangefeed's own range from txnPushAttempt r=nvanbenschoten a=nvanbenschoten

Fixes #66741.

Before this change, `(*txnPushAttempt).pushOldTxns` would resolve all of
a committed or aborted transaction's intents that it discovered during a
push, instead of just those in the current range.

This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.

This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.

Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.

/cc. @cockroachdb/kv 

Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 3e6b834 Jun 23, 2021
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 24, 2021
Fixes cockroachdb#66741.

Before this change, `(*txnPushAttempt).pushOldTxns` would resolve all of
a committed or aborted transaction's intents that it discovered during a
push, instead of just those in the current range.

This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.

This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.

Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 24, 2021
Fixes cockroachdb#66741.

Before this change, `(*txnPushAttempt).pushOldTxns` would resolve all of
a committed or aborted transaction's intents that it discovered during a
push, instead of just those in the current range.

This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.

This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.

Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.
@lunevalex lunevalex added N-followup Needs followup. O-postmortem Originated from a Postmortem action item. labels Jun 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-closed-timestamps Relating to closed timestamps C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-performance Perf of queries or internals. Solution not expected to change functional behavior. N-followup Needs followup. O-postmortem Originated from a Postmortem action item. T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants