kv: resolve only intents in rangefeed's own range from txnPushAttempt #66746
Conversation
Reviewed 5 of 5 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/rangefeed/task.go, line 256 at r1 (raw file):
// range boundary will be returned.
//
// NOTE: a rangefeed Processor is only configured to watch the global keyspace
Does this note explain what is happening in filtering? I think it explains why we called this function with AsRawSpanWithNoLocals(), maybe it should belong there?
pkg/roachpb/data_test.go, line 1161 at r1 (raw file):
}
for _, test := range testData {
require.Equal(t, test.expect, test.s1.Intersect(test.s2))
nit: Doc says that we'll get an invalid range if intersection is empty/impossible, but the test actually expects to see an implementation detail, not the validity of the range. I'd rather have it as
if test.expect.Valid() {
require.Equal(t, test.expect, test.s1.Intersect(test.s2))
} else {
require.False(t, test.s1.Intersect(test.s2).Valid())
}
or something similar and have Span{} as expectation.
Looks good to me aside from the existing comments.
For other new people, I found this useful for understanding the bits about global vs local keys: https://github.com/cockroachdb/cockroach/blob/master/pkg/keys/doc.go
return Span{}
}

if len(s.EndKey) == 0 {
Note to self: An empty end key means this span contains a single key. Overlaps already has special code for the single-key cases, so here we return whichever key is the single key. If they are both a single key, we know they are equal anyway so the order doesn't matter.
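For readers new to these Span helpers, here is a rough, self-contained sketch of the single-key handling described above. Key, Span, Valid, and Overlaps are simplified stand-ins for the roachpb versions, so this is an illustration of the idea, not the actual code under review:

package main

import (
	"bytes"
	"fmt"
)

// Key and Span are minimal stand-ins for roachpb.Key and roachpb.Span so the
// sketch compiles on its own; they are not the real definitions.
type Key []byte

type Span struct {
	Key, EndKey Key
}

// Valid reports whether the span covers at least one key: either a single
// key (empty EndKey) or a non-empty range.
func (s Span) Valid() bool {
	return len(s.Key) > 0 && (len(s.EndKey) == 0 || bytes.Compare(s.Key, s.EndKey) < 0)
}

// Overlaps reports whether the two spans share at least one key. An empty
// EndKey means the span covers exactly one key.
func (s Span) Overlaps(o Span) bool {
	if !s.Valid() || !o.Valid() {
		return false
	}
	switch {
	case len(s.EndKey) == 0 && len(o.EndKey) == 0:
		return bytes.Equal(s.Key, o.Key)
	case len(s.EndKey) == 0:
		return bytes.Compare(o.Key, s.Key) <= 0 && bytes.Compare(s.Key, o.EndKey) < 0
	case len(o.EndKey) == 0:
		return bytes.Compare(s.Key, o.Key) <= 0 && bytes.Compare(o.Key, s.EndKey) < 0
	default:
		return bytes.Compare(s.Key, o.EndKey) < 0 && bytes.Compare(o.Key, s.EndKey) < 0
	}
}

// Intersect returns the intersection of the key space covered by the two
// spans; if there is none, an invalid (zero) span is returned.
func (s Span) Intersect(o Span) Span {
	if !s.Overlaps(o) {
		return Span{}
	}
	// An empty end key means the span contains a single key. Overlaps has
	// already handled the single-key cases, so return whichever span is the
	// single key; if both are single keys, they are equal and order is moot.
	if len(s.EndKey) == 0 {
		return s
	}
	if len(o.EndKey) == 0 {
		return o
	}
	// Both spans are ranges: take the larger start key and the smaller end key.
	key, endKey := s.Key, s.EndKey
	if bytes.Compare(o.Key, key) > 0 {
		key = o.Key
	}
	if bytes.Compare(o.EndKey, endKey) < 0 {
		endKey = o.EndKey
	}
	return Span{Key: key, EndKey: endKey}
}

func main() {
	a := Span{Key: Key("a"), EndKey: Key("m")}
	b := Span{Key: Key("g")} // single-key span
	in := a.Intersect(b)
	fmt.Println(string(in.Key), in.Valid())                                // "g true": intersection is the single key
	fmt.Println(a.Intersect(Span{Key: Key("x"), EndKey: Key("z")}).Valid()) // "false": no overlap
}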
Force-pushed from 81c40c7 to e7a2d1f
TFTRs!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @aliher1911, @andreimatei, and @stevendanna)
pkg/kv/kvserver/rangefeed/task.go, line 256 at r1 (raw file):
Previously, aliher1911 (Oleg) wrote…
Does this note explain what is happening in filtering? I think it explains why we called this function with AsRawSpanWithNoLocals(), maybe it should belong there?
Good point. It wasn't mentioning the macro-level reason to do this. Done.
pkg/roachpb/data.go, line 2171 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Note to self: An empty end key means this span contains a single key. Overlaps already has special code for the single-key cases, so here we return whichever key is the single key. If they are both a single key, we know they are equal anyway so the order doesn't matter.
Sounds like a useful comment too. Added.
pkg/roachpb/data_test.go, line 1161 at r1 (raw file):
Previously, aliher1911 (Oleg) wrote…
nit: Doc says that we'll get an invalid range if intersection is empty/impossible, but the test actually expects to see an implementation detail, not the validity of the range. I'd rather have it as
if test.expect.Valid() {
require.Equal(t, test.expect, test.s1.Intersect(test.s2))
} else {
require.False(t, test.s1.Intersect(test.s2).Valid())
}
or something similar and have Span{} as expectation.
Done.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @andreimatei, @nvanbenschoten, and @stevendanna)
pkg/roachpb/data.go, line 2164 at r1 (raw file):
// Intersect returns the intersection of the key space covered by the two spans.
// If there is no intersection between the twp spans, an invalid span (see Valid)
two*
Force-pushed from e7a2d1f to 427e840
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @aliher1911, @andreimatei, and @stevendanna)
pkg/roachpb/data.go, line 2164 at r1 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
two*
Done.
Reviewed 2 of 3 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @aliher1911, @nvanbenschoten, and @stevendanna)
pkg/kv/kvserver/rangefeed/task.go, line 228 at r2 (raw file):
})
// If the txn happens to have its LockSpans populated, then lets clean up
it'd be useful if this comment made a nod to the code above to explain why this is only an optimization; why the rangefeed was already unblocked even without resolving these intents.
Btw, is the code around here gonna change when the lock table learns how to do "virtual resolution" - will we be able to unblock the rangefeed before replicating the intent resolution?
pkg/kv/kvserver/rangefeed/task.go, line 268 at r2 (raw file):
// for a range. It is also only informed about logical operations on global keys
// (see OpLoggerBatch.logLogicalOp). So even if this transaction has LockSpans
// in the range's global and local keyspace, we only need to resolve those in
I think this comment about the local space should live in the caller. This function doesn't deal with local vs global keys.
pkg/kv/kvserver/rangefeed/task.go, line 270 at r2 (raw file):
// in the range's global and local keyspace, we only need to resolve those in
// the global keyspace.
func intentsInSpan(txn *roachpb.Transaction, bound roachpb.Span) []roachpb.LockUpdate {
nit: if you rename bound->span, it'll match the function name
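As a rough illustration of what a helper with that signature might do, here is a sketch (using the renamed span parameter) that assumes the Span.Intersect/Valid semantics discussed in this review and the existing roachpb.MakeLockUpdate constructor; it is not a copy of the code under review:

// intentsInSpan returns lock updates for the portions of txn's lock spans
// that fall inside the given span. Sketch only; it would live next to the
// rangefeed task code and import github.com/cockroachdb/cockroach/pkg/roachpb.
func intentsInSpan(txn *roachpb.Transaction, span roachpb.Span) []roachpb.LockUpdate {
	ret := make([]roachpb.LockUpdate, 0, len(txn.LockSpans))
	for _, lockSpan := range txn.LockSpans {
		// Only the part of the lock span that intersects this processor's
		// span needs to be resolved here; spans outside it are skipped.
		if in := span.Intersect(lockSpan); in.Valid() {
			ret = append(ret, roachpb.MakeLockUpdate(txn, in))
		}
	}
	return ret
}

Each rangefeed processor would call something like this with its own range span, so the resulting lock updates never reach outside the range.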
pkg/kv/kvserver/rangefeed/task_test.go, line 343 at r2 (raw file):
require.Equal(t, txn4LockSpans[0], intents[2].Span)
require.Equal(t, txn4LockSpans[1], intents[3].Span)
require.Equal(t, func() roachpb.Span {
consider adding a nod here to the lock span truncation
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @aliher1911, @nvanbenschoten, and @stevendanna)
pkg/roachpb/data_test.go, line 1134 at r2 (raw file):
expect Span
}{
{sA, sA, sA},
Since you changed the assert below and build failed, maybe we can change expectations to Span{} for first invalid range cases as well?
Force-pushed from 427e840 to 3e6b834
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @aayushshah15, @aliher1911, @andreimatei, and @stevendanna)
pkg/kv/kvserver/rangefeed/task.go, line 228 at r2 (raw file):
it'd be useful if this comment made a nod to the code above to explain why this is only an optimization; why the rangefeed was already unblocked even without resolving these intents.
Done.
Btw, is the code around here gonna change when the lock table learns how to do "virtual resolution" - will we be able to unblock the rangefeed before replicating the intent resolution?
It depends on what we use to drive rangefeed updates. I'd imagine that we still want to drive it with the application of intent resolution, because that's the only thing that is replicated, so it allows us to handle things identically regardless of whether a rangefeed is attached to a leaseholder or to a follower.
pkg/kv/kvserver/rangefeed/task.go, line 268 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think this comment about the local space should live in the caller. This function doesn't deal with local vs global keys.
I considered it, but there are two callers, so I'd rather keep it centralized.
pkg/kv/kvserver/rangefeed/task.go, line 270 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: if you rename bound->span, it'll match the function name
Done.
pkg/kv/kvserver/rangefeed/task_test.go, line 343 at r2 (raw file):
// truncated at beginning
I figured this was enough of a nod.
pkg/roachpb/data_test.go, line 1134 at r2 (raw file):
Previously, aliher1911 (Oleg) wrote…
Since you changed the assert below and build failed, maybe we can change expectations to Span{} for first invalid range cases as well?
The build flake wasn't due to this test, it was due to a flake in sql's TestGenerateParse. So unrelated. This test is passing.
TFTRs! bors r+
Build succeeded:
Fixes #66741.
Before this change, (*txnPushAttempt).pushOldTxns would resolve all of a committed or aborted transaction's intents that it discovered during a push, instead of just those in the current range.
This was unnecessary for the purposes of the rangefeed, which only needs
a pushed transaction's intents in its range resolved. Worse, it could
result in quadratic behavior in cases where a large transaction wrote
many intents across many ranges that each had an active rangefeed
processor. In such cases, the rangefeed processor on each of the ranges
that the transaction touched would try to resolve its intents across all
ranges that the transaction touched. This could lead to a herd of
redundant intent resolution, which was especially disruptive if the
transaction had exceeded its precise intent span tracking and degraded
to ranged intent resolution.
This commit fixes this issue by having each rangefeed processor resolve
only those intents that are within the bounds of its own range. This
avoids the quadratic behavior that, in its worst form, could create a
pileup of ranged intent resolution across an entire table and starve out
foreground traffic.
Release note (bug fix): Changefeeds no longer interact poorly with
large, abandoned transactions. It was previously possible for this
combination to result in a cascade of work during transaction cleanup
that could starve out foreground traffic.
/cc. @cockroachdb/kv