-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: introduce fine-grained latch acquisition #39765
Conversation
I need to push up another revision here, missed a detail. Hold off on review. |
fbca30e
to
7e8aedb
Compare
Ok, done. I'm looking for guidance around the naming of the |
7e8aedb
to
0200fc0
Compare
How do you feel about just renaming the package to |
pkg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This is looking good. Adding MVCC-awareness to SpanSet
is a win on its own, and it also brings us closer to addressing #32583.
Reviewed 38 of 38 files at r1, 18 of 22 files at r2, 2 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @irfansharif, and @nvanbenschoten)
pkg/storage/spanlatch/manager.go, line 104 at r1 (raw file):
type Guard struct { done signal // latches [spanset.NumSpanScope][spanset.NumSpanAccess][]latch, but half the size.
Remove package name from comment.
pkg/storage/spanlatch/manager_test.go, line 211 at r3 (raw file):
// Attempt to acquire latches overlapping each of them. var spans SpanSet spans.Add(SpanReadOnly, roachpb.Span{Key: roachpb.Key("a")})
Why did this change?
pkg/storage/spanlatch/batch.go, line 45 at r3 (raw file):
// NewIteratorAt constructs an iterator that verifies access of the underlying // iterator against the given SpanSet at the given timestamp.
Even on its own, this is a nice change.
pkg/storage/spanlatch/spanset.go, line 69 at r3 (raw file):
} // spanLatch is used to logically represent a latch acquired over a span
It seems pretty strange to me that a SpanSet has any knowledge of "latches". I've always thought of SpanSet as a typedef around []Span
that's optionally aware of accesses and scopes. Adding some notion of timestamp awareness seems like a natural extension of this.
However, I don't think I'm sold on the need to merge the spanset
and spanlatch
packages. Even with the introduction of optional timestamps to the spanset package, it's still focused and we don't add much bloat to its package boundary. Similarly, the spanlatch package has a slim external interface and a clear focus. It depends on the spanset
package, but only through its well-defined external interface.
Merging the two together expands the scope of the combined package without much benefit. It forces us to qualify names like spanlatch.Make into spanlatch.MakeManager and moves functions like spanlatch.NewBatchAt right next to it, even though they're fairly disjoint concepts. It also prevents us from using spanset
for other purposes, which I can absolutely see us doing in the future (and I believe we have done in the past).
If the unit of abstraction in Go is a package then we're creating a new abstraction with more concerns and less of a clear purpose. Is that necessary for this change? I very well might be missing a compelling reason why we want to do this.
pkg/storage/spanlatch/spanset.go, line 126 at r3 (raw file):
if keys.IsLocal(span.Key) { scope = SpanLocal // All local spans are treated non-MVCC.
Is it worth pushing this logic in here? If a caller of this function wants MVCC local latches (I don't know why it would) then shouldn't we let it get them? I think one of the biggest benefits of this change is that it allows all of this kind of complexity to be lifted into the span declaration code itself.
pkg/storage/spanlatch/spanset.go, line 182 at r3 (raw file):
} zerots := hlc.Timestamp{}
You can use .IsEmpty()
to get rid of this.
04e09e3
to
1b7a4b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/storage/spanlatch/spanset.go, line 69 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It seems pretty strange to me that a SpanSet has any knowledge of "latches". I've always thought of SpanSet as a typedef around
[]Span
that's optionally aware of accesses and scopes. Adding some notion of timestamp awareness seems like a natural extension of this.However, I don't think I'm sold on the need to merge the
spanset
andspanlatch
packages. Even with the introduction of optional timestamps to the spanset package, it's still focused and we don't add much bloat to its package boundary. Similarly, the spanlatch package has a slim external interface and a clear focus. It depends on thespanset
package, but only through its well-defined external interface.Merging the two together expands the scope of the combined package without much benefit. It forces us to qualify names like spanlatch.Make into spanlatch.MakeManager and moves functions like spanlatch.NewBatchAt right next to it, even though they're fairly disjoint concepts. It also prevents us from using
spanset
for other purposes, which I can absolutely see us doing in the future (and I believe we have done in the past).If the unit of abstraction in Go is a package then we're creating a new abstraction with more concerns and less of a clear purpose. Is that necessary for this change? I very well might be missing a compelling reason why we want to do this.
Hm, I agree. My initial thought here was merging what was the spanLatch
type as of r4 and spanset.latch
type into one, but after seeing the memory footprint during span declaration I decided against it. I don't believe spanset
presently sees usage outside latch acquisition but I'll take your word it's something we'd possibly do going forward.
Separated it back out again, and named the type to Span
, i.e. elements within spanset.SpanSet
are of type spanset.Span
.
pkg/storage/spanlatch/spanset.go, line 126 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is it worth pushing this logic in here? If a caller of this function wants MVCC local latches (I don't know why it would) then shouldn't we let it get them? I think one of the biggest benefits of this change is that it allows all of this kind of complexity to be lifted into the span declaration code itself.
I figured that local MVCC latches made no sense, but it does seem out of place to enforce this here. Moved it back out to spanlatch.Manager
.
e9346a9
to
030a4a8
Compare
95df17e
to
793412a
Compare
(PTAL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mod the last round of comments
Reviewed 4 of 40 files at r4, 27 of 27 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @irfansharif)
pkg/storage/spanlatch/manager.go, line 177 at r5 (raw file):
latch.ts = ss[i].Timestamp if s == spanset.SpanLocal { // All local spans are treated non-MVCC.
Do you think we need this anymore? Can we rely on the local spans already being declared as non-mvcc?
pkg/storage/spanlatch/spanset.go, line 69 at r3 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Hm, I agree. My initial thought here was merging what was the
spanLatch
type as of r4 andspanset.latch
type into one, but after seeing the memory footprint during span declaration I decided against it. I don't believespanset
presently sees usage outside latch acquisition but I'll take your word it's something we'd possibly do going forward.Separated it back out again, and named the type to
Span
, i.e. elements withinspanset.SpanSet
are of typespanset.Span
.
Thanks. This looks better to me.
pkg/storage/spanset/batch.go, line 27 at r5 (raw file):
spans *SpanSet // spansOnly controls whether or not timestamps associated with the
Did we run into issues before when we simply used ts == hlc.Timestamp{}
to signify this?
pkg/storage/spanset/spanset_test.go, line 150 at r5 (raw file):
}{ // Read access disallowed for subspan or included point at greater timestamp. // Exactly as declared at a timestamp higher than what's declared.
"or at a timestamp"?
pkg/storage/spanset/spanset_test.go, line 157 at r5 (raw file):
} for _, tc := range disallowedRO { if err := ss.CheckAllowedAt(SpanReadOnly, tc.span, tc.ts); err == nil {
here and below: we generally prefer using testutils.IsError
.
pkg/storage/spanset/spanset_test.go, line 189 at r5 (raw file):
ts hlc.Timestamp }{ // Exactly as declared at a timestamp different from than what's declared.
"from than"
793412a
to
4a8478e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/storage/spanlatch/manager.go, line 177 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do you think we need this anymore? Can we rely on the local spans already being declared as non-mvcc?
yes we can!
pkg/storage/spanset/batch.go, line 27 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did we run into issues before when we simply used
ts == hlc.Timestamp{}
to signify this?
My initial thought here was to distinguish between ensuring access only based on span boundaries vs. ensuring non-MVCC access (i.e. using spanset.NewBatchAt(..., zerots)
. These are different, if all we have is a write latch at t=2
over [a-c)
, non-MVCC writes to b
should be disallowed whereas it's fine if only looking at span boundaries. That being said, I don't foresee the latter usage, so I've removed this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first commit Do you want to move the second to a new PR so we don't hold this up?
On that note, we're currently in the code freeze period for this release, so I'm torn about whether we should merge this now or hold off for a few weeks. Do you see any benefit in getting it in? Do you think the risk is worth it?
Reviewed 1 of 40 files at r4, 7 of 7 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
36f7b37
to
3c11668
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops, you weren't suppose to see that commit. I'm ok holding off on it until we're post code freeze (prefer it actually in light of the following questions).
I have a few questions regarding what the appropriate span access timestamp would be at the following places:
- For
roachpb.Refresh{,Range}
requests, I expectroachpb.Refresh{,Range}Request.Header.Timestamp
to be the same asroachpb.Refresh{,Range}Request.Header.Txn.Timestamp
. Am I wrong? - For
QueryIntent
requests, we read athlc.MaxTimestamp
but declare access at the providedheader.Timestamp
. My understanding is that given query intent requests are only used to read intents, and come part and parcel with other requests declaring access timestamps over shared keys, it's fine to simply declare access at the givenheader.Timestamp
given we mediate access based on the other span declarations in the batch. Am I wrong? - From what was first reviewed, I changed the span declarations for
Subsume
requests to reflect our intended usage. PTAL. - From what was first reviewed, I changed the span declarations for
{Put,DeleteRange}
requests withInline
args to declare NonMVCC access if inlined. PTAL. - For end txn split triggers, we have an MVCC span declaration over the entire pre-split range at the given
header.Timestamp
. This preserves the existing behavior becausereadOnlyCmdMu
still exists. Re-working this is work for a future PR but I don't quite understand why reads at lower timestamps are unsafe. What does 'at any second the RHS can be removed' mean? (referencing storage: acquire read latches instead of write latches during range splits #32583) - For end txn merge triggers, we have an MVCC span declaration over the RHS at the given
header.Timestamp
. I believe this is what we want. Am I wrong?
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @nvanbenschoten)
3c11668
to
b6b91de
Compare
@irfansharif this should be ready to pick back up whenever you're ready. |
64eca7b
to
8df0f80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvanbenschoten: I'd forgotten to ping back here last week, rebased again and ready for another (final) look.
da3d370
to
317fdbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 19 of 19 files at r9.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @irfansharif)
317fdbf
to
79949ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 38 files at r1, 2 of 22 files at r2, 2 of 40 files at r4, 21 of 27 files at r5, 1 of 1 files at r10.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)
pkg/storage/spanset/batch.go, line 258 at r10 (raw file):
spans *SpanSet spansOnly bool
nit: comment on the semantics of spansOnly
. Namely that it means ts
is not consulted.
pkg/storage/spanset/merge.go, line 38 at r10 (raw file):
// mergeSpans sorts the given spans and merges ones with overlapping // spans and equal access timestamps. The implementation is a copy of // roachpb.MergeSpans.
:(
I wonder if there's an adequately efficient way to refactor this to deduplicate the logic. Perhaps an extension to sort.Interface that has something like At(i int) *roachpb.Span
?
I guess one day we'll get generics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)
pkg/storage/spanset/batch.go, line 258 at r10 (raw file):
Previously, ajwerner wrote…
nit: comment on the semantics of
spansOnly
. Namely that it meansts
is not consulted.
Already did so on type Iterator
above, added the fact that ts
is not consulted if spansOnly
is true (spansOnly + ts
is used in multiple types)
pkg/storage/spanset/merge.go, line 38 at r10 (raw file):
Previously, ajwerner wrote…
:(
I wonder if there's an adequately efficient way to refactor this to deduplicate the logic. Perhaps an extension to sort.Interface that has something like
At(i int) *roachpb.Span
?I guess one day we'll get generics?
:( indeed
79949ed
to
ac3b892
Compare
bors r+ |
(bors cancellation for some reason) bors r+ |
Towards cockroachdb#32583. The current mechanism in place for declaring latches over MVCC keys only allows them to be declared at a fixed timestamp (on a per request basis). For split commands for e.g., we're interested in in declaring latches on different timestamps for the two sides of the split. Specifically we're interested in declaring a non-MVCC read latch over the LHS and a non-MVCC write latch over the RHS to be able to service reads during splits (requests without timestamps, i.e. the empty timestamp, are considered non-MVCC). The same ideas can be applied to range merges and lease transfers. This change allows requests to specify, during span declarations, whether said span would require an MVCC or non-MVCC access. The access mode would correspond to an MVCC or non-MVCC latch acquired over said span depending. This would allow the aforementioned optimizations possible. Release note: None
ac3b892
to
15d0c2d
Compare
(Pushing a no-op amend to see if CLA bot picks it up 🔬 🐶 ) |
bors r+ |
39765: storage: introduce fine-grained latch acquisition r=irfansharif a=irfansharif Towards #32583. The current mechanism in place for declaring latches over MVCC keys only allows them to be declared at a fixed timestamp (on a per request basis). For split commands for e.g., we're interested in in declaring latches on different timestamps for the two sides of the split. Specifically we're interested in declaring a non-MVCC read latch over the LHS and a non-MVCC write latch over the RHS to be able to service reads during splits (requests without timestamps, i.e. the empty timestamp, are considered non-MVCC). The same ideas can be applied to range merges and lease transfers. This change allows callers to assign timestamps to latch acquisitions on a per latch basis so the optimizations above are made possible. Release note: None Co-authored-by: irfan sharif <[email protected]>
Build succeeded |
Fixes cockroachdb#32583. This commit uses non-MVCC latches introduced in cockroachdb#39765 to properly synchronize reads to the RHS of a split and the split itself. Specifically, this addresses the concern cockroachdb#32583 about reads below the split timestamp being able to read from the the RHS of the split even after the RHS is governed by a new timestamp cache (cockroachdb#3148) and even after it has allowed that data to be rebalanced away. These concerns had previously been addressed by using the BlockReads flag and synchronizing with reads through the readOnlyCmdMu. Now that latching gives us what we need, we can stop setting the BlockReads flag on the SplitTrigger. Release note (performance improvement): Range splits are now less disruptive to foreground reads.
Fixes cockroachdb#32583. This commit uses non-MVCC latches introduced in cockroachdb#39765 to properly synchronize reads to the RHS of a split and the split itself. Specifically, this addresses the concern cockroachdb#32583 about reads below the split timestamp being able to read from the the RHS of the split even after the RHS is governed by a new timestamp cache (cockroachdb#3148) and even after it has allowed that data to be rebalanced away. These concerns had previously been addressed by using the BlockReads flag and synchronizing with reads through the readOnlyCmdMu. Now that latching gives us what we need, we can stop setting the BlockReads flag on the SplitTrigger. Release note (performance improvement): Range splits are now less disruptive to foreground reads.
Towards #32583. The current mechanism in place for declaring latches
over MVCC keys only allows them to be declared at a fixed timestamp (on
a per request basis). For split commands for e.g., we're interested in
in declaring latches on different timestamps for the two sides of the
split. Specifically we're interested in declaring a non-MVCC read latch
over the LHS and a non-MVCC write latch over the RHS to be able to
service reads during splits (requests without timestamps, i.e. the empty
timestamp, are considered non-MVCC). The same ideas can be applied to
range merges and lease transfers.
This change allows callers to assign timestamps to latch acquisitions on
a per latch basis so the optimizations above are made possible.
Release note: None