storage: queue requests to push txn / resolve intents on single keys #25014
Reviewed 5 of 5 files at r1. pkg/storage/intent_resolver.go, line 73 at r1 (raw file):
s/active/writing/g On reading the code below I thought pkg/storage/intent_resolver.go, line 92 at r1 (raw file):
The slice is ordered; document that here. pkg/storage/intent_resolver.go, line 118 at r1 (raw file):
Document the new return value (on processWriteIntentError instead of queuePushIfContended since this is the entry point from outside this file) pkg/storage/intent_resolver.go, line 162 at r1 (raw file):
This is complicated, and I think it would benefit from a more verbose description. What are the preconditions and postconditions for queuePushIfContended? What are the preconditions for calling the returned function? I have two conflicting gripes about the name. I think I think the clearest thing might be to break this out into a separate type ( pkg/storage/intent_resolver.go, line 190 at r1 (raw file):
How does this original pushee get here? Isn't pkg/storage/intent_resolver.go, line 191 at r1 (raw file):
You mean "divert future waiters"? If they're in the future, won't they find the current pusher anyway since we iterate in reverse arrival order? I think you want to say something like "if i == 0, the current pusher is the first one with a writing transaction...". But I'm not following why this special case is needed instead of just letting the current pusher be appended to the end of contendedKeys in the pkg/storage/intent_resolver.go, line 196 at r1 (raw file):
Don't mutate pkg/storage/intent_resolver.go, line 204 at r1 (raw file):
Add "on their next iteration" (right?) pkg/storage/intent_resolver.go, line 242 at r1 (raw file):
Are contended keys always resolved in FIFO order? If not, what does the dependency graph look like that results in out-of-order resolutions? pkg/storage/intent_resolver.go, line 252 at r1 (raw file):
Only one waiter will ever get a response on waitCh. This seems prone to deadlocks if we ever mess up and let two pushers wait on the same txn. We should be defensive about this and close the channel after we write to it, and check for closed status when reading from it. pkg/storage/intent_resolver.go, line 400 at r1 (raw file):
Was this necessary for the rest of the change or is it an unrelated optimization? pkg/storage/batcheval/cmd_begin_transaction.go, line 111 at r1 (raw file):
This came from @nvanbenschoten's change, right? Probably deserves its own commit. Comments from Reviewable |
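Editorial sketch for the `waitCh` suggestion in the review above (close the channel after writing to it, and check for closed status when reading). The `waiter` type and its methods are hypothetical and are not taken from `intent_resolver.go`; the only thing shown is the standard Go two-value receive, which reports `ok == false` once a channel is closed and drained.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for a queued pusher. Its waitCh
// receives at most one value and is then closed.
type waiter struct {
	waitCh chan string
}

// signal delivers a single value and closes the channel, so that any
// extra receiver is released instead of blocking forever.
func (w *waiter) signal(v string) {
	w.waitCh <- v
	close(w.waitCh)
}

// wait returns the delivered value, or ok == false if the channel was
// already drained and closed when this receiver ran.
func (w *waiter) wait() (v string, ok bool) {
	v, ok = <-w.waitCh
	return v, ok
}

func main() {
	// Buffered so that signal does not block before a receiver arrives.
	w := &waiter{waitCh: make(chan string, 1)}
	w.signal("txn-1")
	fmt.Println(w.wait()) // the first receiver gets the value
	fmt.Println(w.wait()) // a second receiver sees the closed channel
}
```

With this shape, a second pusher that mistakenly waits on the same channel gets `ok == false` and can handle that case explicitly rather than deadlocking.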
Reviewed 2 of 5 files at r1. pkg/storage/intent_resolver.go, line 74 at r1 (raw file):
Is pkg/storage/intent_resolver.go, line 162 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
a new pkg/storage/intent_resolver.go, line 185 at r1 (raw file):
We index into this map a number of times in this loop. Pull the pkg/storage/intent_resolver.go, line 204 at r1 (raw file):
Could this result in starvation of non-writing txns? pkg/storage/intent_resolver.go, line 205 at r1 (raw file):
Doesn't pkg/storage/intent_resolver.go, line 219 at r1 (raw file):
Add a comment here and rename pkg/storage/intent_resolver.go, line 225 at r1 (raw file):
We only want to wait here if the pusher is not writing? That seems to be the only case where we assign pkg/storage/intent_resolver.go, line 243 at r1 (raw file):
This indicates to me that a linked-list would be a better data structure for this queue. pkg/storage/intent_resolver.go, line 400 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I think Spencer mentioned that it was an unrelated optimization. I like it though. We should split it off into its own change. pkg/storage/store.go, line 2786 at r1 (raw file):
pkg/storage/store.go, line 2908 at r1 (raw file):
Explain how we get in this case. pkg/storage/batcheval/cmd_begin_transaction.go, line 111 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Yeah, this is from a change I proposed a while ago. I'll pull it into its own PR and give it a nice comment. Comments from Reviewable |
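Editorial sketch for the linked-list suggestion in the review above. It assumes the queue entries are small structs (the `pusher` type and `reorderExample` function here are hypothetical) and uses the standard `container/list` package to show why a list suits a queue that needs insertion and removal away from the ends: each entry keeps its `*list.Element`, so neither operation has to search or shift a slice.

```go
package main

import (
	"container/list"
	"fmt"
)

// pusher is a hypothetical queue entry; the real struct is not shown
// in this conversation.
type pusher struct {
	id string
}

// queueIDs returns the ids in queue order, front to back.
func queueIDs(q *list.List) []string {
	var ids []string
	for e := q.Front(); e != nil; e = e.Next() {
		ids = append(ids, e.Value.(*pusher).id)
	}
	return ids
}

// reorderExample builds a small queue, inserts an entry ahead of an
// existing one, removes another, and returns the resulting order.
func reorderExample() []string {
	q := list.New()
	a := q.PushBack(&pusher{id: "a"})
	b := q.PushBack(&pusher{id: "b"})
	q.PushBack(&pusher{id: "c"})

	// Insert ahead of b without shifting anything.
	q.InsertBefore(&pusher{id: "w"}, b)

	// Remove a using the element handle kept at enqueue time.
	q.Remove(a)

	return queueIDs(q)
}

func main() {
	fmt.Println(reorderExample())
}
```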
Review status: all files reviewed at latest revision, 21 unresolved discussions, some commit checks failed. pkg/storage/intent_resolver.go, line 73 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 74 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/intent_resolver.go, line 92 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 118 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 162 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
OK, took a stab at this. PTAL. pkg/storage/intent_resolver.go, line 185 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/intent_resolver.go, line 190 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
The original pushee is the txn which owns the intent. Before this change, all pushers of course waited on the "original pushee". This comment is explaining that even if there are already pushers queued up here, if those pushers are all non-writing, the first writing pusher to come along will still have to wait on the original pushee. This is necessary because non-writing pushers don't have actual transaction records which can be pushed instead of the original pushee. In other words, we can only daisy chain actual I updated the comment. pkg/storage/intent_resolver.go, line 191 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
That's a good way to start the explanation (incorporated). However, we must have writing transactions actually push something so we can detect dependency cycles, so we can't just have a writing transaction wait on the done channel of a non-writing transaction. pkg/storage/intent_resolver.go, line 196 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 204 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
No, they actually wait for it synchronously (see below where the non-writing txns wait on the done channel of an earlier pusher). pkg/storage/intent_resolver.go, line 204 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Yes, but nothing guaranteed it before either (although in theory this could be worse, except there's a relief valve: we do release one of the non-writing txns each time). It's a really thorny problem and I can't think of a good solution that doesn't result in the entire queue getting released (the workload A performance drops by 20% if we don't order the writers first). I'll put a note in the comments. pkg/storage/intent_resolver.go, line 205 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Not necessarily. pkg/storage/intent_resolver.go, line 219 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done, but called it pkg/storage/intent_resolver.go, line 225 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
That's correct. Any writing txn must push something immediately so we can detect dependency cycles. Added a comment. pkg/storage/intent_resolver.go, line 242 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
You can have a non-writing txn and a writing txn both push the txn which owns the contended key. In that case, the non-writing txn was the original pusher, then the writing txn pusher comes along and inserts ahead of it; they resolve simultaneously in principle. So you have something like:
Also, because we can insert into the middle of the dependency graph at arbitrary times, we unfortunately need to search through the slice. Doesn't seem to be a performance issue with queue sizes in the 100s. I agree with @nvanbenschoten's suggestion to use a linked list instead. pkg/storage/intent_resolver.go, line 243 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/intent_resolver.go, line 252 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
How do you check for closed status when reading from a channel? pkg/storage/intent_resolver.go, line 400 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's meaningless for the perf improvement on ycsb workload A, but does seem pretty reasonable. pkg/storage/store.go, line 2786 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Renamed. pkg/storage/store.go, line 2908 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Some other actor could come along and write the key, but not be part of the intent resolver queue yet. We could end up getting a new write intent error, and we definitely need to make sure we cleanup before becoming part of another queue. Added a comment. pkg/storage/batcheval/cmd_begin_transaction.go, line 111 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Removed; it's already been fixed. Comments from Reviewable |
force-pushed from 7ea4b03 to d066ef3
Reviewed 4 of 4 files at r2. pkg/storage/intent_resolver.go, line 252 at r1 (raw file): Previously, spencerkimball (Spencer Kimball) wrote…
pkg/storage/intent_resolver.go, line 83 at r2 (raw file):
s/by/be/ pkg/storage/intent_resolver.go, line 100 at r2 (raw file):
Nice explanation. pkg/storage/intent_resolver.go, line 121 at r2 (raw file):
Remove "Otherwise"; it returns these things whether it blocks or not. pkg/storage/intent_resolver.go, line 130 at r2 (raw file):
pkg/storage/intent_resolver.go, line 166 at r2 (raw file):
There is no longer an pkg/storage/intent_resolver.go, line 192 at r2 (raw file):
Why this difference? Why do we swap the wait channels whether we insert before or after pkg/storage/intent_resolver.go, line 282 at r2 (raw file):
It feels error-prone to use the nil-ness of the transaction point to mean "did you leave an intent on the key of this WriteIntentError". Maybe this function should take an extra bool to make this more explicit. pkg/storage/intent_resolver.go, line 305 at r2 (raw file):
Add a comment that we're mutating wiErr here so we may push a txn other than the one that was originally requested. pkg/storage/store.go, line 2709 at r2 (raw file):
Are false positives here a problem? I think this is currently always true, but we've talked about introducing conditional operations that may or may not write an intent on success. Or what about races? If we write an intent and it quickly gets resolved, do we just go ahead and push the finished transaction or could it get gummed up somewhere? pkg/storage/store.go, line 2831 at r2 (raw file):
This gives up our place in line. Would it be better to accumulate a list of cleanup functions to run when the request ultimately finishes, or is it important to clean up now and let other requests proceed instead of blocking them longer? Comments from Reviewable |
force-pushed from d066ef3 to f2d3740
Review status: 2 of 5 files reviewed at latest revision, 15 unresolved discussions. pkg/storage/intent_resolver.go, line 252 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 83 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 100 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 121 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 130 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Might as well. Changed. pkg/storage/intent_resolver.go, line 166 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 192 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
That was an error. We should only swap the channels if we insert before. pkg/storage/intent_resolver.go, line 282 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I hear the complaint, but the bool then looks really redundant in that light. I tried something slightly different: added a parameter name to all the signatures. I think it's less likely to be misinterpreted now. pkg/storage/intent_resolver.go, line 305 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/store.go, line 2709 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
False positives are always OK by design with the pushing machinery. If for any reason the intent is not there, we'll still wait on the transaction which is not optimal, but that is guaranteed to conclude. pkg/storage/store.go, line 2831 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I think simpler is better here. I think this is a low probability event. Accumulating a list of cleanup functions seems Comments from Reviewable |
Reviewed 3 of 3 files at r3. Comments from Reviewable |
Reviewed 2 of 4 files at r2, 2 of 3 files at r3. pkg/storage/intent_resolver.go, line 156 at r2 (raw file):
We don't need to hold the contentionQueue lock until here. Since we're performing a few allocations above, we should move pkg/storage/intent_resolver.go, line 116 at r3 (raw file):
Add a note to this method comment that pkg/storage/intent_resolver.go, line 139 at r3 (raw file):
This doesn't need to capture anything, and I think having it here and called "update" will just create confusion because it's throwing away everything from the old error except the key, which we know is shared with the overlapping txn. I think it would be more clear to create a stand-alone function pkg/storage/intent_resolver.go, line 161 at r3 (raw file):
Make a note somewhere that
Comments from Reviewable |
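Editorial sketch for the lock-scope comment in the review above (line 156 at r2: the lock does not need to be held while allocating). The `contentionQueue` and `pusher` types below are simplified, hypothetical stand-ins. The point is only the ordering: allocate first, then take the mutex for the shared-state update.

```go
package main

import (
	"fmt"
	"sync"
)

// pusher is a hypothetical queue entry.
type pusher struct {
	id     string
	waitCh chan struct{}
}

// contentionQueue is a simplified, hypothetical stand-in for the
// structure discussed in the review.
type contentionQueue struct {
	mu   sync.Mutex
	keys map[string][]*pusher
}

// add allocates the new entry before taking the lock, so the critical
// section covers only the map update. It returns the queue length
// for key after the append.
func (cq *contentionQueue) add(key, id string) int {
	p := &pusher{id: id, waitCh: make(chan struct{}, 1)} // no lock needed yet

	cq.mu.Lock()
	defer cq.mu.Unlock()
	cq.keys[key] = append(cq.keys[key], p)
	return len(cq.keys[key])
}

func main() {
	cq := &contentionQueue{keys: map[string][]*pusher{}}
	fmt.Println(cq.add("k", "txn-1"))
	fmt.Println(cq.add("k", "txn-2"))
}
```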
force-pushed from 02f496a to 8a30bff
PTAL. I updated this after realizing that even if a transaction's This means all transactions which have a non-nil key must push in order to detect deadlocks. In order to make this work and still maintain the performance gains, I had to rethink things a bit. Now we have a strictly ordered FIFO queue (nice side benefit is this eliminates the starvation case which @nvanbenschoten pointed out in his review). I also avoid pushing in daisy-chained fashion, though that might still be worthwhile as an optimization. |
LGTM, although it's concerning that there's not a lot of test changes to go with this reworking. Can you post an updated version of the performance chart above? Reviewed 2 of 3 files at r4. pkg/storage/intent_resolver.go, line 84 at r4 (raw file):
This field needs documentation. pkg/storage/intent_resolver.go, line 216 at r4 (raw file):
You'll hit this sleep an unpredictable number of times (if Comments from Reviewable |
Will post updated graph. Review status: 4 of 5 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed. pkg/storage/intent_resolver.go, line 156 at r2 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/intent_resolver.go, line 116 at r3 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/intent_resolver.go, line 139 at r3 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This comment has been obviated by recent changes. pkg/storage/intent_resolver.go, line 161 at r3 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Not a concern any longer, as the new code keeps a strictly fifo queue and does not search backwards through the list. pkg/storage/intent_resolver.go, line 84 at r4 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. pkg/storage/intent_resolver.go, line 216 at r4 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I've updated the code not to randomly wait between these things. There's now an Comments from Reviewable |
force-pushed from 8a30bff to 90edb41
Here's an updated graph: notice that the additional changes which were required to handle cases where deadlock could be introduced have caused our throughput with very high contention to degrade. On a positive note, I added a mechanism to delay checking for dependency cycles, which has resulted in better performance for small to high contention. |
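Editorial sketch of the "delay checking for dependency cycles" idea mentioned above. This is an illustration under assumptions, not the PR's code: `waitOrPush` and `delayDemo` are hypothetical names, and the push itself is reduced to a boolean. A queued waiter first waits for the request ahead of it; only if that takes longer than a delay does it fall back to pushing, which is the step where a cycle could be detected.

```go
package main

import (
	"fmt"
	"time"
)

// waitOrPush blocks until the request ahead of us signals done, or
// until delay elapses. It reports whether the caller should now fall
// back to pushing.
func waitOrPush(done <-chan struct{}, delay time.Duration) (shouldPush bool) {
	t := time.NewTimer(delay)
	defer t.Stop()
	select {
	case <-done:
		return false
	case <-t.C:
		return true
	}
}

// delayDemo exercises both branches: a predecessor that has already
// finished, and one that never finishes within the delay.
func delayDemo() (finishedFirst, timedOut bool) {
	done := make(chan struct{})
	close(done)
	finishedFirst = waitOrPush(done, time.Hour)

	blocked := make(chan struct{})
	timedOut = waitOrPush(blocked, 10*time.Millisecond)
	return finishedFirst, timedOut
}

func main() {
	fmt.Println(delayDemo())
}
```

In the common case the predecessor finishes quickly and no push is issued at all, which is one plausible reading of why delaying the check helps at low to high contention.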
force-pushed from a4604b1 to 6b1b429
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. Comments from Reviewable |
cc @andreimatei -- I saw some mention of the |
nit: please update the PR description when the commit message changes. The commit now says The PR Tobi is talking about is #25541, which changes when/how the I have a question, though - if a txn doesn't have a txn record (yet), can it push/be pushed? Won't both pushing and being pushed attempt to check its txn record and do something when it's not found? In fact, I believe I've recently made some check of either the pusher's or the pushee's txn record run sooner than it used to in order to avoid some race. And a 2nd question. The commit says "I also avoid pushing in daisy-chained fashion, though that might still be worthwhile as an optimization". I'm confused about what this means - isn't the daisy chaining the whole point of this PR? Or are you saying that this PR introduces a queue of pushers, but they're all waiting on the original intent? Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. Comments from Reviewable |
force-pushed from 6b1b429 to 7bddd75
@andreimatei: updated PR description. Relying on the In the latest changes in the PR, no txn will ever be pushed if it hasn't laid down an intent. Still, you can encounter an intent and go to push the txn record but discover it hasn't been written yet – that results in an automatic abort. A pusher need not have a txn record, either because it's a non-transactional operation, a read-only txn without a Daisy-chaining was the original direction of this PR, but it didn't end up being very useful. I'm actually doubtful now that it would amount to any kind of worthwhile optimization, because we now delay the check for dependency cycles. This PR does introduce a FIFO queue of pushers, lined up behind a contended intent. The front of the queue will push the owner of the intent, and the others in the queue (the ones which may have their own txn records) will also push the owner after a delay. When the pusher at the front of the queue succeeds, it re-executes its request and either leaves its own intent (the next queued pusher will then push its txn), or simply reads the intent (the next queued pusher will immediately re-execute instead of pushing). Review status: 4 of 6 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. Comments from Reviewable |
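Editorial sketch of the FIFO queueing described in the comment above. It is a toy model under stated assumptions: `keyQueue`, `enqueue`, and `runInOrder` are hypothetical names, pushing and intent resolution are omitted entirely, and the delayed push by non-front waiters is not modeled. It shows only the ordering property: each request waits for the one that arrived just before it on the same key, so requests re-execute one at a time in arrival order.

```go
package main

import (
	"fmt"
	"sync"
)

// keyQueue is a hypothetical per-key FIFO. For each contended key it
// remembers the "done" channel of the most recently queued request.
type keyQueue struct {
	mu   sync.Mutex
	tail map[string]chan struct{}
}

// enqueue registers the caller behind the current tail for key. prev is
// nil if the caller is at the front of the queue; otherwise it is closed
// when the request ahead finishes. finish must be called when the caller
// has re-executed its request.
func (q *keyQueue) enqueue(key string) (prev <-chan struct{}, finish func()) {
	done := make(chan struct{})
	q.mu.Lock()
	p := q.tail[key]
	q.tail[key] = done
	q.mu.Unlock()
	return p, func() {
		q.mu.Lock()
		if q.tail[key] == done {
			delete(q.tail, key) // nobody queued behind us
		}
		q.mu.Unlock()
		close(done)
	}
}

// runInOrder queues n requests on one key and returns the order in
// which they ran.
func runInOrder(n int) []int {
	q := &keyQueue{tail: map[string]chan struct{}{}}
	var (
		wg    sync.WaitGroup
		order []int
	)
	for i := 0; i < n; i++ {
		prev, finish := q.enqueue("k") // arrival order is fixed here
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if prev != nil {
				<-prev // wait for the request ahead of us
			}
			order = append(order, i) // stands in for re-executing the request
			finish()
		}(i)
	}
	wg.Wait()
	return order
}

func main() {
	fmt.Println(runInOrder(5))
}
```

The append to `order` is safe without a mutex here because each goroutine appends before closing its own channel, and the next goroutine appends only after receiving from that channel.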
Previously, high contention on a single key would cause every thread to push the same conflicting transaction then resolve the same intent in parallel. This is inefficient as only one pusher needs to succeed, and only one resolver needs to resolve the intent, and then only one writer should proceed while the other readers/writers should in turn wait on the previous writer by pushing its transaction. This effectively serializes the conflicting reader/writers. One complication is that all pushers which may have a valid, writing transaction (i.e., `Transaction.Key != nil`), must push either the conflicting transaction or another transaction already pushing that transaction. This allows dependency cycles to be discovered.
force-pushed from 7bddd75 to 9a54256
bors r+ |
25014: storage: queue requests to push txn / resolve intents on single keys r=spencerkimball a=spencerkimball
Previously, high contention on a single key would cause every thread to push the same conflicting transaction then resolve the same intent in parallel. This is inefficient as only one pusher needs to succeed, and only one resolver needs to resolve the intent, and then only one writer should proceed while the other readers/writers should in turn wait on the previous writer by pushing its transaction. This effectively serializes the conflicting reader/writers.
One complication is that all pushers which may have a valid, writing transaction (i.e., `Transaction.Key != nil`), must push either the conflicting transaction or another transaction already pushing that transaction. This allows dependency cycles to be discovered.
Fixes #20448
25791: jobs: bump default progress log time to 30s r=mjibson a=mjibson
The previous code allowed updates to be performed every 1s, which could cause the MVCC row to be very large causing problems with splits. We can update much more slowly by default. In the case of a small backup job, the 5% fraction threshold will allow a speedier update rate.
Remove a note that's not useful anymore since the referred function can now only be used in the described safe way.
See #25770. Although this change didn't fix that bug, we still think it's a good idea.
Release note: None
26293: opt: enable a few distsql logictests r=RaduBerinde a=RaduBerinde
- `distsql_indexjoin`: this is only a planning test. Modifying the split points and queries a bit to make the condition more restrictive and make the optimizer choose index joins. There was a single plan that was different, and the difference was minor (the old planner is emitting an unnecessary column).
- `distsql_expr`: logic-only test, enabling for opt.
- `distsql_scrub`: planning test; opt version commented out for now.
Release note: None Co-authored-by: Spencer Kimball <[email protected]> Co-authored-by: Matt Jibson <[email protected]> Co-authored-by: Radu Berinde <[email protected]>
Build succeeded |
@spencerkimball, @nvanbenschoten, can either of you help me write a brief release note for this change? I somehow got left out of the July 2 release notes. |
How's this?
|
Yes. Thanks, @nvanbenschoten! |
Previously, high contention on a single key would cause every thread to
push the same conflicting transaction then resolve the same intent in
parallel. This is inefficient as only one pusher needs to succeed, and
only one resolver needs to resolve the intent, and then only one writer
should proceed while the other readers/writers should in turn wait on
the previous writer by pushing its transaction. This effectively
serializes the conflicting reader/writers.
One complication is that all pushers which may have a valid, writing
transaction (i.e., `Transaction.Key != nil`), must push either the
conflicting transaction or another transaction already pushing that
transaction. This allows dependency cycles to be discovered.
Fixes #20448