kvcoord: add TxnCoordSender method to refresh read spans #68051
Comments
This is related to #24798. |
@andreimatei So it turns out we already have an (unused) method cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go Lines 1276 to 1298 in efefd3d
Since the streamer library will be calling |
What you say makes sense to me.
Taking the timestamp to refresh to as an argument sounds good; if we don't need it, even better I'd say. |
I've looked this over, and from what I can tell we have the main pieces we need already in place. I think the streamer would do something like this:

    br, pErr := leafTxn.Send(ctx, ba)
    if pErr != nil {
        canRefreshTxn, refreshTxn := roachpb.CanTransactionRefresh(ctx, pErr)
        if !canRefreshTxn {
            return pErr.GoError()
        }
        rootTxn.Sender().UpdateRootWithLeafFinalState(ctx, &roachpb.LeafTxnFinalState{
            Txn:          *refreshTxn,
            RefreshSpans: nil, // set appropriate spans
        })
        if err := rootTxn.ManualRefresh(ctx); err != nil {
            return err
        }
    }

As such, I'm going to close this for now, but please reopen (or open a new issue) if you find that this isn't sufficient @yuzefovich. |
Hey @erikgrinaker, I just rebased my prototype for this on top of latest master, and |
Never mind, I found that it was refactored in #73557. |
@erikgrinaker I still need your help after all 😄 My current prototype is something like
Does this look right? What happens to the old |
Hm, I'm not sure if that's going to work. Because Lines 1135 to 1139 in 01cb847
But then cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go Lines 441 to 444 in 52c2df4
@nvanbenschoten Any thoughts on how to accomplish the above following #73557? Do we need to update |
This is interesting. I agree that

    br, pErr := c.Send(ctx, ba)
    if pErr != nil {
        canRefreshTxn, refreshTS := roachpb.TransactionRefreshTimestamp(pErr)
        if !canRefreshTxn {
            return pErr.GoError()
        }
        leafTxnFinalState, err := leafTxn.GetLeafTxnFinalState(ctx)
        if err != nil {
            return err
        }
        rootTxn.Sender().UpdateRootWithLeafFinalState(ctx, leafTxnFinalState)
        if err := rootTxn.ManualRefresh(ctx); err != nil {
            return err
        }
    }

|
Is it on me to try the latest proposal from Nathan and see if it works? Or are you still thinking about how to achieve the manual refresh? |
Sorry, this fell through the cracks. I would go with Nathan's suggestion for now, and I'll look this over real quick when I have a chance. |
Yeah, I had a quick look, and I think that should work -- the leaf txn should have its |
@erikgrinaker @nvanbenschoten Apologies for having to page the context back in, but I finally got to working on this, and maybe the solution proposed by Nathan doesn't quite work (or at least I'm misusing it). In https://github.com/yuzefovich/cockroach/tree/streamer-refresh in the last two commits I'm introducing a test for this behavior as well as a fix (which is currently incomplete, but I don't think it's important). With the snippet similar to #68051 (comment), when trying to re-execute the BatchRequest, I'm getting the same injected RWUI error coming from here. It seems as if the Is it the problem with the test where I inject RWUI error and it is not reflected in the refresh spans of LeafTxnFinalState? Or should I be creating a new LeafTxn once the root txn is refreshed? |
Are you creating a new leaf transaction after the manual refresh? Once a leaf transaction has hit this error, I think it needs to be ingested into the root txn (through Also, could you try changing this line from |
No, I wasn't creating a new leaf and was using the old one. After creating a fresh LeafTxn the retried BatchRequest goes through, but it still has the old timestamp which goes against my intuition. It looks like we probably should not be just ignoring
This doesn't seem to change anything (neither w/ nor w/o re-creating a leaf txn). |
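For readers following along, the retry flow discussed in the last few comments (send through a leaf, fold the leaf's state into the root on a refreshable error, refresh the root, then retry with a fresh leaf) can be sketched as a toy model. Everything here is hypothetical: `rootTxn`, `leafTxn`, `retryErr`, and `readWithRefresh` are stand-ins invented for illustration, timestamps are plain integers, and the refresh always succeeds. It does not use the real kvcoord API and does not try to reproduce the timestamp behavior questioned above.

```go
package main

import (
	"errors"
	"fmt"
)

// All types below are hypothetical stand-ins, not CockroachDB APIs.

// rootTxn models a root transaction with a read timestamp and a refresh footprint.
type rootTxn struct {
	readTS       int
	refreshSpans []string
}

// leafTxn models a leaf created from the root at the root's current read timestamp.
type leafTxn struct {
	readTS int
	spans  []string
}

// retryErr models a refreshable error that carries the timestamp to refresh to.
type retryErr struct{ refreshTS int }

func (e *retryErr) Error() string { return fmt.Sprintf("retry at ts %d", e.refreshTS) }

func (r *rootTxn) newLeaf() *leafTxn { return &leafTxn{readTS: r.readTS} }

// ingest folds the spans read by the leaf into the root's refresh footprint.
func (r *rootTxn) ingest(l *leafTxn) { r.refreshSpans = append(r.refreshSpans, l.spans...) }

// manualRefresh bumps the root's read timestamp. In this toy model it never fails.
func (r *rootTxn) manualRefresh(ts int) {
	if ts > r.readTS {
		r.readTS = ts
	}
}

// send records the key as read and fails if the leaf reads below minTS.
func (l *leafTxn) send(key string, minTS int) error {
	l.spans = append(l.spans, key)
	if l.readTS < minTS {
		return &retryErr{refreshTS: minTS}
	}
	return nil
}

// readWithRefresh retries a read, refreshing the root and recreating the leaf
// after each refreshable error. It returns the number of attempts made.
func readWithRefresh(root *rootTxn, key string, minTS, maxAttempts int) (int, error) {
	leaf := root.newLeaf()
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := leaf.send(key, minTS)
		if err == nil {
			return attempt, nil
		}
		var re *retryErr
		if !errors.As(err, &re) {
			return attempt, err // not refreshable
		}
		root.ingest(leaf)                // fold leaf state into the root
		root.manualRefresh(re.refreshTS) // refresh the root
		leaf = root.newLeaf()            // the old leaf is discarded
	}
	return maxAttempts, errors.New("too many attempts")
}

func main() {
	root := &rootTxn{readTS: 10}
	attempts, err := readWithRefresh(root, "a", 15, 3)
	fmt.Println(attempts, err, root.readTS, root.refreshSpans) // 2 <nil> 15 [a]
}
```

The point of the sketch is only the ordering of steps: the old leaf is never reused after the error, which matches the observation above that the retried request went through only after a fresh leaf was created.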
This goes against my intuition as well. I think it has to do with the We're not doing this in the test, so I think that explains things. However, I'm not sure that we do bump the write timestamp on a With that in mind, it does feel like we should be using the |
I see, I think this makes sense, thanks Nathan. It almost worked right away - I started getting
during |
I don't think that snippet is quite right. The handling of Let me see whether I can clean up this handling of |
Related to cockroachdb#68051. This is a partial reversion of d6ec977 which downgrades the role of `txnSpanRefresher.refreshedTimestamp` back to being used as a sanity check that we don't allow incoherent refresh spans into the refresh footprint. We no longer use the field to determine where to refresh from. Instead, we use the pre-refreshed BatchRequest.Txn.ReadTimestamp to determine the lower-bound of the refresh. This avoids some awkward logic in txnSpanRefresher.SendLocked (e.g. the logic needed in b9fb236). It also avoids the kinds of issues we saw when trying to expand the use of manual refreshing in cockroachdb#68051. Release note: None.
#82649 should avoid the need to hack around the empty |
82649: kv: only use txnSpanRefresher.refreshedTimestamp for assertions r=nvanbenschoten a=nvanbenschoten

Related to #68051. This is a partial reversion of d6ec977 which downgrades the role of `txnSpanRefresher.refreshedTimestamp` back to being used as a sanity check that we don't allow incoherent refresh spans into the refresh footprint. We no longer use the field to determine where to refresh from. Instead, we use the pre-refreshed BatchRequest.Txn.ReadTimestamp to determine the lower-bound of the refresh. This avoids some awkward logic in txnSpanRefresher.SendLocked (e.g. the logic needed in b9fb236). It also avoids the kinds of issues we saw when trying to expand the use of manual refreshing in #68051.

Release note: None.

Co-authored-by: Nathan VanBenschoten
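As a rough illustration of why the lower bound of a refresh matters, here is a toy model of the check a refresh performs: reads made at timestamp `from` stay valid at `to` only if nothing was written to the read keys in the window `(from, to]`. The `write` type and `canRefresh` function are hypothetical and use integer timestamps; this is a sketch of the idea, not the `txnSpanRefresher` implementation.

```go
package main

import "fmt"

// write is a hypothetical committed write at an integer timestamp.
type write struct {
	key string
	ts  int
}

// canRefresh reports whether reads over keys performed at `from` are still
// valid at `to`, i.e. no write to any of those keys falls in (from, to].
// On failure it also returns the first conflicting key found.
func canRefresh(keys []string, writes []write, from, to int) (bool, string) {
	read := make(map[string]bool, len(keys))
	for _, k := range keys {
		read[k] = true
	}
	for _, w := range writes {
		if read[w.key] && w.ts > from && w.ts <= to {
			return false, w.key
		}
	}
	return true, ""
}

func main() {
	writes := []write{{"a", 5}, {"b", 12}, {"c", 30}}

	// No write to "a" or "c" lands in (10, 20], so the refresh succeeds.
	ok, key := canRefresh([]string{"a", "c"}, writes, 10, 20)
	fmt.Printf("%v %q\n", ok, key) // true ""

	// The write to "b" at 12 lands in (10, 20], so the refresh fails.
	ok, key = canRefresh([]string{"a", "b"}, writes, 10, 20)
	fmt.Printf("%v %q\n", ok, key) // false "b"

	// With a lower bound that is too high, the same conflict goes unnoticed.
	ok, key = canRefresh([]string{"a", "b"}, writes, 15, 20)
	fmt.Printf("%v %q\n", ok, key) // true ""
}
```

The third call shows the failure mode a wrong lower bound would cause in this model: starting the window above the timestamp at which the reads actually happened hides a conflicting write.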
To implement the Index Lookups Memory Limits and Parallelism RFC, the streamer will have to use leaf transactions to run parallel requests (as a root `TxnCoordSender` does not support this). This will prevent the use of automatic span refreshing, instead propagating these errors to the client, who will have to retry the transaction.

We can avoid this by adding a method to the `TxnCoordSender` that, given a refresh timestamp, explicitly refreshes the read spans and handles errors as appropriate. The streamer will then coordinate in-flight requests, update the root txn with leaf state via `UpdateRootWithLeafFinalState`, and explicitly submit a refresh. The method should ensure we return appropriate info in any errors propagated to clients (i.e. the original keys that caused the refresh, as well as which keys caused the refresh to fail).

For more details, see the Hiding ReadWithinUncertaintyInterval errors section in the RFC.
/cc @cockroachdb/kv