kv: avoid immediately launching goroutine for txn heartbeat loop #35009
Comments
Does the txnCoordSender have a defined lifecycle? I ask because I'm wondering when cleanup happens for the DelayedAsyncTasks. It's true that we could do it lazily based on context cancellation, but it might be nice to have a mechanism to remove tasks from the delay queue eagerly.
Ignore the above comment, I found what I was looking for.
Yeah, I was thinking we would remove
Fixes cockroachdb#32522.

This change creates a new cluster setting called:

```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight writes that have been pipelined. Once this limit is hit, no more writes will be pipelined by a transaction until some of the writes are proven and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of outstanding writes. We touch upon that in the type definition's comment and in cockroachdb#35009.

Release note: None
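As a loose illustration of the kind of byte budget that setting implies, here is a minimal sketch of a gate on pipelining. The names (`outstandingWriteTracker`, `canPipeline`, etc.) are hypothetical and are not the actual `txnPipeliner` code:

```go
package kv

// outstandingWriteTracker is a hypothetical sketch of a byte budget for
// pipelined writes; the real txnPipeliner implementation differs.
type outstandingWriteTracker struct {
	outstandingBytes int64 // bytes currently dedicated to tracked in-flight writes
	maxBytes         int64 // e.g. kv.transaction.write_pipelining_max_outstanding_size
}

// canPipeline reports whether tracking another write of the given size would
// stay within the configured budget. When it returns false, the write is sent
// without pipelining until proven writes free up room.
func (t *outstandingWriteTracker) canPipeline(keyBytes int64) bool {
	return t.outstandingBytes+keyBytes <= t.maxBytes
}

// track records a newly pipelined write; prove releases its bytes once the
// write has been proven and removed from the outstanding set.
func (t *outstandingWriteTracker) track(keyBytes int64) { t.outstandingBytes += keyBytes }
func (t *outstandingWriteTracker) prove(keyBytes int64) { t.outstandingBytes -= keyBytes }
```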
Before this PR, every transaction would start a goroutine for its heartbeat loop immediately upon creation. Most transactions never need to heartbeat, so this goroutine is wasteful. This change delays the start of the heartbeat loop until after the first heartbeatInterval has passed by using the new DelayedStopper.

Fixes cockroachdb#35009

Release note: None
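A rough sketch of the delayed-start idea, assuming a plain timer rather than the actual DelayedStopper API introduced by the fix: arm a timer for one heartbeat interval, and only start the loop if the transaction is still running when it fires.

```go
package kv

import (
	"context"
	"time"
)

// startDelayedHeartbeat is an illustrative sketch, not the real TxnCoordSender
// code. No goroutine exists until the timer fires, so transactions that finish
// within one heartbeat interval only ever pay for a runtime timer.
func startDelayedHeartbeat(
	ctx context.Context, interval time.Duration, heartbeatLoop func(context.Context),
) (cancel func()) {
	t := time.AfterFunc(interval, func() {
		select {
		case <-ctx.Done():
			// The transaction already finished; skip the heartbeat loop.
		default:
			heartbeatLoop(ctx)
		}
	})
	// The caller stops the timer when the transaction commits or aborts before
	// the first heartbeat is due.
	return func() { t.Stop() }
}
```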
35014: kv: introduce new "max outstanding size" setting to txnPipeliner r=nvanbenschoten a=nvanbenschoten

Fixes #32522. This change creates a new cluster setting called:

```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight writes that have been pipelined. Once this limit is hit, no more writes will be pipelined by a transaction until some of the writes are proven and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of outstanding writes. We touch upon that in the type definition's comment and in #35009.

Release note: None

35199: log: fix remaining misuse of runtime.Callers/runtime.FuncForPC r=nvanbenschoten a=nvanbenschoten

Fixes #17770. This commit fixes the last user of `runtime.Callers` that misused the stdlib function by translating the PC values it returned directly into symbolic information (see https://golang.org/pkg/runtime/#Callers) [1]. Go's documentation warns that this is a recipe for disaster when mixed with mid-stack inlining.

The other concern in #17770 was this comment: #17770 (comment). This was discussed in golang/go#29582 and addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would force an allocation that was hard-earned in #29017. Instead, this commit avoids any performance hit.

```
name                  old time/op    new time/op    delta
Header-4                 267ns ± 1%     268ns ± 0%     ~      (p=0.584 n=10+10)
VDepthWithVModule-4      260ns ± 3%     255ns ± 1%   -1.87%   (p=0.018 n=10+9)

name                  old alloc/op   new alloc/op   delta
Header-4                 0.00B          0.00B          ~      (all equal)
VDepthWithVModule-4      0.00B          0.00B          ~      (all equal)

name                  old allocs/op  new allocs/op  delta
Header-4                  0.00           0.00          ~      (all equal)
VDepthWithVModule-4       0.00           0.00          ~      (all equal)
```

[1] I went through and verified that this was still correct.

Release note: None

35203: closedts: react faster to config changes r=danhhz a=tbg

This approximately halves the duration of `make test PKG=./pkg/ccl/changefeedccl TESTS=/sinkless`, from ~60s to ~30s.

Touches #34455.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Tobias Schottdorf <[email protected]>
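For context on the 35199 change, the inlining-safe pattern the Go documentation points to resolves PCs through `runtime.CallersFrames` rather than mapping them directly with `runtime.FuncForPC`. A generic illustration (not the actual `log` package code):

```go
package main

import (
	"fmt"
	"runtime"
)

// callerInfo reports the function, file, and line of its caller by resolving
// PCs through runtime.CallersFrames, which stays correct under mid-stack
// inlining, unlike translating PCs directly with runtime.FuncForPC.
func callerInfo() (fn, file string, line int) {
	var pcs [3]uintptr
	n := runtime.Callers(2, pcs[:]) // skip runtime.Callers and callerInfo itself
	frames := runtime.CallersFrames(pcs[:n])
	frame, _ := frames.Next()
	return frame.Function, frame.File, frame.Line
}

func main() {
	fn, file, line := callerInfo()
	fmt.Printf("called from %s (%s:%d)\n", fn, file, line)
}
```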
We have marked this issue as stale because it has been inactive for some time.
Still relevant.
I'm taking this off my list.
This was addressed by #67215. |
We currently launch a goroutine immediately for any transaction that doesn't send an `EndTransaction` in the same batch as its first write. This is a little crazy because very few transactions will ever actually run long enough to send even their first heartbeat after 1s.

As a non-invasive way to improve this, we could introduce a new `Stopper.RunDelayedAsyncTask` method which waits for some specified period of time (in a single goroutine) before launching the async task. While waiting, the goroutine coordinating this could monitor context cancellation to remove queued async tasks before they fire off their own goroutine.

Using this approach, we could trigger a delayed async task in the heartbeater when we see the first write and delay the goroutine for an entire heartbeat interval, only paying the cost of the goroutine for transactions that will send at least one heartbeat.
An alternative is to create a single goroutine that handles transmitting all transaction heartbeats, but as has been discussed before, this would require protocol changes and migrations. If we're planning to take on something that large, we should first consider alternative means of tracking transaction liveness, as we discuss here:
cockroach/pkg/kv/txn_interceptor_heartbeater.go, lines 50 to 59 (at b662cb6)
Interestingly, there's already another case that would benefit from this in the `txnPipeliner`. We've always wanted to add a background job that periodically "proves" in-flight writes. Using a `Stopper.RunDelayedAsyncTask` method would allow us to add this job without seeing a performance hit for short-running transactions. This would help us stay beneath the max inflight-write bytes limit we'll need to add for #32522.

cc. @ajwerner. I expect that the internal mechanism here will look very similar to what you added in `requestbatcher`.

Jira issue: CRDB-4614
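To make the proposal above a bit more concrete, here is one rough way a `RunDelayedAsyncTask`-style facility could look: a single coordinating goroutine drains a queue of delayed tasks, drops tasks whose contexts were canceled while queued, and only spawns a per-task goroutine once a delay actually elapses. The names, the heap-based queue, and the missing shutdown path are all illustrative choices, not the eventual Stopper/DelayedStopper implementation:

```go
package stop

import (
	"container/heap"
	"context"
	"time"
)

// delayedTask is an async task that should only get its own goroutine if its
// delay elapses before its context is canceled.
type delayedTask struct {
	ctx context.Context
	at  time.Time
	run func(context.Context)
}

// taskQueue is a min-heap of delayed tasks ordered by fire time.
type taskQueue []*delayedTask

func (q taskQueue) Len() int            { return len(q) }
func (q taskQueue) Less(i, j int) bool  { return q[i].at.Before(q[j].at) }
func (q taskQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *taskQueue) Push(x interface{}) { *q = append(*q, x.(*delayedTask)) }
func (q *taskQueue) Pop() interface{} {
	old := *q
	t := old[len(old)-1]
	*q = old[:len(old)-1]
	return t
}

// DelayedStopper coordinates all delayed tasks from a single goroutine.
type DelayedStopper struct {
	add chan *delayedTask
}

func NewDelayedStopper() *DelayedStopper {
	s := &DelayedStopper{add: make(chan *delayedTask)}
	go s.loop()
	return s
}

// RunDelayedAsyncTask schedules f to run in its own goroutine after delay,
// unless ctx is canceled first, in which case no goroutine is ever started.
func (s *DelayedStopper) RunDelayedAsyncTask(
	ctx context.Context, delay time.Duration, f func(context.Context),
) {
	s.add <- &delayedTask{ctx: ctx, at: time.Now().Add(delay), run: f}
}

func (s *DelayedStopper) loop() {
	var q taskQueue
	timer := time.NewTimer(time.Hour)
	for {
		if len(q) > 0 {
			// Re-arming an active timer is acceptable for a sketch; spurious
			// firings are tolerated because deadlines are re-checked below.
			timer.Reset(time.Until(q[0].at))
		}
		select {
		case t := <-s.add:
			heap.Push(&q, t)
		case <-timer.C:
			now := time.Now()
			for len(q) > 0 && !q[0].at.After(now) {
				t := heap.Pop(&q).(*delayedTask)
				if t.ctx.Err() != nil {
					continue // canceled while queued: drop it, no goroutine
				}
				go t.run(t.ctx) // pay for the goroutine only once the delay elapses
			}
		}
	}
}
```

Note that this sketch only drops canceled tasks lazily, when their deadline comes up; eager removal, as discussed in the comments above, would require the loop to also watch each queued task's `ctx.Done()` or support explicit removal from the queue.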