extend span stats #98490
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.
This is really cool. Nice work on this Thomas!
I've introduced an arbitrary value for `spanRequestLimit` of 1000. I'm not sure if this is excessive or what an appropriate value would be (should this be a cluster setting or a const?). Additionally, it feels like the RPC should be the one to enforce this limit. I've thought of a couple of options, but I'm not sure which would be preferred (alternative options also welcome):
Perhaps it's worth running some timing tests with different levels of `spanRequestLimit` while a cluster is under some sustained load. A non-public cluster setting could make tuning it easier. I agree that 1000 is arbitrary, but I can't make an informed guess without seeing some timings as a function of `spanRequestLimit`. I don't think it needs to be a perfect science, but I'd be interested in seeing the effect of different orders of magnitude. I'm thinking 1e2 to 1e5 🤷
Enforce a strict limit. The RPC checks if the number of spans provided in the request exceeds the limit; if so, it returns an error.
Since the request has a payload (spans), I think it's just way simpler to force the caller to limit the size of their request payload.
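For concreteness, the strict-limit option could look roughly like the sketch below. `spanRequestLimit` and the `Spans` request field come from this discussion; the function name and error text are made up, and the `fmt`/`roachpb` imports are assumed from the surrounding package.

```go
// Sketch only, not the final implementation: reject oversized requests at
// the RPC boundary instead of trusting callers to pre-limit their payloads.
func validateSpanStatsRequest(req *roachpb.SpanStatsRequest, spanRequestLimit int) error {
	if len(req.Spans) > spanRequestLimit {
		return fmt.Errorf(
			"%d spans in request exceeds the limit of %d; batch your requests",
			len(req.Spans), spanRequestLimit)
	}
	return nil
}
```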
I've introduced an arbitrary value for `rangeStatsBatchSize` of 100. Again, not sure if this is excessive or what an appropriate value would be (should this also be a cluster setting?).
I'd suggest the same approach as above.
I'm getting an error with invalid span keys; this is what I see from my prints:
I'll need to take a closer look at this
Tentatively, I've mentioned this before, but what do people think about including node ids and regions from this endpoint?
I don't love it. To be honest, I would defer to others on this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @j82w and @THardy98)
pkg/server/span_stats_server.go
line 209 at r1 (raw file):
```go
	}
	// If we still have some remaining ranges, request range stats for the current batch.
	if len(fullyContainedKeysBatch) > 0 {
```
I think it would be good to add a `flushBatchedContainedKeys` function or something
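Something like this sketch is what I had in mind (the names and the fetch callback are placeholders, assuming the batch slice from the surrounding code):

```go
// Drain whatever remains in the current batch of fully-contained range
// keys, then reset the slice so the caller can keep reusing it.
func flushBatchedContainedKeys(
	ctx context.Context,
	fetchRangeStats func(context.Context, []roachpb.Key) error,
	batch []roachpb.Key,
) ([]roachpb.Key, error) {
	if len(batch) > 0 {
		if err := fetchRangeStats(ctx, batch); err != nil {
			return nil, err
		}
		batch = batch[:0] // keep the allocation for the next batch
	}
	return batch, nil
}
```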
Tentatively, I've mentioned this before, but what do people think about including node ids and regions from this endpoint? I understand there is hesitancy to expose non-tenant-level ideas like node ids, but the alternative we're using at the moment to surface table node IDs/regions on the console is `replicas` and `replica_localities` on the `ranges_no_leases` table (which doesn't actually give us node IDs), and we're already doing the work to gather this information in `nodeIDsAndRangeCountForSpan`.
I would keep the payloads separate since node-level info is gated via tenant capabilities.
In general, I think this change should be vetted from the perspective of upgrades and mixed-version clusters. I'm not entirely sure what our goals are in terms of keeping all the builtins functioning as expected etc. We should either:
- Retain backwards-compatible payloads and process using new code when the `spans` array is nonempty in the request, OR
- Add new endpoints that do things in a new way, and cut over to using them in a separate version.
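The first option could be as small as normalizing legacy requests into the new shape at the top of the handler; a rough sketch (field names follow the proto discussion in this thread, and the key conversion is an assumption for illustration):

```go
// Serve old single-span requests through the new multi-span code path.
func spansFromRequest(req *roachpb.SpanStatsRequest) []roachpb.Span {
	if len(req.Spans) > 0 {
		return req.Spans // new-format request
	}
	// Legacy request from a pre-23.1 node: fall back to the old key fields.
	return []roachpb.Span{{
		Key:    roachpb.Key(req.StartKey),
		EndKey: roachpb.Key(req.EndKey),
	}}
}
```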
Reviewed 4 of 10 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @j82w and @THardy98)
pkg/roachpb/span_stats.proto
line 27 at r1 (raw file):
```proto
string node_id = 1 [(gogoproto.customname) = "NodeID"];
repeated Span spans = 4 [(gogoproto.nullable) = false];
reserved 2, 3;
```
consider keeping `start_key` and `end_key` here and keeping the handler backwards compatible.
pkg/rpc/auth_tenant.go
line 202 at r1 (raw file):
```go
	tenID roachpb.TenantID, args *roachpb.SpanStatsRequest,
) error {
	var err error
```
nit: don't need this var
Added some handling in the RPC to cover mixed-version cases.
From my understanding, there are effectively 2 cases we need to consider:
- When the version of the gateway node is < 23.1 and it fans out to a node >= 23.1:
  - in this case, the v23.1 node needs to recognize that the request payload is limited to a single span, populated using the start/end key proto fields. Additionally, the response payload is again limited to a single span, populated using the old response fields (`range_count`, `approximate_disk_bytes`, `total_stats`) instead of the new `span_to_stats` mapping.
- When the version of the gateway node is >= 23.1 and it fans out to a node < 23.1:
  - in this case, the v23.1 node needs to ensure that it's passing a backwards-compatible request payload to nodes < 23.1. This is done by passing the old request payload using the start/end key proto fields for each span that lives on the node. Additionally, the v23.1 node needs to recognize that the response payload from the < 23.1 nodes will be for a single span using the old response fields, and aggregate them into the final response accordingly (see the sketch below).
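A minimal sketch of the second case, assuming hypothetical helper names and the generated Go field names for the old response fields (`RangeCount`, `ApproximateDiskBytes`, `TotalStats`):

```go
// Fan out to pre-23.1 nodes using one legacy single-span request per span,
// folding each flat response into the new span_to_stats map.
func legacyFanout(
	ctx context.Context,
	spans []roachpb.Span,
	send func(context.Context, *roachpb.SpanStatsRequest) (*roachpb.SpanStatsResponse, error),
	out map[string]*roachpb.SpanStats,
) error {
	for _, sp := range spans {
		res, err := send(ctx, &roachpb.SpanStatsRequest{
			// Old nodes only understand the single-span key fields.
			StartKey: roachpb.RKey(sp.Key),
			EndKey:   roachpb.RKey(sp.EndKey),
		})
		if err != nil {
			return err
		}
		// Old nodes answer with the flat single-span response fields.
		out[sp.String()] = &roachpb.SpanStats{
			RangeCount:           res.RangeCount,
			ApproximateDiskBytes: res.ApproximateDiskBytes,
			TotalStats:           res.TotalStats,
		}
	}
	return nil
}
```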
Added `SpanStatsBatchLimit` and `RangeStatsBatchLimit` cluster settings to `roachpb/span_stats.go`. Callers of `SpanStats` are expected to limit the size of their payload based on the `SpanStatsBatchLimit` cluster setting. I wanted to add them to `span_stats_server.go`, but callers outside the `server` package are likely to run into import cycles. Adding them to the `roachpb` package mitigates these import cycle issues.
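For reference, the registrations would look something like this sketch, assuming the usual `settings.RegisterIntSetting` API; the setting class and the absence of validation here are illustrative, not the final code (defaults match the values described later in this thread):

```go
var SpanStatsBatchLimit = settings.RegisterIntSetting(
	settings.TenantWritable,
	"server.span_stats.span_batch_limit",
	"the maximum number of spans allowed in a request payload for span statistics",
	500,
)

var RangeStatsBatchLimit = settings.RegisterIntSetting(
	settings.TenantWritable,
	"server.span_stats.range_batch_limit",
	"the maximum batch size when fetching range statistics for a span",
	100,
)
```

Callers would then read the current limit via `SpanStatsBatchLimit.Get(&st.SV)` before building their request.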
Regarding the key ordering bug: this looks like a legit bug and needs to be investigated.
@knz @zachlite I am able to reproduce this error on this PR reliably by creating 200 tables using `generate_test_objects` under a single database and calling `crdb_internal.tenant_span_stats(database's id)`. I've noticed that I cannot reproduce this on `master` with the same flow, even when generating more tables (i.e. >200), so it must be something introduced in this PR. Could it be due to conversions between `Span` and `RSpan`?
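As a debugging aid, a check along these lines (the name is made up; it only relies on `bytes` and `fmt`) would confirm whether key ordering is being violated around the conversion:

```go
// Every span must have its start key strictly below its end key; a bad
// Span<->RSpan conversion that reorders or truncates keys would trip this.
func checkSpanKeyOrdering(spans []roachpb.Span) error {
	for _, sp := range spans {
		if bytes.Compare(sp.Key, sp.EndKey) >= 0 {
			return fmt.Errorf("invalid span: start key %s is not below end key %s",
				sp.Key, sp.EndKey)
		}
	}
	return nil
}
```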
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @j82w, and @zachlite)
pkg/roachpb/span_stats.proto
line 27 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
consider keeping `start_key` and `end_key` here and keeping the handler backwards compatible.
I've reinstated all formerly removed proto fields for backwards compatibility.
pkg/rpc/auth_tenant.go
line 202 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
nit: don't need this var
Removed.
pkg/server/span_stats_server.go
line 209 at r1 (raw file):
Previously, zachlite wrote…
I think it would be good to add a `flushBatchedContainedKeys` function or something
Done.
@THardy98 I had a general question: is the intent to get this in for 23.1, or are we targeting post-23.1?
I would like to get this in for 23.1, but it will be tight.
RFAL. I am still apprehensive about the key ordering bug mentioned above and in the PR description.
Extends: cockroachdb#96223

This PR extends the implementation of our SpanStats RPC endpoint to fetch stats for multiple spans at once. By extending the endpoint, we amortize the cost of the RPC's node fanout across all requested spans, whereas previously, we were issuing a fanout per span requested. Additionally, this change batches KV layer requests for ranges fully contained by the span, instead of issuing a request per fully contained range.

Release note: None

https://cockroachlabs.atlassian.net/browse/DOC-1355

Informs: cockroachdb#33316
Epic: CRDB-8035
Closed in favour of: #101378
Extends: #96223
This PR extends the implementation of our `SpanStats` RPC endpoint to fetch stats for multiple spans at once. By extending the endpoint, we amortize the cost of the RPC's node fanout across all requested spans, whereas previously, we were issuing a fanout per span requested. Additionally, this change batches KV layer requests for ranges fully contained by the span, instead of issuing a request per fully contained range.

Changes were made to `span_stats.proto` to accommodate these changes, namely:
- a `spans` field in the `SpanStatsRequest`, to allow for multiple spans
- a `span_to_stats` field in the `SpanStatsResponse`, to allow for multiple span statistics responses, indexed by their corresponding span string

No protobuf fields were removed, and the RPC logic handles both old and new request/response formats in the case of a mixed-version cluster. Requests initiated from a 23.1 node expect the request to use the new format (i.e. use `spans` instead of `StartKey`/`EndKey`), but will handle fanout calls from a 22.2 node using the old format.

This change introduces two cluster settings:
- `server.span_stats.span_batch_limit`: the maximum number of spans allowed in a request payload for span statistics, default value of 500
- `server.span_stats.range_batch_limit`: the maximum batch size when fetching range statistics for a span, default value of 100

Their defaults are conservative from my testing, but can be modified if need be.
Here are some benchmark results from running:
`BENCHTIMEOUT=72h PKG=./pkg/server BENCHES=BenchmarkSpanStats ./scripts/bench HEAD^ HEAD`
Note that HEAD is actually a temp change to revert to the old logic (request per span), and HEAD^ is the new logic (multi-span request).
Some notable improvements, particularly with requests for spans with fewer ranges. After a point, the raw number of ranges becomes the bottleneck, despite reducing the number of fanouts. Not sure if there is a better way to fetch range statistics, but I think the improvement here is enough for this PR. If improvements for fetching range statistics are identified, they can be done in a follow-up PR and backported.
I've added `mixed_version_tenant_span_stats` to test on a mixed-version cluster, but this only tests when the gateway node is 23.1 (as the `tenant_span_stats` builtin only existed on 23.1, and prior to that there was no way to get span stats via SQL that I know of). My thinking is the `TestLocalSpanStats` unit test covers the case where we receive old-format requests coming from a 22.2 node.

NOTE FOR REVIEWERS

`generate_test_objects`