101378: server: support multi-span statistics endpoint r=THardy98 a=THardy98

Epic: None

Extends: #96223

This PR extends the implementation of our SpanStats RPC endpoint to fetch stats for multiple spans at once. By extending the endpoint, we amortize the cost of the RPC's node fanout across all requested spans, whereas previously we issued a fanout per requested span. Additionally, this change batches KV layer requests for ranges fully contained by the span, instead of issuing a request per fully contained range.

Note that we do not deprecate the `start_key` and `end_key` fields, as they're used to determine whether an old node is calling out to a node using the new proto format. The changes here explicitly do not support mixed-version clusters.

----

**BENCHMARK RESULTS**

Here are some benchmark results from running:
```
BENCHTIMEOUT=72h PKG=./pkg/server BENCHES=BenchmarkSpanStats ./scripts/bench HEAD^ HEAD
```

**Note** that `HEAD` is actually a temp change to revert to the old logic (request per span) and `HEAD^` is the new logic (multi-span request), so the `old` columns below measure the new logic and the `new` columns measure the old logic. As such, the _increases_ in latency/memory are actually _reductions_.
```
name                                                                                                              old time/op    new time/op    delta
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_25_ranges_each-24    10.3ms ± 2%    24.5ms ± 2%    +137.38%  (p=0.000 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_50_ranges_each-24    17.1ms ± 2%    31.3ms ± 1%    +83.29%   (p=0.000 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_100_ranges_each-24   30.5ms ± 2%    102.7ms ± 3%   +236.55%  (p=0.000 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_25_ranges_each-24   1.75s ± 5%     2.10s ± 2%     +19.89%   (p=0.000 n=10+8)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_50_ranges_each-24   3.00s ± 1%     3.43s ± 1%     +14.35%   (p=0.000 n=8+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_100_ranges_each-24  5.01s ± 1%     5.53s ± 1%     +10.44%   (p=0.000 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_25_ranges_each-24   9.66s ± 1%     10.63s ± 1%    +10.10%   (p=0.000 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_50_ranges_each-24   15.2s ± 1%     16.2s ± 0%     +6.61%    (p=0.000 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_100_ranges_each-24  17.4s ± 1%     18.6s ± 1%     +7.31%    (p=0.000 n=9+9)

name                                                                                                              old alloc/op   new alloc/op   delta
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_25_ranges_each-24    3.91MB ± 2%    18.55MB ± 1%   +374.43%  (p=0.000 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_50_ranges_each-24    6.95MB ± 2%    21.18MB ± 1%   +204.85%  (p=0.000 n=8+8)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_100_ranges_each-24   13.3MB ± 1%    134.6MB ± 1%   +912.92%  (p=0.000 n=8+8)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_25_ranges_each-24   1.99GB ± 4%    2.27GB ± 4%    +14.11%   (p=0.000 n=8+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_50_ranges_each-24   4.16GB ± 2%    4.43GB ± 3%    +6.57%    (p=0.000 n=9+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_100_ranges_each-24  7.50GB ± 1%    7.75GB ± 1%    +3.27%    (p=0.000 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_25_ranges_each-24   11.8GB ± 0%    12.4GB ± 0%    +4.70%    (p=0.000 n=7+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_50_ranges_each-24   21.1GB ± 2%    21.6GB ± 1%    +2.70%    (p=0.000 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_100_ranges_each-24  25.8GB ± 0%    26.4GB ± 0%    +2.29%    (p=0.000 n=8+10)

name                                                                                                              old allocs/op  new allocs/op  delta
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_25_ranges_each-24    26.9k ± 0%     90.1k ± 2%     +235.04%  (p=0.000 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_50_ranges_each-24    51.8k ± 3%     114.9k ± 1%    +121.89%  (p=0.000 n=8+8)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_10_spans_with_100_ranges_each-24   106k ± 3%      1426k ± 1%     +1240.14% (p=0.000 n=8+8)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_25_ranges_each-24   23.2M ± 5%     23.9M ± 3%     +3.19%    (p=0.003 n=9+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_50_ranges_each-24   48.7M ± 2%     49.4M ± 2%     ~         (p=0.075 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_100_spans_with_100_ranges_each-24  87.9M ± 1%     88.6M ± 1%     ~         (p=0.075 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_25_ranges_each-24   140M ± 0%      142M ± 0%      +1.04%    (p=0.000 n=8+9)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_50_ranges_each-24   248M ± 1%      249M ± 1%      +0.65%    (p=0.001 n=10+10)
SpanStats/3node/BenchmarkSpanStats_-_span_stats_for_3_node_cluster,_collecting_200_spans_with_100_ranges_each-24  306M ± 1%      308M ± 0%      +0.57%    (p=0.002 n=10+9)
```

There are some notable improvements, particularly for requests for spans with fewer ranges. After a point, the raw number of ranges becomes the bottleneck, despite the reduced number of fanouts. Not sure if there is a better way to fetch range statistics, but I think the improvement here is enough for this PR. If improvements for fetching range statistics are identified, they can be done in a follow-up PR and backported.

----

Release note (sql change): span statistics are unavailable on mixed-version clusters

101481: ui: add KeyedCachedDataReducer selector factory util, refactor jobs page props to use RequestState r=xinhaoz a=xinhaoz

See individual commits.

Epic: none

Release note: None

Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
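
For readers of #101378, the batching of fully contained ranges can be illustrated with a minimal sketch. This is not the code from the PR: `Span`, `plan`, `planSpan`, and `kvRequests` are hypothetical names, keys are plain strings rather than byte slices, and the handling of ranges that only partially overlap the span (one clipped request each) is an assumption, since the description above covers only the fully contained case.

```go
package main

// Span is a half-open key interval [Start, End).
// String keys are a simplification made for this sketch.
type Span struct {
	Start, End string
}

// contains reports whether s fully contains o.
func (s Span) contains(o Span) bool {
	return s.Start <= o.Start && o.End <= s.End
}

// overlaps reports whether s and o share at least one key.
func (s Span) overlaps(o Span) bool {
	return s.Start < o.End && o.Start < s.End
}

// plan describes the KV work for one requested span (hypothetical type).
type plan struct {
	Batched []Span // ranges fully inside the span, sent as one batched request
	Partial []Span // boundary ranges clipped to the span, one request each (assumed)
}

// planSpan splits the ranges touching span into the batched set and the
// partially overlapping set. Ranges that do not touch the span are ignored.
func planSpan(span Span, ranges []Span) plan {
	var p plan
	for _, r := range ranges {
		switch {
		case span.contains(r):
			p.Batched = append(p.Batched, r)
		case span.overlaps(r):
			// Clip the boundary range so only keys inside the span are counted.
			if r.Start < span.Start {
				r.Start = span.Start
			}
			if r.End > span.End {
				r.End = span.End
			}
			p.Partial = append(p.Partial, r)
		}
	}
	return p
}

// kvRequests counts the requests the plan issues: one for the whole batch
// (if non-empty) plus one per partially overlapping range.
func (p plan) kvRequests() int {
	n := len(p.Partial)
	if len(p.Batched) > 0 {
		n++
	}
	return n
}
```

With ranges `[a,c)`, `[c,e)`, `[e,g)`, `[g,i)` and a requested span `[b,h)`, this sketch batches the two middle ranges and issues separate requests for the two clipped boundary ranges, so three requests are issued instead of one per range.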