spanconfig: scale-test for 1M tables #78215

Closed

Conversation

@irfansharif (Contributor) commented Mar 22, 2022

This PR adds an end-to-end scale test for the span configs
infrastructure, demonstrating the ability to get up to 1M tables, with
configs reconciled end-to-end.

dev test pkg/spanconfig -f=TestScalability -v \
  --show-logs --timeout=1000s --stream-output

From test logs:

finished epoch=9 (took 39.356894s); running total: 1000000 tables in 5m37.846084s
...
end-to-end span config reconciliation (post-creation) for 1000048 tables took 1m40.496927s

Release justification: test-only change
Release note: None
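
To make the epoch structure in those logs concrete, here is a minimal sketch of what such a test loop could look like. The helper createNTables and the constants are illustrative assumptions, not the actual test code:

    package spanconfig_test

    import (
    	"testing"
    	"time"
    )

    // createNTables is a hypothetical helper that commits n CREATE TABLE
    // statements against the test cluster; batching strategies are
    // discussed further down in this thread.
    func createNTables(t *testing.T, n int) {
    	// ... issue batched CREATE TABLE statements ...
    }

    // runEpochs creates tables in fixed-size epochs and logs a running
    // total, mirroring the "finished epoch=N" lines in the test output.
    func runEpochs(t *testing.T) {
    	const numEpochs, tablesPerEpoch = 10, 100_000
    	start := time.Now()
    	for epoch := 0; epoch < numEpochs; epoch++ {
    		epochStart := time.Now()
    		createNTables(t, tablesPerEpoch)
    		t.Logf("finished epoch=%d (took %s); running total: %d tables in %s",
    			epoch, time.Since(epochStart),
    			(epoch+1)*tablesPerEpoch, time.Since(start))
    	}
    }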


@irfansharif irfansharif changed the title from "[wip] spanconfig: scale-test for 1M tables" to "spanconfig: scale-test for 1M tables" on Apr 5, 2022
We added ForEachOverlappingSpanConfig to the Store interface to be able
to mock it out for testing. We can instead use a testing knob. A future
commit will add test-only helpers on the concrete store implementation
and add plumbing to access it through the concrete kvsubscriber type.
This PR will let us do so without expanding the Store interface further.

Release justification: low risk, high benefit
Release note: None
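
For context, the testing-knob pattern the commit message refers to usually looks something like the following minimal sketch (assuming roachpb.Span and roachpb.SpanConfig); the names are illustrative, not the actual spanconfig code:

    // TestingKnobs holds test-only hooks; nil fields are ignored in
    // production, so the Store interface itself doesn't need to grow.
    type TestingKnobs struct {
    	// OnOverlappingSpanConfig, if set, is invoked for each span config
    	// overlapping an update, standing in for the interface method that
    	// existed only to be mocked out.
    	OnOverlappingSpanConfig func(span roachpb.Span, conf roachpb.SpanConfig)
    }

    // Store is the concrete span config store implementation.
    type Store struct {
    	knobs *TestingKnobs
    	// ... production state ...
    }

    func (s *Store) apply(sp roachpb.Span, conf roachpb.SpanConfig) {
    	if s.knobs != nil && s.knobs.OnOverlappingSpanConfig != nil {
    		s.knobs.OnOverlappingSpanConfig(sp, conf) // test hook
    	}
    	// ... actual update logic ...
    }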
@irfansharif irfansharif force-pushed the 220321.scalability-test branch from 8a3e6dc to cd6b990 on April 14, 2022 18:02
Pure code-movement change.

Release note: None
Release justification: None
@irfansharif irfansharif force-pushed the 220321.scalability-test branch 2 times, most recently from 8640356 to 93bf765 on April 14, 2022 19:52
@irfansharif irfansharif marked this pull request as ready for review April 14, 2022 20:08
@irfansharif irfansharif requested review from a team as code owners April 14, 2022 20:08
@irfansharif irfansharif requested review from a team and stevendanna and removed request for a team April 14, 2022 20:08
@irfansharif (Contributor, Author) commented Apr 14, 2022

Here are selected logs from one of these test runs (raw gist here). A few things to understand:

  • finished epoch=2 (took 30.139075s); running total: 300000 tables in 1m27.409066s -- the test creates tables in epochs of 100k;
  • sqlwatcher received 100000 updates is when the spanconfig.SQLWatcher receives a batch of updates from its rangefeeds over system.{descriptor,zones}; it can be pretty slow to react to all the committed tables (not sure why -- heavy write activity on those tables?);
  • sqltranslator translated 100000 records in 48.932576s is how long the spanconfig.SQLTranslator takes to hydrate each updated descriptor's zone config (by far the slowest part of the system);
  • incrementally reconciled: +100000 -0 entries in 3.103309s covers the reconciler + kvaccessor, which compute diffs and persist the results in system.span_configurations -- pretty fast;
  • spanconfig-subscriber: established range feed cache usually appears after a buffer overflow on the subscriber once it receives a huge chunk of updates; it recovers by the time we see spanconfig-subscriber: completed initial scan; flushed 1000048 events -- fast.
I220414 19:40:33.333954 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 80  finished epoch=0 (took 26.451988s); running total: 100000 tables in 26.451988s
I220414 19:40:36.468673 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 86  sqlwatcher received 100000 updates
I220414 19:41:04.151885 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 89  finished epoch=1 (took 30.81778s); running total: 200000 tables in 57.26992s
I220414 19:41:25.401317 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 94  sqltranslator translated 100000 records in 48.932576s
I220414 19:41:28.504736 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 95  incrementally reconciled: +100000 -0 entries in 3.103309s
I220414 19:41:28.692836 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 97  spanconfig-subscriber: established range feed cache
I220414 19:41:29.251784 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 98  sqlwatcher received 100000 updates
I220414 19:41:29.790269 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 99  spanconfig-subscriber: completed initial scan; flushed 100048 events
I220414 19:41:34.291031 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 100  finished epoch=2 (took 30.139075s); running total: 300000 tables in 1m27.409066s
I220414 19:42:03.261744 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 107  finished epoch=3 (took 28.97064s); running total: 400000 tables in 1m56.379781s
I220414 19:42:15.608217 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 114  sqltranslator translated 100000 records in 46.356368s
I220414 19:42:18.681697 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 115  incrementally reconciled: +100000 -0 entries in 3.07341s
I220414 19:42:18.891866 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 117  spanconfig-subscriber: established range feed cache
I220414 19:42:20.041115 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 118  sqlwatcher received 200000 updates
I220414 19:42:20.069966 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 119  spanconfig-subscriber: completed initial scan; flushed 200048 events
I220414 19:42:33.627004 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 120  finished epoch=4 (took 30.365189s); running total: 500000 tables in 2m26.74504s
I220414 19:43:03.752912 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 125  finished epoch=5 (took 30.125833s); running total: 600000 tables in 2m56.870947s
I220414 19:43:43.059044 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 135  finished epoch=6 (took 39.306052s); running total: 700000 tables in 3m36.177081s
I220414 19:44:06.163368 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 143  sqltranslator translated 200000 records in 1m46.122173s
I220414 19:44:12.341914 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 144  incrementally reconciled: +200000 -0 entries in 6.178454s
I220414 19:44:12.531552 264 kv/kvserver/rangefeed/budget.go:166  [n1,s1,r43/1:/Table/{47-50},raft] 146  trying to get allocation from already closed budget
I220414 19:44:12.708052 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 147  spanconfig-subscriber: established range feed cache
I220414 19:44:14.339386 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 149  sqlwatcher received 300000 updates
I220414 19:44:16.904133 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 150  spanconfig-subscriber: completed initial scan; flushed 400048 events
I220414 19:44:25.366817 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 155  finished epoch=7 (took 42.307692s); running total: 800000 tables in 4m18.484853s
I220414 19:45:05.371089 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 165  finished epoch=8 (took 40.004198s); running total: 900000 tables in 4m58.489124s
I220414 19:45:44.728048 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 178  finished epoch=9 (took 39.356894s); running total: 1000000 tables in 5m37.846084s
I220414 19:46:08.250326 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 209  sqltranslator translated 300000 records in 1m53.910851s
I220414 19:46:16.113606 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 213  incrementally reconciled: +300000 -0 entries in 7.863214s
I220414 19:46:16.641413 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 215  spanconfig-subscriber: established range feed cache
I220414 19:46:19.236393 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 217  sqlwatcher received 300000 updates
I220414 19:46:21.857409 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 219  spanconfig-subscriber: completed initial scan; flushed 700048 events
I220414 19:47:10.289037 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 241  sqltranslator translated 300000 records in 51.052571s
I220414 19:47:18.256947 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 244  incrementally reconciled: +300000 -0 entries in 7.96784s
I220414 19:47:19.286002 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 247  spanconfig-subscriber: established range feed cache
I220414 19:47:25.066720 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 250  spanconfig-subscriber: completed initial scan; flushed 1000048 events

Just looking at individual components:

# SQLTranslator (slow)
I220414 19:41:25.401317 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 94  sqltranslator translated 100000 records in 48.932576s
I220414 19:42:15.608217 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 114  sqltranslator translated 100000 records in 46.356368s
I220414 19:44:06.163368 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 143  sqltranslator translated 200000 records in 1m46.122173s
I220414 19:46:08.250326 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 209  sqltranslator translated 300000 records in 1m53.910851s
I220414 19:47:10.289037 551 spanconfig/spanconfigreconciler/reconciler.go:453  [n1,job=753319302814400513] 241  sqltranslator translated 300000 records in 51.052571s

# Reconciler + KVAccessor (fast)
I220414 19:41:28.504736 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 95  incrementally reconciled: +100000 -0 entries in 3.103309s
I220414 19:42:18.681697 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 115  incrementally reconciled: +100000 -0 entries in 3.07341s
I220414 19:44:12.341914 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 144  incrementally reconciled: +200000 -0 entries in 6.178454s
I220414 19:46:16.113606 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 213  incrementally reconciled: +300000 -0 entries in 7.863214s
I220414 19:47:18.256947 551 spanconfig/spanconfigreconciler/reconciler.go:487  [n1,job=753319302814400513] 244  incrementally reconciled: +300000 -0 entries in 7.96784s

# KVSubscriber (fast -- compare timestamps on successive lines)
I220414 19:41:28.692836 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 97  spanconfig-subscriber: established range feed cache
I220414 19:41:29.790269 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 99  spanconfig-subscriber: completed initial scan; flushed 100048 events
I220414 19:42:18.891866 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 117  spanconfig-subscriber: established range feed cache
I220414 19:42:20.069966 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 119  spanconfig-subscriber: completed initial scan; flushed 200048 events
I220414 19:44:12.708052 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 147  spanconfig-subscriber: established range feed cache
I220414 19:44:16.904133 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 150  spanconfig-subscriber: completed initial scan; flushed 400048 events
I220414 19:46:16.641413 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 215  spanconfig-subscriber: established range feed cache
I220414 19:46:21.857409 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 219  spanconfig-subscriber: completed initial scan; flushed 700048 events
I220414 19:47:19.286002 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:316  [n1] 247  spanconfig-subscriber: established range feed cache
I220414 19:47:25.066720 375 kv/kvclient/rangefeed/rangefeedcache/watcher.go:357  [n1] 250  spanconfig-subscriber: completed initial scan; flushed 1000048 events

# SQLWatcher (maybe slow? compare timestamps between when updates are received and when the corresponding # of tables were committed)
I220414 19:40:33.333954 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 80  finished epoch=0 (took 26.451988s); running total: 100000 tables in 26.451988s
I220414 19:40:36.468673 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 86  sqlwatcher received 100000 updates
I220414 19:41:04.151885 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 89  finished epoch=1 (took 30.81778s); running total: 200000 tables in 57.26992s
I220414 19:41:29.251784 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 98  sqlwatcher received 100000 updates
I220414 19:41:34.291031 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 100  finished epoch=2 (took 30.139075s); running total: 300000 tables in 1m27.409066s
I220414 19:42:03.261744 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 107  finished epoch=3 (took 28.97064s); running total: 400000 tables in 1m56.379781s
I220414 19:42:20.041115 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 118  sqlwatcher received 200000 updates
I220414 19:42:33.627004 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 120  finished epoch=4 (took 30.365189s); running total: 500000 tables in 2m26.74504s
I220414 19:43:03.752912 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 125  finished epoch=5 (took 30.125833s); running total: 600000 tables in 2m56.870947s
I220414 19:43:43.059044 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 135  finished epoch=6 (took 39.306052s); running total: 700000 tables in 3m36.177081s
I220414 19:44:14.339386 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 149  sqlwatcher received 300000 updates
I220414 19:44:25.366817 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 155  finished epoch=7 (took 42.307692s); running total: 800000 tables in 4m18.484853s
I220414 19:45:05.371089 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 165  finished epoch=8 (took 40.004198s); running total: 900000 tables in 4m58.489124s
I220414 19:45:44.728048 25 spanconfig_test/pkg/spanconfig/spanconfig_test.go:107  [-] 178  finished epoch=9 (took 39.356894s); running total: 1000000 tables in 5m37.846084s
I220414 19:46:19.236393 551 spanconfig/spanconfigreconciler/reconciler.go:405  [n1,job=753319302814400513] 217  sqlwatcher received 300000 updates
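
For reference, the "end-to-end span config reconciliation ... took ..." measurement quoted in the PR description could be taken with a poll along these lines. This is a sketch assuming the usual testutils/sqlutils helpers and an illustrative expectedEntries; the real test would also need a longer timeout than testutils.SucceedsSoon's default:

    start := timeutil.Now()
    testutils.SucceedsSoon(t, func() error {
    	// Reconciliation is done once every table (plus the handful of
    	// system spans) is reflected in system.span_configurations.
    	var n int
    	sqlDB.QueryRow(t, `SELECT count(*) FROM system.span_configurations`).Scan(&n)
    	if n < expectedEntries {
    		return errors.Newf("expected %d entries, found %d", expectedEntries, n)
    	}
    	return nil
    })
    t.Logf("end-to-end span config reconciliation for %d tables took %s",
    	expectedEntries, timeutil.Since(start))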

@irfansharif irfansharif requested review from nvanbenschoten, ajwerner and arulajmani and removed request for a team and stevendanna April 14, 2022 20:09
@irfansharif (Contributor, Author)

I'm hoping to check this PR in, though with the test skipped given how long-running it is. It's mostly a test-only change plus knob tuning that I think makes for reasonable defaults. It also includes a few peephole optimizations around the sqlwatcher, which shows up prominently in CPU profiles and in this end-to-end test. I'll hide the additional logging behind a high verbosity flag once LGTM-ed. PTAL.

@irfansharif (Contributor, Author)

Also open to ideas from @cockroachdb/sql-schema on how to speed up committing a high number of tables. I tried a few approaches (large strings, large txns, etc.) and found this to be the fastest; one such approach is sketched below. Given that table creation alone takes a non-trivial amount of time (1000000 tables in 5m37.846084s), speeding it up would be nice.
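
For concreteness, here's a sketch of one "large strings" style approach: concatenate many CREATE TABLE statements and issue them in a single Exec to amortize round trips. Batch size, database, and naming are illustrative; the PR's actual approach may differ:

    func createTablesBatched(t *testing.T, sqlDB *sqlutils.SQLRunner, start, count int) {
    	var sb strings.Builder
    	for i := 0; i < count; i++ {
    		// Tiny schemas keep the write volume dominated by per-table
    		// descriptor/zones overhead rather than column metadata.
    		fmt.Fprintf(&sb, "CREATE TABLE defaultdb.t%d (k INT PRIMARY KEY);", start+i)
    	}
    	sqlDB.Exec(t, sb.String()) // one round trip for the whole batch
    }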

@irfansharif irfansharif force-pushed the 220321.scalability-test branch from 93bf765 to cc2fae5 on April 18, 2022 14:52
@irfansharif (Contributor, Author)

Added an incremental reconciliation step to the test, and I'm seeing what looks like a surprising perf profile in the kvaccessor when batching deletes (total = 100k entries, batch size of 10k):

I220418 14:44:27.941902 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 161  kvaccessor delete batch for 10000 entries took 21.935536s
I220418 14:44:47.629716 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 182  kvaccessor delete batch for 10000 entries took 19.687731s
I220418 14:45:05.049859 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 201  kvaccessor delete batch for 10000 entries took 17.420078s
I220418 14:45:20.099903 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 217  kvaccessor delete batch for 10000 entries took 15.049966s
I220418 14:45:32.870542 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 231  kvaccessor delete batch for 10000 entries took 12.770569s
I220418 14:45:43.363427 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 243  kvaccessor delete batch for 10000 entries took 10.492817s
I220418 14:45:51.714540 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 253  kvaccessor delete batch for 10000 entries took 8.351056s
I220418 14:45:57.768193 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 260  kvaccessor delete batch for 10000 entries took 6.053576s
I220418 14:46:01.508711 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 265  kvaccessor delete batch for 10000 entries took 3.740439s
I220418 14:46:02.982147 510 spanconfig/spanconfigkvaccessor/kvaccessor.go:277  [n1,job=754393361738629121] 267  kvaccessor delete batch for 10000 entries took 1.473374s

Is there an easy way to get an EXPLAIN {,ANALYZE} trace out of internal queries like this? They're issued through the internal executor in the example above.
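
One ad-hoc option while debugging (not an established facility; this assumes the internal executor's QueryBufferedEx of this era, and deleteStmt/args stand in for the real statement) is to prefix the statement with EXPLAIN ANALYZE and log the plan rows:

    rows, err := ie.QueryBufferedEx(
    	ctx, "debug-delete-span-cfgs", txn,
    	sessiondata.NodeUserSessionDataOverride,
    	"EXPLAIN ANALYZE "+deleteStmt, args...,
    )
    if err != nil {
    	return err
    }
    for _, row := range rows {
    	// Each row is one line of the plan, annotated with execution stats.
    	log.Infof(ctx, "%v", row)
    }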

@irfansharif (Contributor, Author)

Here's the trace for the slow statement: https://gist.github.com/irfansharif/5d00600de25b747f0c8348d598675e59. I'm not sure where the latency spikes are coming from. Is it flow scheduling?

  1661.032ms      0.258ms                    event:kv/kvserver/pkg/kv/kvserver/replica_raft.go:114 [n1,s1,r43/1:/Table/{47-50}] evaluated request
  1661.039ms      0.007ms                    event:kv/kvserver/pkg/kv/kvserver/replica_raft.go:158 [n1,s1,r43/1:/Table/{47-50}] proposing command to write 0 new keys, 8878 new values, 8878 new intents, write batch size=1331386 bytes
  1662.004ms      0.965ms                    event:kv/kvserver/pkg/kv/kvserver/replica_raft.go:440 [n1,s1,r43/1:/Table/{47-50}] proposal is large: 1.8 MiB
  1461.341ms   -200.664ms                    structured:{"@type":"type.googleapis.com/cockroach.roachpb.ScanStats","numInterfaceSeeks":"1","numInternalSeeks":"1","numInterfaceSteps":"0","numInternalSteps":"0","blockBytes":"8538","blockBytesInCache":"8538","keyBytes":"9","valueBytes":"19","pointCount":"1","pointsCoveredByRangeTombstones":"0"}
  1663.354ms    202.013ms                        === operation:local proposal _verbose:1 node:1 raft: range:43/1:/Table/{47-50} store:1
  1664.813ms      1.460ms                        event:kv/kvserver/pkg/kv/kvserver/replica_application_state_machine.go:496 [n1,s1,r43/1:/Table/{47-50},raft] applying command
  1675.646ms     10.833ms                        event:kv/kvserver/pkg/kv/kvserver/replica_application_state_machine.go:1173 [n1,s1,r43/1:/Table/{47-50},raft] LocalResult (reply: (err: <nil>), *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, ... 8853 skipped ..., *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, *roachpb.DeleteResponse, #encountered intents: 0, #acquired locks: 8878, #resolved locks: 0#updated txns: 0 #end txns: 0, GossipFirstRange:false MaybeGossipSystemConfig:false MaybeGossipSystemConfigIfHaveFailure:false MaybeAddToSplitQueue:false MaybeGossipNodeLiveness:<nil>
 22479.576ms  22436.279ms        event:component:<flow_id:<e6ff4146-40db-475f-a2a6-056272493038> type:PROCESSOR id:0 sql_instance_id:1 > net_rx:<latency:<> wait_time:<> deserialization_time:<> tuples_received:<> bytes_received:<> messages_received:<> > net_tx:<tuples_sent:<> bytes_sent:<> messages_sent:<> > kv:<bytes_read:<value_plus_one:6271885 > tuples_read:<value_plus_one:100049 > kv_time:<value_plus_one:<nanos:445616001 > > contention_time:<value_plus_one:<nanos:1 > > num_interface_steps:<value_plus_one:1 > num_internal_steps:<value_plus_one:200001 > num_interface_seeks:<value_plus_one:200099 > num_internal_seeks:<value_plus_one:100051 > > exec:<exec_time:<> max_allocated_mem:<value_plus_one:6354105 > max_allocated_disk:<> > output:<num_batches:<value_plus_one:108 > num_tuples:<value_plus_one:100049 > > flow_stats:<max_mem_usage:<> max_disk_usage:<> >
 22479.612ms      0.036ms        event:component:<flow_id:<00000000-0000-0000-0000-000000000000> type:UNSET id:0 sql_instance_id:0 > net_rx:<latency:<> wait_time:<> deserialization_time:<> tuples_received:<> bytes_received:<> messages_received:<> > net_tx:<tuples_sent:<> bytes_sent:<> messages_sent:<> > kv:<bytes_read:<> tuples_read:<> kv_time:<> contention_time:<> num_interface_steps:<> num_internal_steps:<> num_interface_seeks:<> num_internal_seeks:<> > exec:<exec_time:<> max_allocated_mem:<> max_allocated_disk:<> > output:<num_batches:<> num_tuples:<> > flow_stats:<max_mem_usage:<> max_disk_usage:<> >
 22479.636ms      0.024ms        event:component:<flow_id:<e6ff4146-40db-475f-a2a6-056272493038> type:PROCESSOR id:1 sql_instance_id:1 > net_rx:<latency:<> wait_time:<> deserialization_time:<> tuples_received:<> bytes_received:<> messages_received:<> > net_tx:<tuples_sent:<> bytes_sent:<> messages_sent:<> > kv:<bytes_read:<> tuples_read:<> kv_time:<> contention_time:<> num_interface_steps:<> num_internal_steps:<> num_interface_seeks:<> num_internal_seeks:<> > exec:<exec_time:<value_plus_one:<seconds:21 nanos:758852001 > > max_allocated_mem:<> max_allocated_disk:<> > output:<num_batches:<value_plus_one:15 > num_tuples:<value_plus_one:10001 > > flow_stats:<max_mem_usage:<> max_disk_usage:<> >
 22479.664ms      0.028ms        event:component:<flow_id:<00000000-0000-0000-0000-000000000000> type:UNSET id:0 sql_instance_id:0 > net_rx:<latency:<> wait_time:<> deserialization_time:<> tuples_received:<> bytes_received:<> messages_received:<> > net_tx:<tuples_sent:<> bytes_sent:<> messages_sent:<> > kv:<bytes_read:<> tuples_read:<> kv_time:<> contention_time:<> num_interface_steps:<> num_internal_steps:<> num_interface_seeks:<> num_internal_seeks:<> > exec:<exec_time:<> max_allocated_mem:<> max_allocated_disk:<> > output:<num_batches:<> num_tuples:<> > flow_stats:<max_mem_usage:<> max_disk_usage:<> >
 22479.701ms      0.022ms            === operation:txn coordinator send _verbose:1 intExec:delete-span-cfgs job:754398081661698049 node:1 txnID:a18d20a2-2917-4bf3-8705-5d00639ea2be
 22479.904ms      0.203ms                === operation:dist sender send _verbose:1 intExec:delete-span-cfgs job:754398081661698049 node:1 txn:a18d20a2
 22479.965ms      0.076ms                event:kv/kvclient/kvcoord/range_iter.go:175 [n1,job=754398081661698049,intExec=delete-span-cfgs,txn=a18d20a2] querying next range at /Table/47/1/"\xf7#\x18"/0

@irfansharif irfansharif force-pushed the 220321.scalability-test branch from cc2fae5 to 9780ef9 on April 18, 2022 15:22
@irfansharif (Contributor, Author)

 missing stats
 table: span_configurations@primary
 spans: FULL SCAN

A hacked-in EXPLAIN statement suggests it's due to us falling back to full table scans because of missing stats. In the test I tried an ANALYZE system.span_configurations to seed statistics, but that didn't help either. I'm not sure what else I'd need to do to seed the table with enough statistics to avoid falling back to the full table scan. Is it that the statistics aren't fine-grained enough for the optimizer to use constrained index scans? Something else?
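
For concreteness, the seeding attempt plus a plan check might look like this sketch (key literals are illustrative; ANALYZE is roughly an alias for CREATE STATISTICS in CockroachDB):

    // Manually seed stats for the system table.
    sqlDB.Exec(t, `CREATE STATISTICS span_cfg_stats FROM system.span_configurations`)

    // Check whether the delete now gets a constrained scan rather than a
    // FULL SCAN of span_configurations@primary.
    for _, row := range sqlDB.QueryStr(t, `
        EXPLAIN DELETE FROM system.span_configurations
         WHERE start_key >= b'\x01' AND start_key < b'\xff'`) {
    	t.Log(row)
    }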

@ajwerner (Contributor)

We never collect stats on system tables.

@irfansharif irfansharif force-pushed the 220321.scalability-test branch from 9780ef9 to e5b8d33 on April 18, 2022 17:05
@irfansharif irfansharif force-pushed the 220321.scalability-test branch 2 times, most recently from 513bc23 to 85c3046 on April 18, 2022 18:23
@irfansharif (Contributor, Author)

In the test I tried an ANALYZE system.span_configurations to try and seed it with statistics, but that didn’t help either.

Filed #80123. After manually creating stats and making a SQL code change to actually use them for system.span_configurations specifically (not something we want to check in; it probably wants the fuller treatment over in #80123), I'm seeing query plans that use the computed statistics.

Added an incremental reconciliation step to the test, and seeing what looks like a surprising perf profile in the kvaccessor when batching deletes (total = 100k entries, batch size of 10k)

This problem goes away once we use table stats; we no longer fall back to full table scans when deleting span configs. It'd be nice to work on #80123 soon, and perhaps also to backport a minimal form to 22.1.x.

Timing the reconciler + kvaccessor again with system table stats after altering RANGE DEFAULT:

incrementally reconciled: +1000000 -1000000 entries in 3m31.53276s

It's still a bit slower than the initial reconciliation pass:

initial span config reconciliation for 1000048 tables took 1m42.876005s

The additional time can be attributed to all the entries we have to delete (in the initial pass we blindly upsert): the incremental pass touches roughly 2x the entries and takes roughly 2x as long. The end-to-end time after the change to RANGE DEFAULT is still dominated by the sql translator:

sqltranslator translated 1000000 records in 5m37.219766s

And some instrumentation around kvaccessor delete batches, which has a
surprising perf profile:

   kvaccessor delete batch for 10000 entries took 21.935536s
   kvaccessor delete batch for 10000 entries took 19.687731s
   kvaccessor delete batch for 10000 entries took 17.420078s
   kvaccessor delete batch for 10000 entries took 15.049966s
   kvaccessor delete batch for 10000 entries took 12.770569s
   kvaccessor delete batch for 10000 entries took 10.492817s
   kvaccessor delete batch for 10000 entries took 8.351056s
   kvaccessor delete batch for 10000 entries took 6.053576s
   kvaccessor delete batch for 10000 entries took 3.740439s
   kvaccessor delete batch for 10000 entries took 1.473374s

Release note: None
@irfansharif (Contributor, Author)

Where the additional time can be attributed to all the entries we have to delete (in the initial pass we blindly upsert).

#80189 is a possible optimization that, in the run above, would make the incremental pass as fast as the initial one.

@irfansharif irfansharif marked this pull request as draft May 10, 2022 16:37
@irfansharif (Contributor, Author)

I'm not going to have time to take this further. The results above were still useful: they drove targeted optimizations and hinted at others. Perhaps once the schema folks are interested in advertising greater numbers of tables, we can revive (or rewrite) a form of this test.
