
roachtest: replicate/wide failed #98385

Closed
cockroach-teamcity opened this issue Mar 10, 2023 · 6 comments · Fixed by #98422
Assignees
Labels
A-kv-distribution Relating to rebalancing and leasing. branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). GA-blocker O-roachtest O-robot Originated from a bot. T-kv KV Team
Milestone

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Mar 10, 2023

roachtest.replicate/wide failed with artifacts on master @ d4a584e49f0b1ca89738376090939d7669c3b3db:

test artifacts and logs in: /artifacts/replicate/wide/run_1
(allocator.go:440).runWideReplication: expected 0 mis-replicated ranges, but found 19

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=1 , ROACHTEST_encrypted=false , ROACHTEST_fs=ext4 , ROACHTEST_localSSD=true , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/kv-triage

This test on roachdash

Jira issue: CRDB-25240

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Mar 10, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.1 milestone Mar 10, 2023
@blathers-crl blathers-crl bot added the T-kv KV Team label Mar 10, 2023
@kvoli
Collaborator

kvoli commented Mar 10, 2023

xref #97794

@kvoli
Collaborator

kvoli commented Mar 10, 2023

I've tried reproducing this 5 times and haven't been able to.

There are a couple of issues here, but the cause of this failure is the default span config (3x replication) being used. This only occurs for a short period right after the circuit breakers reset and a connection is established to the rest of the nodes. The default span config causes the replicate queue to remove voters, despite the configured replication factor being 9x. Voter removal stops once the range sees its correct span config rather than the default (3x).

Whilst I wasn't able to reproduce the failure, I was able to reproduce the default span config being seen for a short period (much shorter than in the test failure).

Consider this timeline of events:

restart node
circuit breakers trip
circuit breakers reset 
gossip connection established and liveness established
default span config applied
... remove voter actions returned.
... remove voter actions acted upon causing 8x replication instead of 9x
spanconfig-subscriber established
... consider rebalance actions or add voter returned (for replicas removed)

The default replication factor can be seen in the distribution logging, specifically ‹remove voter› - need=3, have=9.

Distribution Logs
I230310 22:07:04.281788 13756 13@kv/kvserver/replicate_queue.go:626 ⋮ [T1,n3,replicate,s3,r24/10:‹/Table/2{2-3}›] 2880  repair needed (‹range unavailable›), enqueuing

... gossip established, node liveness established, default span config returned for all ranges

I230310 22:07:04.288304 13768 13@kv/kvserver/allocator/allocatorimpl/allocator.go:1034 ⋮ [T1,n3,replicate,s3,r55/2:‹/Table/5{4-5}›] 2881  ‹remove voter› - need=3, have=9, priority=799.00
I230310 22:07:04.288355 13768 13@kv/kvserver/replicate_queue.go:626 ⋮ [T1,n3,replicate,s3,r55/2:‹/Table/5{4-5}›] 2882  repair needed (‹remove voter›), enqueuing

... span config watcher starts up, correct span config returned for all ranges

I230310 22:07:04.374981 13871 13@kv/kvserver/replicate_queue.go:664 ⋮ [T1,n3,replicate,s3,r25/12:‹/Table/2{3-4}›] 2889  no rebalance target found, not enqueuing

Cockroach Logs - these show when gossip, liveness and eventually span configs are established:

I230310 22:07:04.061346 13344 gossip/client.go:124 ⋮ [T1,n3] 127  started gossip client to n0 (‹35.229.107.180:26257›)
I230310 22:07:04.062136 13344 gossip/client.go:129 ⋮ [T1,n3] 128  closing client to n1 (‹35.229.107.180:26257›): stopping outgoing client to n1 (‹35.229.107.180:26257›); already have incoming
I230310 22:07:04.062195 124 1@gossip/gossip.go:1420 ⋮ [T1,n3] 129  node has connected to cluster via gossip
I230310 22:07:04.062779 124 kv/kvserver/stores.go:282 ⋮ [T1,n3] 130  wrote 4 node addresses to persistent storage
I230310 22:07:04.072183 13358 kv/kvserver/liveness/liveness.go:1224 ⋮ [T1,n3,s3,r3/8:‹/System/{NodeLive…-tsd}›] 131  incremented n4 liveness epoch to 3
I230310 22:07:04.324725 13799 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335 ⋮ [T1,n3] 132  spanconfig-subscriber: established range feed cache
I230310 22:07:04.371446 36 1@util/log/event_log.go:32 ⋮ [T1,n3] 133 ={"Timestamp":1678486024371429052,"EventType":"node_restart","NodeID":3,"StartedAt":1678486016052306715,"LastUp":1678486006479237816}
I230310 22:07:04.372991 13824 kv/kvclient/rangefeed/rangefeedcache/watcher.go:335 ⋮ [T1,n3] 134  settings-watcher: established range feed cache

A fix for this is to never use the default span config and instead just error out very loudly.

If a store is connected to gossip and liveness but is unable to retrieve a span config for a range, we can't make any decisions for that range.

Using incorrect span configs seems worse than just doing nothing in many cases.
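To make the shape of that fix concrete, here is a minimal Go sketch of the idea, assuming a hypothetical subscriber/SpanConfig pair rather than the real spanconfig and queue APIs: refuse to hand out a config (and skip the range) until the subscription is established, instead of falling back to the 3x default.

```
// Illustrative sketch only: these types are hypothetical stand-ins, not
// CockroachDB's actual spanconfig or queue APIs.
package main

import (
	"errors"
	"fmt"
)

// SpanConfig carries the replication factor the allocator should target.
type SpanConfig struct {
	NumReplicas int
}

// subscriber reports whether the store has an up-to-date view of span configs.
type subscriber struct {
	subscribed bool
}

var errNotSubscribed = errors.New("span configs not yet subscribed; skipping repair decisions")

// configFor returns the span config to use for allocator decisions. Instead of
// silently falling back to the static 3x default, it refuses to answer (and the
// caller skips the range) until the subscription is established.
func configFor(s subscriber, conf SpanConfig) (SpanConfig, error) {
	if !s.subscribed {
		return SpanConfig{}, errNotSubscribed
	}
	return conf, nil
}

func main() {
	nineX := SpanConfig{NumReplicas: 9}

	// Before the subscription is established: no decision, so no voters removed.
	if _, err := configFor(subscriber{subscribed: false}, nineX); err != nil {
		fmt.Println("skipping:", err)
	}

	// Once subscribed: the configured 9x replication factor is honored.
	if conf, err := configFor(subscriber{subscribed: true}, nineX); err == nil {
		fmt.Println("target replicas:", conf.NumReplicas)
	}
}
```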

Another issue is that when every node is restarted, the RPC circuit breakers kick in, causing essentially a 15-second stall.

@kvoli kvoli added GA-blocker and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Mar 10, 2023
@kvoli kvoli added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. and removed GA-blocker labels Mar 10, 2023
@irfansharif irfansharif assigned irfansharif and unassigned kvoli Mar 10, 2023
@irfansharif irfansharif removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 10, 2023
@irfansharif
Contributor

Going to get rid of the release blocker since this bug exists in older releases already.

@kvoli
Collaborator

kvoli commented Mar 10, 2023

Filed #98421

irfansharif added a commit to irfansharif/cockroach that referenced this issue Mar 11, 2023
...subscribed to span configs. Do the same for the store
rebalancer. We applied this treatment for the merge queue back in cockroachdb#78122
since the fallback behavior, if not subscribed, is to use the statically
defined span config for every operation.

- For the replicate queue this meant obtusely applying a replication
  factor of 3, regardless of configuration. This was possible typically
  post node restart before subscription was initially established. We
  saw this in cockroachdb#98385. It was possible then for us to ignore configured
  voter/non-voter/lease constraints.
- For the split queue, we wouldn't actually compute any split keys if
  unsubscribed, so the missing check was somewhat benign. But we would
  be obtusely applying the default range sizes [128MiB,512MiB], so for
  clusters configured with larger range sizes, this could lead to a
  burst of splitting post node-restart.
- For the MVCC GC queue, it would mean applying the statically
  configured default GC TTL and ignoring any set protected timestamps.
  The latter is best-effort protection but could result in internal
  operations relying on protection (like backups, changefeeds) failing
  informatively. For clusters configured with GC TTL greater than the
  default, post node-restart it could lead to a burst of MVCC GC
  activity and AOST queries failing to find expected data.
- For the store rebalancer, ignoring span configs could result in
  violating lease preferences and voter constraints.

Fixes cockroachdb#98421.
Fixes cockroachdb#98385.

Release note (bug fix): It was previously possible for CockroachDB to
not respect non-default zone configs. This only happened for a short
window after nodes with existing replicas were restarted, and
self-rectified within seconds. This manifested in a few ways:
- If num_replicas was set to something other than 3, we would still
  add or remove replicas to get to 3x replication.
  - If num_voters was set explicitly to get a mix of voting and
    non-voting replicas, it would be ignored. CockroachDB could possibly
    remove non-voting replicas.
- If range_min_bytes or range_max_bytes were changed from 128 MiB and
  512 MiB respectively, we would instead try to size ranges to be within
  [128 MiB, 512MiB]. This could appear as an excess amount of range
  splits or merges, as visible in the Replication Dashboard under "Range
  Operations".
- If gc.ttlseconds was set to something other than 90000 seconds, we
  would still GC data only older than 90000s/25h. If the GC TTL was set
  to something larger than 25h, AOST queries going further back may now
  start failing. For GC TTLs less than the 25h default, clusters would
  observe increased disk usage due to more retained garbage.
- If constraints, lease_preferences or voter_constraints were set, they
  would be ignored. Range data and leases would possibly be moved
  outside where prescribed.
This issue only lasted a few seconds post node-restarts, and any zone
config violations were rectified shortly after.
@erikgrinaker erikgrinaker added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-distribution Relating to rebalancing and leasing. labels Mar 11, 2023
@cockroach-teamcity
Member Author

roachtest.replicate/wide failed with artifacts on master @ e4924e2b9be4a36d466beab53a80df9241df4783:

test artifacts and logs in: /artifacts/replicate/wide/run_1
(test_runner.go:990).runTest: test timed out (10m0s)
(cluster.go:1896).Start: ~ COCKROACH_CONNECT_TIMEOUT=0 ./cockroach sql --url 'postgres://root@localhost:26257?sslmode=disable' -e "CREATE SCHEDULE IF NOT EXISTS test_only_backup FOR BACKUP INTO 'gs://cockroachdb-backup-testing/roachprod-scheduled-backups/teamcity-9029456-1678684872-112-n9cpu1/1678723870068004190?AUTH=implicit' RECURRING '*/15 * * * *' FULL BACKUP '@hourly' WITH SCHEDULE OPTIONS first_run = 'now'"
ERROR: server closed the connection.
Is this a CockroachDB node?
unexpected EOF
Failed running "sql": COMMAND_PROBLEM: ssh verbose log retained in ssh_161110.068043100_n1_init-backup-schedule.log: exit status 1

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=1 , ROACHTEST_encrypted=false , ROACHTEST_fs=ext4 , ROACHTEST_localSSD=true , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

This test on roachdash

@nvanbenschoten
Member

Adding the GA-blocker back in for tracking purposes, though it looks like this will be closed shortly anyway.

craig bot pushed a commit that referenced this issue Mar 13, 2023
98261: sql: add crdb_internal views for system statistics tables r=ericharmeling a=ericharmeling

This commit adds two new crdb_internal views:

- crdb_internal.statement_statistics_persisted, which surfaces the system.statement_statistics table
- crdb_internal.transaction_statistics_persisted, which surfaces the system.transaction_statistics table

Example output from after flush:

```
[email protected]:26257/insights> select * from crdb_internal.statement_statistics_persisted limit 3;
-[ RECORD 1 ]
aggregated_ts              | 2023-03-08 23:00:00+00
fingerprint_id             | \x3ab7869b0bc4aa5a
transaction_fingerprint_id | \x95d43bd78dc51d85
plan_hash                  | \x9aec25074eb1e3a0
app_name                   | $ cockroach sql
node_id                    | 1
agg_interval               | 01:00:00
metadata                   | {"db": "insights", "distsql": true, "failed": false, "fullScan": true, "implicitTxn": true, "query": "SELECT * FROM crdb_internal.statement_statistics_persisted", "querySummary": "SELECT * FROM crdb_internal.statement_statis...", "stmtTyp": "TypeDML", "vec": true}
statistics                 | {"execution_statistics": {"cnt": 1, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 24667, "sqDiff": 0}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 2.048E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 0, "sqDiff": 0}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 0, "sqDiff": 0}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 1, "sqDiff": 0}, "seekCountInternal": {"mean": 1, "sqDiff": 0}, "stepCount": {"mean": 0, "sqDiff": 0}, "stepCountInternal": {"mean": 0, "sqDiff": 0}, "valueBytes": {"mean": 0, "sqDiff": 0}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "index_recommendations": [], "statistics": {"bytesRead": {"mean": 0, "sqDiff": 0}, "cnt": 1, "firstAttemptCnt": 1, "idleLat": {"mean": 0, "sqDiff": 0}, "indexes": ["42@1"], "lastErrorCode": "", "lastExecAt": "2023-03-08T23:14:04.614242Z", "latencyInfo": {"max": 0.001212208, "min": 0.001212208, "p50": 0, "p90": 0, "p99": 0}, "maxRetries": 0, "nodes": [1], "numRows": {"mean": 0, "sqDiff": 0}, "ovhLat": {"mean": 0.000007791999999999955, "sqDiff": 0}, "parseLat": {"mean": 0.000016666, "sqDiff": 0}, "planGists": ["AgFUAgD/FwAAAAcYBhg="], "planLat": {"mean": 0.000691666, "sqDiff": 0}, "regions": ["us-east1"], "rowsRead": {"mean": 0, "sqDiff": 0}, "rowsWritten": {"mean": 0, "sqDiff": 0}, "runLat": {"mean": 0.000496084, "sqDiff": 0}, "svcLat": {"mean": 0.001212208, "sqDiff": 0}}}
plan                       | {"Children": [], "Name": ""}
index_recommendations      | {}
indexes_usage              | ["42@1"]
-[ RECORD 2 ]
aggregated_ts              | 2023-03-08 23:00:00+00
fingerprint_id             | \x44c9fdb49be676cf
transaction_fingerprint_id | \xc1efcc0bba0909f8
plan_hash                  | \x780a1ba35976b15d
app_name                   | insights
node_id                    | 1
agg_interval               | 01:00:00
metadata                   | {"db": "insights", "distsql": false, "failed": false, "fullScan": false, "implicitTxn": false, "query": "UPDATE insights_workload_table_0 SET balance = balance + $1 WHERE id = $1", "querySummary": "UPDATE insights_workload_table_0 SET balance = balan... WHERE id = $1", "stmtTyp": "TypeDML", "vec": true}
statistics                 | {"execution_statistics": {"cnt": 28, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 402538.75, "sqDiff": 1160598792985.25}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 4.096E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 31570.321428571428, "sqDiff": 20932497128.107143}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 6.857142857142857, "sqDiff": 435.42857142857133}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 2, "sqDiff": 0}, "seekCountInternal": {"mean": 2, "sqDiff": 0}, "stepCount": {"mean": 0, "sqDiff": 0}, "stepCountInternal": {"mean": 4.857142857142857, "sqDiff": 435.42857142857133}, "valueBytes": {"mean": 360.32142857142856, "sqDiff": 756476.107142857}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "index_recommendations": [], "statistics": {"bytesRead": {"mean": 159.04887361588396, "sqDiff": 3909.7441771668127}, "cnt": 2619, "firstAttemptCnt": 2619, "idleLat": {"mean": 0.021495726165330273, "sqDiff": 36.39774900003032}, "indexes": ["106@1"], "lastErrorCode": "", "lastExecAt": "2023-03-08T23:31:03.079093Z", "latencyInfo": {"max": 1.724660916, "min": 0.0001765, "p50": 0.000757916, "p90": 0.00191375, "p99": 0.004730417}, "maxRetries": 0, "nodes": [1], "numRows": {"mean": 1, "sqDiff": 0}, "ovhLat": {"mean": 0.0000018584035891561339, "sqDiff": 3.132932109484058E-7}, "parseLat": {"mean": 0, "sqDiff": 0}, "planGists": ["AgHUAQIADwIAAAcKBQoh1AEAAA=="], "planLat": {"mean": 0.0002562748900343638, "sqDiff": 0.0002118085353898781}, "regions": ["us-east1"], "rowsRead": {"mean": 1, "sqDiff": 0}, "rowsWritten": {"mean": 1, "sqDiff": 0}, "runLat": {"mean": 0.0024048477613592997, "sqDiff": 4.850230671161608}, "svcLat": {"mean": 0.0026629810549828195, "sqDiff": 4.852464499918359}}}
plan                       | {"Children": [], "Name": ""}
index_recommendations      | {}
indexes_usage              | ["106@1"]
-[ RECORD 3 ]
aggregated_ts              | 2023-03-08 23:00:00+00
fingerprint_id             | \x54202c7b75a5ba87
transaction_fingerprint_id | \x0000000000000000
plan_hash                  | \xbee0e52ec8c08bdd
app_name                   | $$ $ cockroach demo
node_id                    | 1
agg_interval               | 01:00:00
metadata                   | {"db": "insights", "distsql": false, "failed": false, "fullScan": false, "implicitTxn": false, "query": "INSERT INTO system.jobs(id, created, status, payload, progress, claim_session_id, claim_instance_id, job_type) VALUES ($1, $1, __more1_10__)", "querySummary": "INSERT INTO system.jobs(id, created, st...)", "stmtTyp": "TypeDML", "vec": true}
statistics                 | {"execution_statistics": {"cnt": 1, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 300625, "sqDiff": 0}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 1.024E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 0, "sqDiff": 0}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 0, "sqDiff": 0}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 0, "sqDiff": 0}, "seekCountInternal": {"mean": 0, "sqDiff": 0}, "stepCount": {"mean": 0, "sqDiff": 0}, "stepCountInternal": {"mean": 0, "sqDiff": 0}, "valueBytes": {"mean": 0, "sqDiff": 0}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "index_recommendations": [], "statistics": {"bytesRead": {"mean": 0, "sqDiff": 0}, "cnt": 1, "firstAttemptCnt": 1, "idleLat": {"mean": 9223372036.854776, "sqDiff": 0}, "indexes": [], "lastErrorCode": "", "lastExecAt": "2023-03-08T23:13:25.132671Z", "latencyInfo": {"max": 0.000589375, "min": 0.000589375, "p50": 0, "p90": 0, "p99": 0}, "maxRetries": 0, "nodes": [1], "numRows": {"mean": 1, "sqDiff": 0}, "ovhLat": {"mean": 0.0000016249999999999988, "sqDiff": 0}, "parseLat": {"mean": 0, "sqDiff": 0}, "planGists": ["AiACHgA="], "planLat": {"mean": 0.000203792, "sqDiff": 0}, "regions": ["us-east1"], "rowsRead": {"mean": 0, "sqDiff": 0}, "rowsWritten": {"mean": 1, "sqDiff": 0}, "runLat": {"mean": 0.000383958, "sqDiff": 0}, "svcLat": {"mean": 0.000589375, "sqDiff": 0}}}
plan                       | {"Children": [], "Name": ""}
index_recommendations      | {}
indexes_usage              | []

Time: 4ms total (execution 3ms / network 1ms)

[email protected]:26257/insights> select * from crdb_internal.transaction_statistics_persisted limit 3;
-[ RECORD 1 ]
aggregated_ts  | 2023-03-08 23:00:00+00
fingerprint_id | \x17d80cf128571d63
app_name       | $ internal-migration-job-mark-job-succeeded
node_id        | 1
agg_interval   | 01:00:00
metadata       | {"stmtFingerprintIDs": ["b8bbb1bdae56aabc"]}
statistics     | {"execution_statistics": {"cnt": 1, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 64709, "sqDiff": 0}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 1.024E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 0, "sqDiff": 0}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 0, "sqDiff": 0}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 0, "sqDiff": 0}, "seekCountInternal": {"mean": 0, "sqDiff": 0}, "stepCount": {"mean": 0, "sqDiff": 0}, "stepCountInternal": {"mean": 0, "sqDiff": 0}, "valueBytes": {"mean": 0, "sqDiff": 0}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "statistics": {"bytesRead": {"mean": 0, "sqDiff": 0}, "cnt": 6, "commitLat": {"mean": 0, "sqDiff": 0}, "idleLat": {"mean": 0, "sqDiff": 0}, "maxRetries": 0, "numRows": {"mean": 1, "sqDiff": 0}, "retryLat": {"mean": 0, "sqDiff": 0}, "rowsRead": {"mean": 0, "sqDiff": 0}, "rowsWritten": {"mean": 1, "sqDiff": 0}, "svcLat": {"mean": 0.00026919450000000006, "sqDiff": 1.7615729685500012E-8}}}
-[ RECORD 2 ]
aggregated_ts  | 2023-03-08 23:00:00+00
fingerprint_id | \x2b024f7e2567a238
app_name       | $ internal-get-job-row
node_id        | 1
agg_interval   | 01:00:00
metadata       | {"stmtFingerprintIDs": ["8461f232a36615e7"]}
statistics     | {"execution_statistics": {"cnt": 1, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 50835, "sqDiff": 0}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 3.072E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 0, "sqDiff": 0}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 3, "sqDiff": 0}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 1, "sqDiff": 0}, "seekCountInternal": {"mean": 1, "sqDiff": 0}, "stepCount": {"mean": 3, "sqDiff": 0}, "stepCountInternal": {"mean": 3, "sqDiff": 0}, "valueBytes": {"mean": 186, "sqDiff": 0}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "statistics": {"bytesRead": {"mean": 284.81818181818176, "sqDiff": 3465.636363636355}, "cnt": 11, "commitLat": {"mean": 0.000003469727272727273, "sqDiff": 4.946789218181818E-11}, "idleLat": {"mean": 0, "sqDiff": 0}, "maxRetries": 0, "numRows": {"mean": 1, "sqDiff": 0}, "retryLat": {"mean": 0, "sqDiff": 0}, "rowsRead": {"mean": 1, "sqDiff": 0}, "rowsWritten": {"mean": 0, "sqDiff": 0}, "svcLat": {"mean": 0.0006771060909090909, "sqDiff": 8.91510436082909E-7}}}
-[ RECORD 3 ]
aggregated_ts  | 2023-03-08 23:00:00+00
fingerprint_id | \x37e130a1df20d299
app_name       | $ internal-create-stats
node_id        | 1
agg_interval   | 01:00:00
metadata       | {"stmtFingerprintIDs": ["98828ded59216546"]}
statistics     | {"execution_statistics": {"cnt": 1, "contentionTime": {"mean": 0, "sqDiff": 0}, "cpuSQLNanos": {"mean": 11875, "sqDiff": 0}, "maxDiskUsage": {"mean": 0, "sqDiff": 0}, "maxMemUsage": {"mean": 1.024E+4, "sqDiff": 0}, "mvccIteratorStats": {"blockBytes": {"mean": 0, "sqDiff": 0}, "blockBytesInCache": {"mean": 0, "sqDiff": 0}, "keyBytes": {"mean": 0, "sqDiff": 0}, "pointCount": {"mean": 0, "sqDiff": 0}, "pointsCoveredByRangeTombstones": {"mean": 0, "sqDiff": 0}, "rangeKeyContainedPoints": {"mean": 0, "sqDiff": 0}, "rangeKeyCount": {"mean": 0, "sqDiff": 0}, "rangeKeySkippedPoints": {"mean": 0, "sqDiff": 0}, "seekCount": {"mean": 0, "sqDiff": 0}, "seekCountInternal": {"mean": 0, "sqDiff": 0}, "stepCount": {"mean": 0, "sqDiff": 0}, "stepCountInternal": {"mean": 0, "sqDiff": 0}, "valueBytes": {"mean": 0, "sqDiff": 0}}, "networkBytes": {"mean": 0, "sqDiff": 0}, "networkMsgs": {"mean": 0, "sqDiff": 0}}, "statistics": {"bytesRead": {"mean": 0, "sqDiff": 0}, "cnt": 1, "commitLat": {"mean": 0.000002291, "sqDiff": 0}, "idleLat": {"mean": 0, "sqDiff": 0}, "maxRetries": 0, "numRows": {"mean": 0, "sqDiff": 0}, "retryLat": {"mean": 0, "sqDiff": 0}, "rowsRead": {"mean": 0, "sqDiff": 0}, "rowsWritten": {"mean": 0, "sqDiff": 0}, "svcLat": {"mean": 0.008680208, "sqDiff": 0}}}

Time: 3ms total (execution 2ms / network 1ms)
```

Epic: none

Release note: Added two views to the crdb_internal catalog: crdb_internal.statement_statistics_persisted, which surfaces data in the persisted system.statement_statistics table, and crdb_internal.transaction_statistics_persisted, which surfaces the system.transaction_statistics table.

98422: kvserver: disable {split,replicate,mvccGC} queues until... r=irfansharif a=irfansharif

...subscribed to span configs. Do the same for the store rebalancer. We applied this treatment for the merge queue back in #78122 since the fallback behavior, if not subscribed, is to use the statically defined span config for every operation.

- For the replicate queue this meant obtusely applying a replication factor of 3, regardless of configuration. This was possible typically post node restart before subscription was initially established. We saw this in #98385. It was possible then for us to ignore configured voter/non-voter/lease constraints.
- For the split queue, we wouldn't actually compute any split keys if unsubscribed, so the missing check was somewhat benign. But we would be obtusely applying the default range sizes [128MiB,512MiB], so for clusters configured with larger range sizes, this could lead to a burst of splitting post node-restart.
- For the MVCC GC queue, it would mean applying the statically configured default GC TTL and ignoring any set protected timestamps. The latter is best-effort protection but could result in internal operations relying on protection (like backups, changefeeds) failing informatively. For clusters configured with GC TTL greater than the default, post node-restart it could lead to a burst of MVCC GC activity and AOST queries failing to find expected data.
- For the store rebalancer, ignoring span configs could result in violating lease preferences and voter constraints.

Fixes #98421.
Fixes #98385.

Release note (bug fix): It was previously possible for CockroachDB to not respect non-default zone configs. This only happened for a short window after nodes with existing replicas were restarted, and self-rectified within seconds. This manifested in a few ways:
- If num_replicas was set to something other than 3, we would still add or remove replicas to get to 3x replication.
  - If num_voters was set explicitly to get a mix of voting and non-voting replicas, it would be ignored. CockroachDB could possibly remove non-voting replicas.
- If range_min_bytes or range_max_bytes were changed from 128 MiB and 512 MiB respectively, we would instead try to size ranges to be within [128 MiB, 512MiB]. This could appear as an excess amount of range splits or merges, as visible in the Replication Dashboard under "Range Operations".
- If gc.ttlseconds was set to something other than 90000 seconds, we would still GC data only older than 90000s/25h. If the GC TTL was set to something larger than 25h, AOST queries going further back may now start failing. For GC TTLs less than the 25h default, clusters would observe increased disk usage due to more retained garbage.
- If constraints, lease_preferences or voter_constraints were set, they would be ignored. Range data and leases would possibly be moved outside where prescribed. This issue only lasted a few seconds post node-restarts, and any zone config violations were rectified shortly after.

98468: sql: add closest-instance physical planning r=dt a=dt

This changes physical planning, specifically how the SQL instance for a
given KV node ID is resolved, to be more generalized w.r.t. different
locality tier taxonomies.

Previously this function had a special case that checked for, and only
for, a specific locality tier with the key "region" and if it was
found, picked a random instance from the subset of instances where their
value for that matched the value for the KV node.

Matching on and only on the "region" tier is both too specific and not
specific enough: it is "too specific" in that it requires a tier with
the key "region" to be used and to match, and is "not specific enough"
in that it simultaneously ignores more specific locality tiers that
would indicate closer matches (e.g. subregion, AZ, data-center or rack).

Instead, this change generalizes this function to identify the subset of
instances that have the "closest match" in localities to the KV node and
pick one of them, where closest match is defined as the longest matching
prefix of locality tiers. In a simple, single-tier locality taxonomy
using the key "region" this should yield the same behavior as the
previous implementation, as all instances with a matching "region" will
have the same longest matching prefix (at length 1), however this more
general approach should better handle other locality taxonomies that use
more tiers and/or tiers with names other than "region".

Currently this change only applies to physical planning for secondary
tenants until physical planning is unified for system and secondary
tenants.
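As a rough illustration of the longest-matching-prefix selection described above, here is a self-contained Go sketch; the Tier, Locality, and Instance types are hypothetical stand-ins, not the actual physical-planning code.

```
// Illustrative only: pick SQL instances whose locality shares the longest
// prefix of tiers with the KV node's locality.
package main

import "fmt"

// Tier is a single locality tier, e.g. {Key: "region", Value: "us-east1"}.
type Tier struct {
	Key, Value string
}

type Locality []Tier

// matchLen returns the number of leading tiers on which a and b agree.
func matchLen(a, b Locality) int {
	n := 0
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			break
		}
		n++
	}
	return n
}

type Instance struct {
	ID       int
	Locality Locality
}

// closestInstances returns the subset of instances with the longest matching
// locality prefix relative to the KV node; the caller can then pick one of
// them at random. With no match anywhere, every instance is a candidate.
func closestInstances(instances []Instance, node Locality) []Instance {
	best, bestLen := []Instance(nil), 0
	for _, inst := range instances {
		switch l := matchLen(inst.Locality, node); {
		case l > bestLen:
			best, bestLen = []Instance{inst}, l
		case l == bestLen:
			best = append(best, inst)
		}
	}
	return best
}

func main() {
	node := Locality{{"region", "us-east1"}, {"az", "us-east1-b"}}
	instances := []Instance{
		{1, Locality{{"region", "us-east1"}, {"az", "us-east1-a"}}},
		{2, Locality{{"region", "us-east1"}, {"az", "us-east1-b"}}},
		{3, Locality{{"region", "us-west1"}, {"az", "us-west1-a"}}},
	}
	// Instance 2 wins: it matches both the region and the az tier.
	fmt.Println(closestInstances(instances, node))
}
```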

Release note: none.
Epic: CRDB-16910


98471: changefeedccl: fix kafka messagetoolarge test failure r=samiskin a=samiskin

Fixes: #93847

This fixes the following bug in the TestChangefeedKafkaMessageTooLarge test setup:
1. The feed starts sending messages, randomly triggering a MessageTooLarge error causing a retry with a smaller batch size
2. Eventually, while the retrying process is still ongoing, all 2000 rows are successfully received by the mock kafka sink, causing assertPayloads to complete, which causes the test to closeFeed and run CANCEL on the changefeed.
3. The retrying process gets stuck in sendMessage: it can't send the message to feedCh, which has been closed since the changefeed is trying to close, but it also can't exit on the mock sink's tg.done, since that only closes after the feed fully closes, which in turn requires the retrying process to end.
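For readers unfamiliar with this failure shape, below is a minimal Go sketch of the circular wait (and the way out of it), using hypothetical feedCh/done names rather than the real test harness code.

```
// Illustrative only: a retrier blocked on a send nobody drains, while the
// shutdown signal it listens on cannot fire until the retrier itself exits.
package main

import (
	"fmt"
	"time"
)

// sendMessage tries to deliver a message, but also listens on done so it can
// bail out when the sink is shutting down. If closing done depends on this
// function returning, neither case can ever fire and the goroutine leaks.
func sendMessage(feedCh chan<- string, done <-chan struct{}, msg string) error {
	select {
	case feedCh <- msg: // blocks forever if nobody is draining feedCh anymore
		return nil
	case <-done:
		return fmt.Errorf("sink closed while retrying %q", msg)
	}
}

func main() {
	feedCh := make(chan string) // unbuffered: a send needs an active reader
	done := make(chan struct{})

	errCh := make(chan error, 1)
	go func() { errCh <- sendMessage(feedCh, done, "retry-batch") }()

	// Breaking the cycle means shutdown must be able to close done (or
	// otherwise abandon the send) without waiting on the retrier.
	close(done)

	select {
	case err := <-errCh:
		fmt.Println("retrier exited:", err)
	case <-time.After(time.Second):
		fmt.Println("deadlock: retrier never exited")
	}
}
```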

Release note: None

Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
@craig craig bot closed this as completed in 234bcd0 Mar 13, 2023
irfansharif added a commit to irfansharif/cockroach that referenced this issue Mar 16, 2023
...subscribed to span configs. Do the same for the store
rebalancer. We applied this treatment for the merge queue back in cockroachdb#78122
since the fallback behavior, if not subscribed, is to use the statically
defined span config for every operation.

- For the replicate queue this meant obtusely applying a replication
  factor of 3, regardless of configuration. This was possible typically
  post node restart before subscription was initially established. We
  saw this in cockroachdb#98385. It was possible then for us to ignore configured
  voter/non-voter/lease constraints.
- For the split queue, we wouldn't actually compute any split keys if
  unsubscribed, so the missing check was somewhat benign. But we would
  be obtusely applying the default range sizes [128MiB,512MiB], so for
  clusters configured with larger range sizes, this could lead to a
  burst of splitting post node-restart.
- For the MVCC GC queue, it would mean applying the statically
  configured default GC TTL and ignoring any set protected timestamps.
  The latter is best-effort protection but could result in internal
  operations relying on protection (like backups, changefeeds) failing
  informatively. For clusters configured with GC TTL greater than the
  default, post node-restart it could lead to a burst of MVCC GC
  activity and AOST queries failing to find expected data.
- For the store rebalancer, ignoring span configs could result in
  violating lease preferences and voter/non-voter constraints.

Fixes cockroachdb#98421.
Fixes cockroachdb#98385.

While here, we also introduce the following non-public cluster settings
to selectively enable/disable KV queues:
- kv.mvcc_gc_queue.enabled
- kv.split_queue.enabled
- kv.replicate_queue.enabled

Release note (bug fix): It was previously possible for CockroachDB to
not respect non-default zone configs. This could only happen for a short
window after nodes with existing replicas were restarted (measured in
seconds), and self-rectified (also within seconds). This manifested in a
few ways:
- If num_replicas was set to something other than 3, we would still
  add or remove replicas to get to 3x replication.
  - If num_voters was set explicitly to get a mix of voting and
    non-voting replicas, it would be ignored. CockroachDB could possibly
    remove non-voting replicas.
- If range_min_bytes or range_max_bytes were changed from 128 MiB and
  512 MiB respectively, we would instead try to size ranges to be within
  [128 MiB, 512MiB]. This could appear as an excess amount of range
  splits or merges, as visible in the Replication Dashboard under "Range
  Operations".
- If gc.ttlseconds was set to something other than 90000 seconds, we
  would still GC data only older than 90000s/25h. If the GC TTL was set
  to something larger than 25h, AOST queries going further back may now
  start failing. For GC TTLs less than the 25h default, clusters would
  observe increased disk usage due to more retained garbage.
- If constraints, lease_preferences or voter_constraints were set, they
  would be ignored. Range data and leases would possibly be moved
  outside where prescribed.
This issue lasted a few seconds post node-restarts, and any zone config
violations were rectified shortly after.