kv/kvnemesis: TestKVNemesisMultiNode failed #101721

cockroach-teamcity · 2023-04-18T10:47:25Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 1cd507a7c6582fd91bb56123e62bd4fde45c4d22:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1865204116
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 7605291541804074861
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis2082477957
    kvnemesis.go:165: range 65 (/Table/100/"00532edf314db371") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1681814839621095004 IntentAge:0 GCBytesAge:13445 LiveBytes:212 LiveCount:4 KeyBytes:435 KeyCount:11 ValBytes:290 ValCount:17 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:1 RangeKeyBytes:51 RangeValCount:1 RangeValBytes:10 SysBytes:10401 SysCount:20 AbortSpanBytes:1199}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1681814839621095004 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:-6 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:165: range 78 (/Table/100/"0d1b2742d6b6062f") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1681814839812403396 IntentAge:0 GCBytesAge:41393 LiveBytes:53 LiveCount:1 KeyBytes:750 KeyCount:14 ValBytes:583 ValCount:38 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:1 RangeKeyBytes:51 RangeValCount:1 RangeValBytes:10 SysBytes:2488 SysCount:21 AbortSpanBytes:1199}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1681814839812403396 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:6 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis2082477957/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis2082477957/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis2082477957/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis2082477957/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 2
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1865204116
--- FAIL: TestKVNemesisMultiNode (70.36s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

/cc @cockroachdb/kv _{This test on roachdash | Improve this report!

Jira issue: CRDB-27104
Epic CRDB-27234}

The text was updated successfully, but these errors were encountered:

erikgrinaker · 2023-04-18T10:59:21Z

This is reminiscent of #93896, where we saw SysBytes discrepancies caused by range merges racing with lease requests (fixed in #99017). However, in this case there is a symmetric 6-byte discrepancy between two ranges -- does this indicate a similar issue during range splits, where SysBytes is mis-attributed to the wrong range after the split?

cockroach-teamcity · 2023-04-19T10:18:09Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ a4ab6b6243f509e62993b4d81855f23470d9901a:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1884782417
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 7148618387478836565
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1659314035
    kvnemesis.go:165: range 75 (/Table/100/"bb01c544f45f5f14") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1681899478668119526 IntentAge:0 GCBytesAge:73101 LiveBytes:53 LiveCount:1 KeyBytes:693 KeyCount:13 ValBytes:559 ValCount:35 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:3 RangeKeyBytes:171 RangeValCount:5 RangeValBytes:51 SysBytes:4683 SysCount:18 AbortSpanBytes:941}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1681899478668119526 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:6 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1659314035/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1659314035/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1659314035/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1659314035/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 1
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1884782417
--- FAIL: TestKVNemesisMultiNode (109.79s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

erikgrinaker · 2023-04-19T10:23:14Z

Both of these failures are with deadlock and kv.expiration_leases_only.enabled = true. This probably tickles the stats race because leases are more likely to expire, so the merge/split requests will trigger a concurrent lease acquisition.

blathers-crl · 2023-04-19T10:23:25Z

cc @cockroachdb/replication

cockroach-teamcity · 2023-04-22T10:03:08Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 10665f9856d9c7ca51557bdf5b48fe9df796228b:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1227749579
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 7800460224026642564
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3322945106
    kvnemesis.go:165: range 73 (/Table/100/"b155b47704015b5a") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1682157779371627510 IntentAge:0 GCBytesAge:76056 LiveBytes:53 LiveCount:1 KeyBytes:1173 KeyCount:25 ValBytes:839 ValCount:54 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:4 RangeKeyBytes:222 RangeValCount:6 RangeValBytes:62 SysBytes:3593 SysCount:14 AbortSpanBytes:598}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1682157779371627510 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:18 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:165: range 106 (/Table/100/"557a2358987d8669") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1682157779926807589 IntentAge:0 GCBytesAge:46609 LiveBytes:53 LiveCount:1 KeyBytes:519 KeyCount:11 ValBytes:384 ValCount:24 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:2 RangeKeyBytes:137 RangeValCount:5 RangeValBytes:51 SysBytes:1076 SysCount:14 AbortSpanBytes:683}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1682157779926807589 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:-18 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3322945106/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3322945106/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3322945106/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3322945106/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 2
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1227749579
--- FAIL: TestKVNemesisMultiNode (119.19s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-04-23T11:17:54Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 1f3419e178bdba544f74d9c9e14a4682efd18028:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1377543860
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 1024415409345674575
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis752674863
    kvnemesis.go:165: range 115 (/Table/100/"4f372447fddf9089") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1682248667918326218 IntentAge:0 GCBytesAge:39780 LiveBytes:0 LiveCount:0 KeyBytes:624 KeyCount:12 ValBytes:490 ValCount:31 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:1 RangeKeyBytes:51 RangeValCount:1 RangeValBytes:10 SysBytes:3401 SysCount:21 AbortSpanBytes:1293}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1682248667918326218 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:-6 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis752674863/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis752674863/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis752674863/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis752674863/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 1
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode1377543860
--- FAIL: TestKVNemesisMultiNode (96.03s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-04-24T09:50:10Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 1f3419e178bdba544f74d9c9e14a4682efd18028:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode4065196892
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 7646607177314613521
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3681986409
    kvnemesis.go:165: range 147 (/Table/100/"7bc4d1e4075c7520") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1682329766680715445 IntentAge:0 GCBytesAge:34428 LiveBytes:0 LiveCount:0 KeyBytes:285 KeyCount:5 ValBytes:243 ValCount:15 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:1169 SysCount:10 AbortSpanBytes:420}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1682329766680715445 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:6 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3681986409/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3681986409/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3681986409/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3681986409/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 1
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode4065196892
--- FAIL: TestKVNemesisMultiNode (172.17s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-04-26T09:59:44Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 9fb03ee9b365d656cc26728cc87f682a385d511b:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2090673101
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 753312517745099591
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1232282953
    kvnemesis.go:165: error applying x.AdminMerge(ctx, tk(6314721792752128086)) // kv/kvserver/replica_command.go:808: merge failed: waiting for all right-hand replicas to initialize: operation "wait for replicas init" timed out after 5.002s (given timeout 5s): grpc: context deadline exceeded [code 4/DeadlineExceeded]: kv/kvserver/replica_command.go:808: merge failed: waiting for all right-hand replicas to initialize: operation "wait for replicas init" timed out after 5.002s (given timeout 5s): grpc: context deadline exceeded [code 4/DeadlineExceeded]
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1232282953/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1232282953/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1232282953/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1232282953/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 1
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2090673101
--- FAIL: TestKVNemesisMultiNode (19.84s)

Parameters: TAGS=bazel,gss

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-05-03T10:38:11Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 9e0b06bf61700d5192f171c6ebaec7e8a10fa6e6:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2226663341
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 8827703835256234507
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis190548753
    kvnemesis.go:165: range 66 (/Table/100/"43181f1ddecdf899") RANGE_CONSISTENT_STATS_INCORRECT:
        stats: {ContainsEstimates:0 LastUpdateNanos:1683110290381382106 IntentAge:0 GCBytesAge:12442 LiveBytes:0 LiveCount:0 KeyBytes:1281 KeyCount:25 ValBytes:975 ValCount:63 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:2 RangeKeyBytes:115 RangeValCount:3 RangeValBytes:31 SysBytes:4629 SysCount:24 AbortSpanBytes:1532}
        delta (stats-computed): {ContainsEstimates:0 LastUpdateNanos:1683110290381382106 IntentAge:0 GCBytesAge:0 LiveBytes:0 LiveCount:0 KeyBytes:0 KeyCount:0 ValBytes:0 ValCount:0 IntentBytes:0 IntentCount:0 SeparatedIntentCount:0 RangeKeyCount:0 RangeKeyBytes:0 RangeValCount:0 RangeValBytes:0 SysBytes:-10 SysCount:0 AbortSpanBytes:0}
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis190548753/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis190548753/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis190548753/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis190548753/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 1
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2226663341
--- FAIL: TestKVNemesisMultiNode (16.69s)

Parameters: TAGS=bazel,gss

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #102411 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv X-duplicate branch-release-23.1.0]
kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-05-17T08:04:48Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 69875da00208dd792cb92aee54811e8a660773a9:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2895614279
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 2840821478841235807
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1782552098
    kvnemesis.go:165: error applying x.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
          txn.SetIsoLevel(isolation.Snapshot)
          txn.ReverseScan(ctx, tk(2472807650999776716), tk(15928832316702100637), 0) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=1 ts=1684310680.015681917,0 min=1684310679.797745821,0 seq=0} lock=true stat=STAGING rts=1684310680.015681917,0 wto=false gul=1684310679.807745821,0; heartbeat observed status: PENDING
          txn.Get(ctx, tk(9210135055451966804)) // omitted
          txn.Put(ctx, tk(634611904550602438), sv(495)) // omitted
          b := &kv.Batch{}
          b.Put(tk(3897933467779442998), sv(496)) // TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=0 ts=1684310679.905093068,1 min=1684310679.797745821,0 seq=3} lock=true stat=STAGING rts=1684310679.807745821,0 wto=false gul=1684310679.807745821,0 ifw=2
          txn.CommitInBatch(ctx, b) // TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=0 ts=1684310679.905093068,1 min=1684310679.797745821,0 seq=3} lock=true stat=STAGING rts=1684310679.807745821,0 wto=false gul=1684310679.807745821,0 ifw=2
          return nil
        }) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=1 ts=1684310680.015681917,0 min=1684310679.797745821,0 seq=0} lock=true stat=STAGING rts=1684310680.015681917,0 wto=false gul=1684310679.807745821,0; heartbeat observed status: PENDING: kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=1 ts=1684310680.015681917,0 min=1684310679.797745821,0 seq=0} lock=true stat=STAGING rts=1684310680.015681917,0 wto=false gul=1684310679.807745821,0; heartbeat observed status: PENDING
    kvnemesis.go:165: error applying x.ReverseScan(ctx, tk(2472807650999776716), tk(15928832316702100637), 0) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=1 ts=1684310680.015681917,0 min=1684310679.797745821,0 seq=0} lock=true stat=STAGING rts=1684310680.015681917,0 wto=false gul=1684310679.807745821,0; heartbeat observed status: PENDING: kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=40ecf13e key=/Table/100/"08ce982a34f906c6" iso=Snapshot pri=0.02441066 epo=1 ts=1684310680.015681917,0 min=1684310679.797745821,0 seq=0} lock=true stat=STAGING rts=1684310680.015681917,0 wto=false gul=1684310679.807745821,0; heartbeat observed status: PENDING
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1782552098/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1782552098/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1782552098/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1782552098/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 2
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2895614279
--- FAIL: TestKVNemesisMultiNode (28.67s)

Parameters: TAGS=bazel,gss

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #102411 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv X-duplicate todo-deprecate.branch-release-23.1.0]
kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2 no-test-failure-activity]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-bug C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2023-05-17T08:07:53Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ 69875da00208dd792cb92aee54811e8a660773a9:

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode3901150746
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 3837436284393373776
    kvnemesis_test.go:124: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3813137540
    kvnemesis.go:165: error applying x.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
          txn.SetIsoLevel(isolation.Snapshot)
          {
            b := &kv.Batch{}
            txn.Run(ctx, b) // <nil>
          }
          txn.Scan(ctx, tk(1698754368228448779), tk(12972774191588256712), 0) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=1 ts=1684310860.032580778,0 min=1684310859.418820247,0 seq=0} lock=true stat=STAGING rts=1684310860.032580778,0 wto=false gul=1684310859.428820247,0; heartbeat observed status: PENDING
          txn.Scan(ctx, tk(8721585885451996246), tk(11377696540311594206), 0) // omitted
          b := &kv.Batch{}
          b.Del(tk(17658178829351018258) /* @s545 */) // TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=0 ts=1684310859.578805914,1 min=1684310859.418820247,0 seq=3} lock=true stat=STAGING rts=1684310859.418820247,0 wto=false gul=1684310859.428820247,0 int=1 ifw=1
          b.Del(tk(9373485187807277138) /* @s546 */) // TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=0 ts=1684310859.578805914,1 min=1684310859.418820247,0 seq=3} lock=true stat=STAGING rts=1684310859.418820247,0 wto=false gul=1684310859.428820247,0 int=1 ifw=1
          txn.CommitInBatch(ctx, b) // TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE): "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=0 ts=1684310859.578805914,1 min=1684310859.418820247,0 seq=3} lock=true stat=STAGING rts=1684310859.418820247,0 wto=false gul=1684310859.428820247,0 int=1 ifw=1
          return nil
        }) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=1 ts=1684310860.032580778,0 min=1684310859.418820247,0 seq=0} lock=true stat=STAGING rts=1684310860.032580778,0 wto=false gul=1684310859.428820247,0; heartbeat observed status: PENDING: kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=1 ts=1684310860.032580778,0 min=1684310859.418820247,0 seq=0} lock=true stat=STAGING rts=1684310860.032580778,0 wto=false gul=1684310859.428820247,0; heartbeat observed status: PENDING
    kvnemesis.go:165: error applying x.Scan(ctx, tk(1698754368228448779), tk(12972774191588256712), 0) // kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=1 ts=1684310860.032580778,0 min=1684310859.418820247,0 seq=0} lock=true stat=STAGING rts=1684310860.032580778,0 wto=false gul=1684310859.428820247,0; heartbeat observed status: PENDING: kv/kvclient/kvcoord/txn_coord_sender.go:754: unexpected txn state: "unnamed" meta={id=32117260 key=/Table/100/"f50e741697ca5f12" iso=Snapshot pri=0.01451727 epo=1 ts=1684310860.032580778,0 min=1684310859.418820247,0 seq=0} lock=true stat=STAGING rts=1684310860.032580778,0 wto=false gul=1684310859.428820247,0; heartbeat observed status: PENDING
    kvnemesis.go:185: failures(verbose): /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3813137540/failures
        repro steps: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3813137540/repro.go
        rangefeed KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3813137540/kvs-rangefeed.txt
        scan KVs: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis3813137540/kvs-scan.txt
    kvnemesis_test.go:207: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
        	Error:      	Should be zero, but was 2
        	Test:       	TestKVNemesisMultiNode
        	Messages:   	kvnemesis detected failures
    panic.go:522: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode3901150746
--- FAIL: TestKVNemesisMultiNode (104.84s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

Same failure on other branches

kv/kvnemesis: TestKVNemesisMultiNode failed #102411 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv X-duplicate todo-deprecate.branch-release-23.1.0]
kv/kvnemesis: TestKVNemesisMultiNode failed #101048 kv/kvnemesis: TestKVNemesisMultiNode failed [C-test-failure O-robot T-kv branch-release-22.2 no-test-failure-activity]
kv/kvnemesis: TestKVNemesisMultiNode failed #100417 kv/kvnemesis: TestKVNemesisMultiNode failed [C-bug C-test-failure O-robot T-kv branch-release-23.1]

_{This test on roachdash | Improve this report!}

aliher1911 · 2023-05-22T12:17:46Z

My current hypothesis is that the discrepancy comes from the fact that lease extensions doesn't obtain latches and could could change lease and mvcc stats in parallel to other requests. The only requests that touches lease are likely range split and range merge which are done as a commit trigger when executing EndTxnRequest. There was a fix for merge op already improving that and I think current failure is happening during splits.

Current repro of the failure shows discrepancy of 6 bytes on SystemBytes stat. This always happen when there's a lease extension happening in parallel to split. This change itself is mostly harmless as size of lease value shouldn't change, but it changes if we have a local component to the timestamp. Leases that have local component on commit and expiration timestamps are 6 bytes larger than without and that matches the size difference. Moreover failure always happens when lease extension happening close to the split changes from using local to local == 0. As an experiment, resetting local component to 0 makes failure go away. We can't use it as a fix as it is not correct and could cause problems.

Current understanding of the cause:

cockroach/pkg/kv/kvserver/batcheval/split_stats_helper.go

Lines 148 to 176 in b34c137

    
           // Scan to compute the stats for the first side. 
        
           var absPostSplitFirst enginepb.MVCCStats 
        
           var err error 
        
           if h.in.ScanRightFirst { 
        
           	absPostSplitFirst, err = input.AbsPostSplitRightFn() 
        
           	h.absPostSplitRight = &absPostSplitFirst 
        
           } else { 
        
           	absPostSplitFirst, err = input.AbsPostSplitLeftFn() 
        
           	h.absPostSplitLeft = &absPostSplitFirst 
        
           } 
        
           if err != nil { 
        
           	return splitStatsHelper{}, err 
        
           } 
        
           if h.in.AbsPreSplitBothEstimated.ContainsEstimates == 0 && 
        
           	h.in.DeltaBatchEstimated.ContainsEstimates == 0 { 
        
           	// We have CombinedErrorDelta zero, so use arithmetic to compute the 
        
           	// stats for the second side. 
        
           	ms := h.in.AbsPreSplitBothEstimated 
        
           	ms.Subtract(absPostSplitFirst) 
        
           	ms.Add(h.in.DeltaBatchEstimated) 
        
           	ms.Add(h.in.DeltaRangeKey) 
        
           	if h.in.ScanRightFirst { 
        
           		h.absPostSplitLeft = &ms 
        
           	} else { 
        
           		h.absPostSplitRight = &ms 
        
           	} 
        
           	return h, nil 
        
           }

h.in.AbsPreSplitBothEstimated contains stats from replica obtained by:

cockroach/pkg/kv/kvserver/replica.go

Lines 1302 to 1309 in b34c137

    
           // GetMVCCStats returns a copy of the MVCC stats object for this range. 
        
           // This accessor is thread-safe, but provides no guarantees about its 
        
           // synchronization with any concurrent writes. 
        
           func (r *Replica) GetMVCCStats() enginepb.MVCCStats { 
        
           	r.mu.RLock() 
        
           	defer r.mu.RUnlock() 
        
           	return *r.mu.state.Stats 
        
           }

which is an in-memory most recent state of stats.
We then compute(scan) LHS stats of snapshot and we end up with full stats that are inconsistent with LHS stats. We then compute RHS using full and LHS which refer different points in time that can have lease extension inbetween. That would produce skewed RHS.
We then use absPostSplitLeft in

cockroach/pkg/kv/kvserver/batcheval/split_stats_helper.go

Lines 212 to 220 in b34c137

    
           func (h splitStatsHelper) DeltaPostSplitLeft() enginepb.MVCCStats { 
        
           	// NB: if we ever wanted to also write to the left hand side after init'ing 
        
           	// the helper, we can make that work, too. 
        
           	// NB: note how none of this depends on mutations to absPostSplitRight. 
        
           	ms := *h.absPostSplitLeft 
        
           	ms.Subtract(h.in.AbsPreSplitBothEstimated) 
        
           	return ms 
        
           }

to produce final stats delta that we return as a result of execution. This would contain delta between split time snapshot and newer version of stats from replica.

Previouly split trigger was using in-memory copy of mvccstats from replica. This value could diverge from store snapshot used by request in special cases of lease extensions that don't hold latches. This was causing stats for the replica to diverge. This PR changes base stats to be read from request reader which provides consistent stats value with content of splitted ranges. Touches: cockroachdb#101721 Release note: None

aliher1911 · 2023-05-22T17:37:48Z

@nvanbenschoten can you look at #101721 (comment) if it looks familiar in any way? This issue has bunch of different failures. Top level ones come are for mvcc stats, but transaction commit might be connected with changes you merged recently.

Previouly split trigger was using in-memory copy of mvccstats from replica. This value could diverge from store snapshot used by request in special cases of lease extensions that don't hold latches. This was causing stats for the replica to diverge. This PR changes base stats to be read from request reader which provides consistent stats value with content of splitted ranges. Touches: cockroachdb#101721 Release note: None

103719: kvserver/batcheval: fix mvcc stats computaion on split r=nvanbenschoten a=aliher1911 Previouly split trigger was using in-memory copy of mvccstats from replica. This value could diverge from store snapshot used by request in special cases of lease extensions that don't hold latches. This was causing stats for the replica to diverge. This PR changes base stats to be read from request reader which provides consistent stats value with content of splitted ranges. Touches: #101721 Fixes: #102410 Release note: None Co-authored-by: Oleg Afanasyev <[email protected]>

aliher1911 · 2023-06-06T16:55:05Z

We havn't seen many failures after stats were fixed. I managed to repro transaction push timeout with 50k reruns while testing stats:

--- FAIL: TestKVNemesisMultiNode (13.16s)
    test_log_scope.go:161: test logs captured to: /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode3315994629
    test_log_scope.go:79: use -show-logs to present logs inline
    kvnemesis_test.go:180: seed: 910877016883269571
    kvnemesis_test.go:124: kvnemesis logging to /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1584584814
    kvnemesis.go:165: error applying x.Scan(ctx, tk(4919960733718109041), tk(8091360691588176074), 0) // failed to push meta={id=d01cdc27 key=/Table/100/"480aacd5586decfa" iso=Snapshot pri=0.06480878 epo=1 ts=1684954142.358901691,1 min=1684954142.358901691,0 seq=2} lock=true stat=PENDING rts=0,0 wto=false gul=0,0: {<nil> 0 {0xc003a2caf8} index:0 1684954142.383387488,0}
    kvnemesis.go:185: failures(verbose): /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1584584814/failures
        repro steps: /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1584584814/repro.go
        rangefeed KVs: /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1584584814/kvs-rangefeed.txt
        scan KVs: /tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis1584584814/kvs-scan.txt
    kvnemesis_test.go:207:
                Error Trace:    github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:207
                                                        github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/kvnemesis_test.go:160
                Error:          Should be zero, but was 1
                Test:           TestKVNemesisMultiNode
                Messages:       kvnemesis detected failures
    panic.go:522: -- test log scope end --
FAIL
I230524 18:49:03.146045 1 (gostd) testmain.go:100  [-] 1  Test //pkg/kv/kvnemesis:kvnemesis_test exited with error code 1

aliher1911 · 2023-06-06T16:55:51Z

I'll put issue back into backlog.

cockroach-teamcity · 2023-07-31T10:05:39Z

kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on master @ e57e9742146b7fac7f7530fa6224e11121ada92f:

Fatal error:

panic: use of Span after Finish. Span: /cockroach.roachpb.Internal/Batch. Finish previously called at: <stack not captured. Set debugUseAfterFinish> [recovered]
	panic: use of Span after Finish. Span: /cockroach.roachpb.Internal/Batch. Finish previously called at: <stack not captured. Set debugUseAfterFinish>

Stack:

goroutine 7767151 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).recover(0x1f8006a?, {0x69690a8, 0xc012f9d3b0})
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:230 +0x6a
panic({0x4b26d80, 0xc0087e0730})
	GOROOT/src/runtime/panic.go:884 +0x212
github.com/cockroachdb/cockroach/pkg/util/tracing.(*Span).detectUseAfterFinish(0x69690a8?)
	github.com/cockroachdb/cockroach/pkg/util/tracing/span.go:182 +0x12e
github.com/cockroachdb/cockroach/pkg/util/tracing.(*Span).Tracer(...)
	github.com/cockroachdb/cockroach/pkg/util/tracing/span.go:225
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicaDecoder).createTracingSpans(0xc0066c7658, {0x69690a8, 0xc0029e5980})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_application_decoder.go:154 +0x4f0
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicaDecoder).DecodeAndBind(0xc0066c7200?, {0x69690a8, 0xc0029e5980}, {0xc0153d2750?, 0x2?, 0x10?})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_application_decoder.go:63 +0x66
github.com/cockroachdb/cockroach/pkg/kv/kvserver/apply.(*Task).Decode(...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/apply/task.go:142
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:878 +0x73a
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:728 +0x1b0
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady(0xc006662000, 0x2?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:689 +0x14b
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftSchedulerShard).worker(0xc00de44540, {0x69690a8, 0xc012f9d3b0}, {0x694eee0, 0xc006662000}, 0xc006581500)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:416 +0x274
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).Start.func2({0x69690a8?, 0xc012f9d3b0?})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:321 +0x45
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:484 +0x146
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:475 +0x43b

Log preceding fatal error

=== RUN   TestKVNemesisMultiNode
    test_log_scope.go:167: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestKVNemesisMultiNode2457498475
    test_log_scope.go:81: use -show-logs to present logs inline
    kvnemesis_test.go:283: seed: 8481106198570085994
    test_server_shim.go:98: 
        Test server was configured to route SQL queries to a secondary tenant (virtual cluster).
        If you are only seeing a test failure when this message appears, there may be a problem
        specific to cluster virtualization or multi-tenancy.
        
        To investigate, consider using "COCKROACH_TEST_TENANT=true" to force-enable just
        the secondary tenant in all runs (or, alternatively, "false" to force-disable), or use
        "COCKROACH_INTERNAL_DISABLE_METAMORPHIC_TESTING=false" to disable all random test variables altogether.
*
* WARNING: test tenant requested by configuration, but code organization prevents start!
* OSS binaries do not include enterprise features
*
*
* WARNING: test tenant requested by configuration, but code organization prevents start!
* OSS binaries do not include enterprise features
*
*
* WARNING: test tenant requested by configuration, but code organization prevents start!
* OSS binaries do not include enterprise features
*
*
* WARNING: test tenant requested by configuration, but code organization prevents start!
* OSS binaries do not include enterprise features
*
    kvnemesis_test.go:197: kvnemesis logging to /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/kvnemesis4034049211

Parameters: TAGS=bazel,gss,deadlock , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

_{This test on roachdash | Improve this report!}

erikgrinaker · 2023-07-31T14:12:08Z

The last span use after finish might be the same issue as #107242.

…iority DNM: needs more testing. Fixes cockroachdb#105330. Fixes cockroachdb#101721. This commit updates the transaction push logic to stop pushes from completing successfully when the pushee is STAGING and the pusher has equal priority. Prior to this change, the pusher would win in these cases when using a PUSH_TIMESTAMP if at least one of the two transactions involved used a weak isolation level. This had two undesirable effects: - if the pushee was still possibly committable and requiring recovery (`!(knownHigherTimestamp || knownHigherEpoch)` in the code) then the pusher would immediately begin parallel commit recovery, attempting to disrupt the commit and abort the pushee. This is undesirable because the pushee may still be in progress and in cases of equal priority, we'd like to wait for the parallel commit to complete before kicking off recovery, deferring recovery to only the cases where the pushee/committers's heartbeat has expired. - if the pushee was known too be uncommittable (`knownHigherTimestamp || knownHigherEpoch` in the code), then txn recovery was not kicked off but the pusher still could not perform the PUSH_TIMESTAMP (see e40c1b4), so it would return a `TransactionPushError`. This confused logic in `handleTransactionPushError`, allowing the error to escape to the client. This commit resolves both issues by considering the pushee's transaction status in `txnwait.ShouldPushImmediately`. Release note: None

Fixes cockroachdb#105330. Fixes cockroachdb#101721. This commit updates the transaction push logic to stop pushes from completing successfully when the pushee is STAGING and the pusher has equal priority. Prior to this change, the pusher would win in these cases when using a PUSH_TIMESTAMP if at least one of the two transactions involved used a weak isolation level. This had two undesirable effects: - if the pushee was still possibly committable and requiring recovery (`!(knownHigherTimestamp || knownHigherEpoch)` in the code) then the pusher would immediately begin parallel commit recovery, attempting to disrupt the commit and abort the pushee. This is undesirable because the pushee may still be in progress and in cases of equal priority, we'd like to wait for the parallel commit to complete before kicking off recovery, deferring recovery to only the cases where the pushee/committers's heartbeat has expired. - if the pushee was known too be uncommittable (`knownHigherTimestamp || knownHigherEpoch` in the code), then txn recovery was not kicked off but the pusher still could not perform the PUSH_TIMESTAMP (see e40c1b4), so it would return a `TransactionPushError`. This confused logic in `handleTransactionPushError`, allowing the error to escape to the client. This commit resolves both issues by considering the pushee's transaction status in `txnwait.ShouldPushImmediately`. Release note: None

107882: kv: don't let pusher win when pushing STAGING txn with equal priority r=miraradeva a=nvanbenschoten Fixes #105330. Fixes #101721. This commit updates the transaction push logic to stop pushes from completing successfully when the pushee is STAGING and the pusher has equal priority. Prior to this change, the pusher would win in these cases when using a PUSH_TIMESTAMP if at least one of the two transactions involved used a weak isolation level. This had two undesirable effects: - if the pushee was still possibly committable and requiring recovery (`!(knownHigherTimestamp || knownHigherEpoch)` in the code) then the pusher would immediately begin parallel commit recovery, attempting to disrupt the commit and abort the pushee. This is undesirable because the pushee may still be in progress and in cases of equal priority, we'd like to wait for the parallel commit to complete before kicking off recovery, deferring recovery to only the cases where the pushee/committers's heartbeat has expired. - if the pushee was known to be uncommittable (`knownHigherTimestamp || knownHigherEpoch` in the code), then txn recovery was not kicked off but the pusher still could not perform the PUSH_TIMESTAMP (see e40c1b4), so it would return a `TransactionPushError`. This confused logic in `handleTransactionPushError`, allowing the error to escape to the client. This commit resolves both issues by considering the pushee's transaction status in `txnwait.ShouldPushImmediately`. Release note: None 107962: docs: do not generate docs for crdb_internal functions r=knz a=rafiss Epic: None Release note: None 108062: server: Add rangefeed metrics r=miretskiy a=miretskiy Add rangefeed related metrics to keep track of the number of actively running rangefeeds on the server. Epic: CRDB-26372 Release note: None 108071: stmtdiagnostics: use background context when building the bundle r=yuzefovich a=yuzefovich When the context is canceled, we still want to build the bundle as best as possible. Over in 532274b we introduced the usage of the background context in order to insert the bundle into the system tables, but we still built the bundle with the canceled context. This commit fixes that oversight - in particular, we should now get `env.sql` correctly. Informs: https://github.com/cockroachlabs/support/issues/2494. Epic: None Release note: None 108085: builtins: improve docs and refactor code for unordered_unique_id r=rafiss,rharding6373 a=andyyang890 **builtins: improve documentation for unordered_unique_rowid** This patch clarifies that `unordered_unique_rowid` generates unique IDs that are statistically likely to be unordered because it bit-reverses the insert timestamp for use as the ID's timestamp bit segment. Release note: None ---- **builtinconstants: define constants for unique int bit segments** This patch defines constants for the sizes and bitmasks for each bit segment in a unique int generated for row IDs. Release note: None ---- **builtins: refactor bit shifting logic in mapToUnorderedUniqueInt** This patch refactors the bit shifting logic in `mapToUnorderedUniqueInt` to use constants instead of magic numbers. Release note: None ---- Epic: None Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Yevgeniy Miretskiy <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Andy Yang <[email protected]>

cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Apr 18, 2023

cockroach-teamcity added this to the 23.1 milestone Apr 18, 2023

blathers-crl bot added the T-kv KV Team label Apr 18, 2023

erikgrinaker self-assigned this Apr 19, 2023

erikgrinaker added T-kv-replication and removed T-kv KV Team labels Apr 19, 2023

erikgrinaker mentioned this issue Apr 27, 2023

kv/kvnemesis: TestKVNemesisSingleNode failed #102410

Closed

cockroach-teamcity mentioned this issue Apr 27, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #102411

Closed

erikgrinaker added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Apr 27, 2023

cockroach-teamcity mentioned this issue May 1, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #100417

Closed

exalate-issue-sync bot assigned aliher1911 and unassigned erikgrinaker May 16, 2023

aliher1911 mentioned this issue May 22, 2023

kvserver/batcheval: fix mvcc stats computaion on split #103719

Merged

This was referenced May 28, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #104012

Closed

kv/kvnemesis: TestKVNemesisMultiNode failed #104183

Closed

aliher1911 removed their assignment Jun 6, 2023

cockroach-teamcity mentioned this issue Jun 7, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #104476

Closed

shralex assigned nvanbenschoten Jun 29, 2023

tbg added T-kv KV Team and removed T-kv-replication labels Jun 29, 2023

nvanbenschoten mentioned this issue Jul 5, 2023

kvserver: fail stale ConfChange when rejected by raft #106104

Merged

erikgrinaker mentioned this issue Jul 31, 2023

roachtest: tpccbench/nodes=3/cpu=16/mt-shared-process failed [replication span assertion; needs #107521] #107242

Closed

nvanbenschoten mentioned this issue Jul 31, 2023

kv: don't let pusher win when pushing STAGING txn with equal priority #107882

Merged

craig bot closed this as completed in 90516db Aug 3, 2023

pav-kv mentioned this issue Aug 7, 2023

kvserver: span assertions fire in replication layer #107521

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kv/kvnemesis: TestKVNemesisMultiNode failed #101721

kv/kvnemesis: TestKVNemesisMultiNode failed #101721

cockroach-teamcity commented Apr 18, 2023 •

edited by exalate-issue-sync bot

Loading

erikgrinaker commented Apr 18, 2023

cockroach-teamcity commented Apr 19, 2023

erikgrinaker commented Apr 19, 2023

blathers-crl bot commented Apr 19, 2023

cockroach-teamcity commented Apr 22, 2023

cockroach-teamcity commented Apr 23, 2023

cockroach-teamcity commented Apr 24, 2023

cockroach-teamcity commented Apr 26, 2023

cockroach-teamcity commented May 3, 2023

cockroach-teamcity commented May 17, 2023

cockroach-teamcity commented May 17, 2023

aliher1911 commented May 22, 2023 •

edited

Loading

aliher1911 commented May 22, 2023

aliher1911 commented Jun 6, 2023

aliher1911 commented Jun 6, 2023

cockroach-teamcity commented Jul 31, 2023

erikgrinaker commented Jul 31, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #101721

kv/kvnemesis: TestKVNemesisMultiNode failed #101721

Comments

cockroach-teamcity commented Apr 18, 2023 • edited by exalate-issue-sync bot Loading

erikgrinaker commented Apr 18, 2023

cockroach-teamcity commented Apr 19, 2023

erikgrinaker commented Apr 19, 2023

blathers-crl bot commented Apr 19, 2023

cockroach-teamcity commented Apr 22, 2023

cockroach-teamcity commented Apr 23, 2023

cockroach-teamcity commented Apr 24, 2023

cockroach-teamcity commented Apr 26, 2023

cockroach-teamcity commented May 3, 2023

cockroach-teamcity commented May 17, 2023

cockroach-teamcity commented May 17, 2023

aliher1911 commented May 22, 2023 • edited Loading

aliher1911 commented May 22, 2023

aliher1911 commented Jun 6, 2023

aliher1911 commented Jun 6, 2023

cockroach-teamcity commented Jul 31, 2023

erikgrinaker commented Jul 31, 2023

cockroach-teamcity commented Apr 18, 2023 •

edited by exalate-issue-sync bot

Loading

aliher1911 commented May 22, 2023 •

edited

Loading