kvserver: 10-byte discrepancy in SysBytes #93896

Open
erikgrinaker opened this issue Dec 19, 2022 · 10 comments · Fixed by #99017
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team

Comments

@erikgrinaker
Contributor

erikgrinaker commented Dec 19, 2022

We occasionally see 10-byte discrepancies in SysBytes, e.g. in these test failures:

The source of these discrepancies is unknown. We should find out what it is.

#93897 ignores this discrepancy in KVNemesis. We should remove that exception when this is fixed.

Jira issue: CRDB-22591

@erikgrinaker erikgrinaker added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team labels Dec 19, 2022
craig bot pushed a commit that referenced this issue Dec 19, 2022
93837: backupccl: deflake TestClusterRestoreFailCleanup r=stevendanna a=msbutler

This test occasionally flakes due to #86806. To prevent the flakiness, this patch manually sets the kv.bulkio.write_metadata_sst.enabled cluster setting to false. When #86806 gets addressed, this patch should be reverted.

Epic: None

Release note: None

93897: kvnemesis: ignore `SysBytes:10` MVCC stats discrepancy r=erikgrinaker a=erikgrinaker

Resolves #93890.
Touches #93896.
Touches #93312.
Touches #86542.

Epic: none
Release note: None

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
@jbowens
Collaborator

jbowens commented Jan 17, 2023

Another instance here: #95282

@exalate-issue-sync exalate-issue-sync bot added T-kv-replication and removed T-kv KV Team labels Jan 18, 2023
@blathers-crl

blathers-crl bot commented Jan 18, 2023

cc @cockroachdb/replication

@erikgrinaker
Contributor Author

I've dug into some kvnemesis failures. It reproduces within about 20 minutes, but rarely with the same seed. After adding some logging at the spots that mutate SysBytes, these updates in particular seem suspect:

storage/mvcc.go:296 ⋮ [T1] 1181  updateStatsOnPut(orig): SysBytes - 10 = -10 (/Local/Range/Min/‹RangeDescriptor›)

Will try to find out where they're coming from.

@erikgrinaker erikgrinaker self-assigned this Jan 18, 2023
@erikgrinaker
Contributor Author

Grabbed a stack trace during these events, dumping it below since I have to run now.

goroutine 10126 [running]:
runtime/debug.Stack()
	GOROOT/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
	GOROOT/src/runtime/debug/stack.go:16 +0x19
github.com/cockroachdb/cockroach/pkg/storage.updateStatsOnPut({0xc00704d8b4, 0x9, 0x9}, 0x0, 0x0, 0xa, 0x0, 0xa, _, 0xc001c78480, ...)
	github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:299 +0x207
github.com/cockroachdb/cockroach/pkg/storage.mvccPutInternal({0x5fd0e08, 0xc000cda9c0}, {0x6041240, 0xc00a1b2000}, {0x6041160, 0xc003ea0900}, 0xc0090661e0, {0xc00704d8b4, 0x9, 0x9}, ...)
	github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:2217 +0x1dc6
github.com/cockroachdb/cockroach/pkg/storage.mvccPutUsingIter({0x5fd0e08, 0xc000cda9c0}, {0x6041240, 0xc00a1b2000}, {0x6041160, 0xc003ea0900}, 0x435d8d?, {0xc00704d8b4, 0x9, 0x9}, ...)
	github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:1549 +0x1cd
github.com/cockroachdb/cockroach/pkg/storage.mvccConditionalPutUsingIter({0x5fd0e08?, 0xc000cda9c0?}, {0x6041240?, 0xc00a1b2000?}, {0x6041160?, 0xc003ea0900?}, 0xc006170270?, {0xc00704d8b4, 0x9, 0x9}, ...)
	github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:2398 +0x19c
github.com/cockroachdb/cockroach/pkg/storage.MVCCConditionalPut({0x5fd0e08, 0xc000cda9c0}, {0x7f85ba092d20?, 0xc00a1b2000}, 0x1?, {0xc00704d8b4, 0x9, 0x9}, {0x173b7872431ef313, 0x0, ...}, ...)
	github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:2342 +0x274
github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval.ConditionalPut({_, _}, {_, _}, {{0x6056e00, 0xc0020d0460}, {{0x173b7872431ef313, 0x0, 0x0}, 0x0, ...}, ...}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/cmd_conditional_put.go:63 +0x39c
github.com/cockroachdb/cockroach/pkg/kv/kvserver.evaluateCommand({_, _}, {_, _}, {_, _}, _, {{0x173b7872431ef313, 0x0, 0x0}, ...}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_evaluate.go:509 +0x309
github.com/cockroachdb/cockroach/pkg/kv/kvserver.evaluateBatch({_, _}, {_, _}, {_, _}, {_, _}, _, 0xc00baedef0, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_evaluate.go:279 +0xd45
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).evaluateWriteBatchWrapper(_, {_, _}, {_, _}, {_, _}, _, _, 0xc0041b2dc0, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_write.go:680 +0x206
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).evaluateWriteBatchWithServersideRefreshes(_, {_, _}, {_, _}, {_, _}, _, _, 0xc0041b2dc0, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_write.go:640 +0x327
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).evaluateWriteBatch(_, {_, _}, {_, _}, _, _, _, {{0x173b787243b78993, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_write.go:440 +0x839
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).evaluateProposal(0xc003488000?, {0x5fd0e08, 0xc000cda9c0}, {0xc00606e5c8, 0x8}, 0xc00baedef0, 0x8?, 0x0?, {{0x173b787243b78993, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal.go:660 +0x1df
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).requestToProposal(0x524ddb?, {0x5fd0e08?, 0xc000cda9c0}, {0xc00606e5c8, 0x8}, 0xc00baedef0, 0x173b7872431ef313?, 0xc006394248, {{0x173b787243b78993, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal.go:748 +0xb2
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).evalAndPropose(0xc004323400, {0x5fd0e08, 0xc000cda9c0}, 0xc00baedef0, 0xc0041b2dc0, 0xc006394248, {{0x173b787243b78993, 0x0, 0x0}, {0x173b7872431ef313, ...}}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:121 +0x18f
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).executeWriteBatch(0xc004323400, {0x5fd0e08, 0xc000cda9c0}, 0xc00baedef0, 0xc0041b2dc0)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_write.go:175 +0x7aa
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).executeBatchWithConcurrencyRetries(0xc004323400, {0x5fd0e08, 0xc000cda9c0}, 0xc00baedef0, 0x4a52e60)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_send.go:491 +0x3e8
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).SendWithWriteBytes(0xc004323400, {0x5fd0e08?, 0xc000cda8d0?}, 0xc00baedef0)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_send.go:184 +0x70a
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).SendWithWriteBytes(0xc000b99500, {0x5fd0e08?, 0xc000cda870?}, 0xc00baedef0)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_send.go:205 +0x74a
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Stores).SendWithWriteBytes(0x0?, {0x5fd0e08, 0xc000cda870}, 0xc00baedef0)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/stores.go:203 +0x1aa
github.com/cockroachdb/cockroach/pkg/server.(*Node).batchInternal(0xc0023c4a80, {0x5fd0e08?, 0xc000cda810?}, {0x7718df?}, 0xc00baedef0)
	github.com/cockroachdb/cockroach/pkg/server/node.go:1120 +0x4f7
github.com/cockroachdb/cockroach/pkg/server.(*Node).Batch(0xc0023c4a80, {0x5fd0e08, 0xc000cda780}, 0xc00baedef0)
	github.com/cockroachdb/cockroach/pkg/server/node.go:1172 +0x192
github.com/cockroachdb/cockroach/pkg/rpc.makeInternalClientAdapter.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:752 +0x4b
github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ServerInterceptor.func1({0x5fd0e08, 0xc000cda780}, {0x474ff20, 0xc00baedef0}, 0xc0020d14a0, 0xc00172e5d0)
	github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:95 +0x254
github.com/cockroachdb/cockroach/pkg/rpc.bindUnaryServerInterceptorToHandler.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:861 +0x3a
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func3({0x5fd0e08, 0xc000cda780}, {0x474ff20, 0xc00baedef0}, 0xc000cda780?, 0xc0020d14c0)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:261 +0x83
github.com/cockroachdb/cockroach/pkg/rpc.bindUnaryServerInterceptorToHandler.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:861 +0x3a
github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.unaryInterceptor({{{0xc007633400?}}}, {0x5fd0e08, 0xc000cda780}, {0x474ff20, 0xc00baedef0}, 0xc0020d14a0, 0xc0020d14e0)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/auth.go:73 +0x16e
github.com/cockroachdb/cockroach/pkg/rpc.bindUnaryServerInterceptorToHandler.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:861 +0x3a
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1.1({0x5fd0e08?, 0xc000cda780?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:230 +0x39
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr(0xc000b65a80, {0x5fd0e08, 0xc000cda780}, {0xc004f6e058?, 0x1?}, 0xc006177c58)
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0xd1
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?}, 0xc009695860?, 0xc0071ed400?)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:228 +0x95
github.com/cockroachdb/cockroach/pkg/rpc.bindUnaryServerInterceptorToHandler.func1({0x5fd0e08?, 0xc000cda780?}, {0x474ff20?, 0xc00baedef0?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:861 +0x3a
github.com/cockroachdb/cockroach/pkg/rpc.makeInternalClientAdapter.func2({0x5fd0e08?, 0xc000cda780?}, {0xc000cda780?, 0x4?}, {0x474ff20?, 0xc00baedef0?}, {0x464c660?, 0xc005817e00}, 0x203000?, {0x0, ...})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:762 +0x54
github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.ClientInterceptor.func2({0x5fd0e08, 0xc000cda780}, {0x4803e42, 0x21}, {0x474ff20, 0xc00baedef0}, {0x464c660, 0xc005817e00}, 0x19f4b52?, 0xc003a81400, ...)
	github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:226 +0x155
github.com/cockroachdb/cockroach/pkg/rpc.getChainUnaryInvoker.func1({0x5fd0e08, 0xc000cda780}, {0x4803e42, 0x21}, {0x474ff20, 0xc00baedef0}, {0x464c660, 0xc005817e00}, 0x1f?, {0x0, ...})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:945 +0x13e
github.com/cockroachdb/cockroach/pkg/rpc.makeInternalClientAdapter.func3({0x5fd0e08, 0xc000cda4e0}, 0xc00baede00, {0x0, 0x0, 0x0})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:819 +0x329
github.com/cockroachdb/cockroach/pkg/rpc.internalClientAdapter.Batch(...)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:953
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*grpcTransport).sendBatch(0xc006f4e720, {0x5fd0e08, 0xc000cda4e0}, 0x4369c8?, {0x5fcef40, 0xc005260d20?}, 0xc00baede00)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/transport.go:209 +0x102
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*grpcTransport).SendNext(0xc006f4e720, {0x5fd0e08, 0xc000cda4e0}, 0x5fd0e08?)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/transport.go:188 +0x92
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendToReplicas(0xc002004a00, {0x5fd0e08, 0xc000cda4e0}, 0xc00baedd10?, {0xc00166e640, 0xc00b670dd0, 0xc00b670e40, 0x0, 0x0}, 0x0)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:2142 +0x1163
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendPartialBatch(0xc002004a00, {0x5fd0e08?, 0xc000cda4e0}, 0xc00baedd10, {{0xc00704d8b7, 0x0, 0x6}, {0xc00704d8b7, 0x1, 0x6}}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1668 +0x845
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges(0xc002004a00, {0x5fd0e08, 0xc000cda4e0}, 0xc00baedd10, {{0xc00704d8b7, 0x0, 0x6}, {0xc00704d8b7, 0x1, 0x6}}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1240 +0x3e8
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send(0xc002004a00, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:861 +0x678
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnLockGatekeeper).SendLocked(0xc00b20aab8, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_lock_gatekeeper.go:82 +0x1e2
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnMetricRecorder).SendLocked(0xc00b20aa80, {0x5fd0e08?, 0xc000cda420?}, 0x4794aaf?)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_metric_recorder.go:46 +0xe2
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnCommitter).SendLocked(0xc00b20aa50, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go:129 +0x65d
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).sendLockedWithRefreshAttempts(0xc00b20a950, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10, 0x5)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:225 +0x283
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).SendLocked(0xc00b20a950, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10?)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:153 +0xb3
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnPipeliner).SendLocked(0xc00b20a820, {0x5fd0e08, 0xc000cda420}, 0x9?)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go:290 +0x138
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSeqNumAllocator).SendLocked(0xc00b20a800?, {0x5fd0e08?, 0xc000cda420?}, 0xc00baedd10?)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go:104 +0x82
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnHeartbeater).SendLocked(0xc00b20a750, {0x5fd0e08, 0xc000cda420}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_heartbeater.go:245 +0x4b2
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).Send(0xc00b20a580, {0x5fd0e08, 0xc000cda3c0}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:526 +0x52b
github.com/cockroachdb/cockroach/pkg/kv.(*DB).sendUsingSender(0xc002053830, {0x5fd0e08, 0xc000cda3c0}, 0xc00baedd10, {0x7f85b8c44f18, 0xc00b20a580})
	github.com/cockroachdb/cockroach/pkg/kv/db.go:994 +0xe7
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Send(0xc001cb60b0, {0x5fd0e08, 0xc000cda3c0}, 0xc00baedd10)
	github.com/cockroachdb/cockroach/pkg/kv/txn.go:1058 +0x209
github.com/cockroachdb/cockroach/pkg/kv.sendAndFill({0x5fd0e08, 0xc000cda3c0}, 0xc00617a968, 0xc00b20b600)
	github.com/cockroachdb/cockroach/pkg/kv/db.go:831 +0xf8
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Run(0xc001cb60b0, {0x5fd0e08, 0xc000cda3c0}, 0x9?)
	github.com/cockroachdb/cockroach/pkg/kv/txn.go:674 +0x74
github.com/cockroachdb/cockroach/pkg/kv/kvserver.execChangeReplicasTxn.func2({0x5fd0e08, 0xc0043ce900}, 0x5faad70?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_command.go:2394 +0x613
github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1({0x5fd0e08?, 0xc0043ce900?}, 0xc000ace510?)
	github.com/cockroachdb/cockroach/pkg/kv/db.go:956 +0x27
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec(0xc001cb60b0, {0x5fd0e08, 0xc0043ce900}, 0xc006399050)
	github.com/cockroachdb/cockroach/pkg/kv/txn.go:927 +0xae
github.com/cockroachdb/cockroach/pkg/kv.runTxn({0x5fd0e08, 0xc0043ce900}, 0x7?, 0x522e360?)
	github.com/cockroachdb/cockroach/pkg/kv/db.go:955 +0x6b
github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl(0x8ad5238?, {0x5fd0e08, 0xc0043ce900}, 0x853f1a8?, 0x0?, 0x4?)
	github.com/cockroachdb/cockroach/pkg/kv/db.go:918 +0x89
github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn(...)
	github.com/cockroachdb/cockroach/pkg/kv/db.go:897
github.com/cockroachdb/cockroach/pkg/kv/kvserver.execChangeReplicasTxn({0x5fd0e08?, 0xc0043ce900?}, 0xc001ffe700?, 0xc004cd2af0?, {0x47da02e?, 0x19?}, {0x0?, 0x0?}, {0xc00617b304, 0x1, ...}, ...)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_command.go:2292 +0x294
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).maybeLeaveAtomicChangeReplicasAndRemoveLearners(0xc004323400, {0x5fd0e08, 0xc0043ce900}, 0x0?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_command.go:1373 +0x40f
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicateQueue).finalizeAtomicReplication(0xc004e4d6c0, {0x5fd0e08, 0xc0043ce900}, 0x0?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replicate_queue.go:1969 +0x4b
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicateQueue).applyChange(0xc004e4d6c0?, {0x5fd0e08?, 0xc0043ce900?}, {0x2?, 0xc004323400?, {0x5fcec10?, 0x8ad5238?}})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replicate_queue.go:857 +0x191
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicateQueue).processOneChange(0xc004e4d6c0, {0x5fd0e08, 0xc0043ce900}, 0x47b232b?, 0xf?, 0x58?, 0x0)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replicate_queue.go:946 +0x153
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicateQueue).processOneChangeWithTracing(0xc004e4d6c0, {0x5fd0dd0, 0xc00862faa0}, 0xc004323400)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replicate_queue.go:792 +0x1b5
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*replicateQueue).process(0xc004e4d6c0, {0x5fd0dd0, 0xc00862faa0}, 0x5fd0dd0?, {0x7f85b7e87320, 0xc002000120})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replicate_queue.go:708 +0x20f
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processReplica.func1({0x5fd0dd0, 0xc00862faa0})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:997 +0x275
github.com/cockroachdb/cockroach/pkg/util/contextutil.RunWithTimeout({0x5fd0e08?, 0xc0043ce870?}, {0xc00867b3e0, 0x21}, 0xdf8475800, 0xc0042ade20)
	github.com/cockroachdb/cockroach/pkg/util/contextutil/context.go:91 +0xed
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processReplica(0xc004078140, {0x5fd0e08, 0xc0043ce810}, {0x6016f38, 0xc004323400})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:956 +0x3ee
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processLoop.func2.1({0x5fd0e08, 0xc0043ce810})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:874 +0x117
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x146
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x43b

@erikgrinaker
Contributor Author

I think the above was a red herring. This seems to be caused by range merges: after disabling range merges in kvnemesis, I haven't seen a failure in 10,000 runs (it used to fail after about 2,000).

This is pure speculation, but I wonder if it could have something to do with the below code, which takes the stats computed during the RHS subsume request and then subtracts the range-local replicated data computed via the current batch. I'm going to try subtracting during subsume evaluation, which should avoid any races.

{
	ridPrefix := keys.MakeRangeIDReplicatedPrefix(merge.RightDesc.RangeID)
	sysMS, err := storage.ComputeStats(batch, ridPrefix, ridPrefix.PrefixEnd(), 0 /* nowNanos */)
	if err != nil {
		return result.Result{}, err
	}
	ms.Subtract(sysMS)
}
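
To make the suspected race concrete, here is a minimal Go sketch. It is not CockroachDB code. It assumes a toy model in which the only thing that matters is how many range ID-local bytes exist when the subsume captures the RHS stats versus when the merge trigger recomputes them from its batch. The `event` type, the `sysBytesError` function, and all byte counts are hypothetical and chosen only for illustration.

```go
package main

// event is a hypothetical record of one change to the RHS's replicated
// range ID-local bytes.
type event struct {
	afterSubsume bool  // true if the write landed after Subsume captured stats
	deltaBytes   int64 // change in range ID-local bytes caused by the write
}

// sysBytesError returns how far the merged SysBytes contribution of the RHS
// ends up from the expected value in this toy model. otherSys stands for the
// RHS system bytes that are not range ID-local and should survive the merge.
func sysBytesError(otherSys int64, events []event) int64 {
	var atSubsume, atMerge int64
	for _, e := range events {
		if !e.afterSubsume {
			atSubsume += e.deltaBytes
		}
		atMerge += e.deltaBytes
	}
	// Stats snapshot carried in the subsume response.
	fromSubsume := otherSys + atSubsume
	// The merge trigger subtracts whatever range ID-local bytes it sees
	// at merge time, which may differ from the snapshot.
	merged := fromSubsume - atMerge
	return merged - otherSys
}
```

In this model, a single 40-byte write before the subsume gives an error of 0, while adding a 10-byte write after the subsume gives an error of -10: the two values being combined were read at different points in time. The sign and size in a real cluster depend on the actual write.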

@erikgrinaker
Contributor Author

That seems to have been it, passed 10k runs with the following patch. We should find out why -- either the batch we read through contains range-local writes to the RHS, or the subsume isn't preventing writes to the RHS range-local span (which seems bad).

diff --git a/pkg/kv/kvserver/batcheval/cmd_end_transaction.go b/pkg/kv/kvserver/batcheval/cmd_end_transaction.go
index 09fbecaaf7e..6ec1184ae8a 100644
--- a/pkg/kv/kvserver/batcheval/cmd_end_transaction.go
+++ b/pkg/kv/kvserver/batcheval/cmd_end_transaction.go
@@ -1282,15 +1282,6 @@ func mergeTrigger(
        }
        ms.Subtract(msRangeKeyDelta)
 
-       {
-               ridPrefix := keys.MakeRangeIDReplicatedPrefix(merge.RightDesc.RangeID)
-               sysMS, err := storage.ComputeStats(batch, ridPrefix, ridPrefix.PrefixEnd(), 0 /* nowNanos */)
-               if err != nil {
-                       return result.Result{}, err
-               }
-               ms.Subtract(sysMS)
-       }
-
        var pd result.Result
        pd.Replicated.Merge = &kvserverpb.Merge{
                MergeTrigger: *merge,
diff --git a/pkg/kv/kvserver/batcheval/cmd_subsume.go b/pkg/kv/kvserver/batcheval/cmd_subsume.go
index 05dcc8ddb02..885b1a8cd3d 100644
--- a/pkg/kv/kvserver/batcheval/cmd_subsume.go
+++ b/pkg/kv/kvserver/batcheval/cmd_subsume.go
@@ -135,6 +135,14 @@ func Subsume(
        reply.LeaseAppliedIndex = cArgs.EvalCtx.GetLeaseAppliedIndex()
        reply.FreezeStart = cArgs.EvalCtx.Clock().NowAsClockTimestamp()
 
+       // Remove the replicated range-local stats.
+       ridPrefix := keys.MakeRangeIDReplicatedPrefix(desc.RangeID)
+       sysMS, err := storage.ComputeStats(readWriter, ridPrefix, ridPrefix.PrefixEnd(), 0 /* nowNanos */)
+       if err != nil {
+               return result.Result{}, err
+       }
+       reply.MVCCStats.Subtract(sysMS)
+
        // Collect a read summary from the RHS leaseholder to ship to the LHS
        // leaseholder. This is used to instruct the LHS on how to update its
        // timestamp cache to ensure that no future writes are allowed to invalidate

@erikgrinaker
Contributor Author

erikgrinaker commented Mar 20, 2023

I think this has to be related to resolution of the RHS range descriptor intent. Here's the list of stats updates to the RHS system keys of a merge with incorrect stats around a subsume (the first put is the initial split):

2023-03-20 04:39:07.248468063 +0000 UTC updateStatsOnPut /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.260274902 +0000 UTC updateStatsOnResolve /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.313033851 +0000 UTC updateStatsOnPut /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.315114607 +0000 UTC subsume r82:/Table/100/"85{280a5a52150f64"-6c68caeb69cdfb"} [(n2,s2):1, (n3,s3):2, (n4,s4):3, next=4, gen=11, sticky=9223372036.854775807,2147483647]
2023-03-20 04:39:08.738719745 +0000 UTC updateStatsOnResolve /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor

The RHS of the merge is considered local during the commit of the merge trigger:

if mergeTrigger := args.InternalCommitTrigger.GetMergeTrigger(); mergeTrigger != nil {
	// If this is a merge, then use the post-merge descriptor to determine
	// which locks are local (note that for a split, we want to use the
	// pre-split one instead because it's larger).
	desc = &mergeTrigger.LeftDesc
}

@erikgrinaker
Contributor Author

On second thought, this can't be the RHS range descriptor intent, because that's not a range ID-local key, it's a range-local key. The range ID-local keys are these:

cockroach/pkg/keys/keys.go

Lines 1034 to 1071 in 0b2c235

// AbortSpanKey returns a range-local key by Range ID for an AbortSpan
// entry, with detail specified by encoding the supplied transaction ID.
func (b RangeIDPrefixBuf) AbortSpanKey(txnID uuid.UUID) roachpb.Key {
	key := append(b.replicatedPrefix(), LocalAbortSpanSuffix...)
	return encoding.EncodeBytesAscending(key, txnID.GetBytes())
}

// RangeAppliedStateKey returns a system-local key for the range applied state key.
// See comment on RangeAppliedStateKey function.
func (b RangeIDPrefixBuf) RangeAppliedStateKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangeAppliedStateSuffix...)
}

// RangeLeaseKey returns a system-local key for a range lease.
func (b RangeIDPrefixBuf) RangeLeaseKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangeLeaseSuffix...)
}

// RangePriorReadSummaryKey returns a system-local key for a range's prior read
// summary.
func (b RangeIDPrefixBuf) RangePriorReadSummaryKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangePriorReadSummarySuffix...)
}

// RangeGCThresholdKey returns a system-local key for the GC threshold.
func (b RangeIDPrefixBuf) RangeGCThresholdKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangeGCThresholdSuffix...)
}

// RangeGCHintKey returns a range-local key for the GC hint data.
func (b RangeIDPrefixBuf) RangeGCHintKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangeGCHintSuffix...)
}

// RangeVersionKey returns a system-local key for the range version.
func (b RangeIDPrefixBuf) RangeVersionKey() roachpb.Key {
	return append(b.replicatedPrefix(), LocalRangeVersionSuffix...)
}

The range applied state key is written below Raft, but it's omitted by ComputeStats().

I think this is likely the range lease, which is a range ID-local key. Looking at a previous case, notice that we're evaluating a range lease for the RHS r82 after we've subsumed it, but before we commit the merge:

2023-03-20 04:39:07.248468063 +0000 UTC updateStatsOnPut /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.260274902 +0000 UTC updateStatsOnResolve /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.313033851 +0000 UTC updateStatsOnPut /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor
2023-03-20 04:39:07.315114607 +0000 UTC subsume r82:/Table/100/"85{280a5a52150f64"-6c68caeb69cdfb"} [(n2,s2):1, (n3,s3):2, (n4,s4):3, next=4, gen=11, sticky=9223372036.854775807,2147483647]
2023-03-20 04:39:08.730691492 +0000 UTC updateStatsForInline /Local/RangeID/82/r/RangeLease
2023-03-20 04:39:08.738719745 +0000 UTC updateStatsOnResolve /Local/Range/Table/100/"85280a5a52150f64"/RangeDescriptor

RequestLease doesn't respect latches, so it can bypass the subsume:

case req.isSingle(kvpb.RequestLease):
	// Ignore latches for lease requests. These requests are run on replicas
	// that do not hold the lease, so acquiring latches wouldn't help
	// synchronize with other requests.
	return true
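
As a rough illustration of why skipping latches lets a lease request slip past a subsume, here is a small Go sketch. It is a toy model, not the kvserver concurrency manager: the whole range is guarded by a single mutex standing in for latches, and the `request` and `latchManager` types and the `tryRun` method are hypothetical names invented for this example.

```go
package main

import "sync"

// request is a hypothetical, simplified request descriptor.
type request struct {
	name          string
	ignoreLatches bool // models requests that are sequenced without latching
}

// latchManager is a toy stand-in for latching: one mutex covers the range.
type latchManager struct {
	mu sync.Mutex
}

// tryRun executes fn if the request can be sequenced right now and reports
// whether it ran. A request that ignores latches never waits on the mutex.
func (m *latchManager) tryRun(r request, fn func()) bool {
	if !r.ignoreLatches {
		if !m.mu.TryLock() {
			return false // would have to wait behind the current latch holder
		}
		defer m.mu.Unlock()
	}
	fn()
	return true
}
```

If the mutex is held (modelling the subsume holding its latches), a regular write is turned away, while a request with `ignoreLatches` set still runs and can mutate state the latch holder assumed was frozen.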

Here's another repro, for good measure:

2023-03-20 10:37:54.303828225 +0000 UTC updateStatsOnPut /Local/Range/Table/100/"0cb12e93699a7938"/RangeDescriptor
2023-03-20 10:37:54.332712827 +0000 UTC subsume r91:/Table/100/"{0cb12e93699a7938"-2a3131233c675613"} [(n3,s3):1, (n2,s2):2, (n4,s4):3, next=4, gen=10, sticky=9223372036.854775807,2147483647]
2023-03-20 10:37:54.339613925 +0000 UTC updateStatsForInline /Local/RangeID/91/r/RangeLease
2023-03-20 10:37:54.344374607 +0000 UTC updateStatsOnResolve /Local/Range/Table/100/"0cb12e93699a7938"/RangeDescriptor

In fact, it appears to have been the subsume request itself that fired off the lease extension, to extend an expiration lease in the last half of the lease interval (notice the "not acquiring latches"):

    36.034ms      0.070ms                        === operation:/cockroach.roachpb.Internal/Batch _verbose:1 node:3 span.kind:server request:Subsume [/Table/100/"0cb12e93699a7938",/Min)
    36.034ms      0.000ms                        [request range lease: {count: 1, duration 10ms, unfinished}]
    36.034ms      0.000ms                        [pendingLeaseRequest: requesting lease: {count: 1, duration 10ms, unfinished}]
    36.167ms      0.133ms                        event:server/node.go:1151 [n3] node received request: 1 Subsume
    36.230ms      0.063ms                        event:kv/kvserver/store_send.go:167 [n3,s3] executing Subsume [/Table/100/"0cb12e93699a7938",/Min)
    36.329ms      0.099ms                        event:kv/kvserver/replica_send.go:179 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] read-only path
    36.441ms      0.112ms                        event:kv/kvserver/concurrency/concurrency_manager.go:194 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] sequencing request
    36.452ms      0.011ms                        event:kv/kvserver/concurrency/concurrency_manager.go:275 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] acquiring latches
    36.594ms      0.142ms                        event:kv/kvserver/replica_range_lease.go:1441 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] extending lease repl=(n3,s3):1 seq=2 start=1679308671.328864005,0 exp=1679308677.328703166,0 pro=1679308671.328703166,0 at 1679308674.330435020,0
    36.609ms      0.014ms                            === operation:request range lease _unfinished:1 _verbose:1 node:3 store:3 range:91/1:/Table/100/"{0cb12e…-2a3131…}
    36.609ms      0.000ms                            [pendingLeaseRequest: requesting lease: {count: 1, duration 10ms, unfinished}]
    36.619ms      0.010ms                                === operation:pendingLeaseRequest: requesting lease _unfinished:1 _verbose:1 node:3 store:3 range:91/1:/Table/100/"{0cb12e…-2a3131…}
    45.424ms      8.805ms                                event:kv/kvserver/replica_send.go:183 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] read-write path
    45.453ms      0.029ms                                event:kv/kvserver/concurrency/concurrency_manager.go:194 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] sequencing request
    45.487ms      0.034ms                                event:kv/kvserver/concurrency/concurrency_manager.go:238 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] not acquiring latches
    45.505ms      0.018ms                                event:kv/kvserver/replica_write.go:165 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] applied timestamp cache
    45.541ms      0.036ms                                event:kv/kvserver/replica_write.go:400 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] executing read-write batch
    45.787ms      0.246ms                                event:kv/kvserver/replica_evaluate.go:550 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] evaluated RequestLease command header:<key:"\354\0220cb12e93699a7938\000\001" > lease:<start:<wall_time:1679308674330435020 > expiration:<wall_time:1679308680330435020 > replica:<node_id:3 store_id:3 replica_id:1 type:VOTER_FULL > proposed_ts:<wall_time:1679308674330435020 > > prev_l..., txn=<nil> : resp=header:<> , err=<nil>
    45.811ms      0.024ms                                event:kv/kvserver/replica_proposal.go:764 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] need consensus on write batch with op count=1
    45.836ms      0.025ms                                event:kv/kvserver/replica_raft.go:124 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] evaluated request
    45.853ms      0.017ms                                event:kv/kvserver/replica_raft.go:168 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] proposing command to write 0 new keys, 0 new values, 0 new intents, write batch size=107 bytes
    36.894ms      0.300ms                        event:kv/kvserver/replica_read.go:408 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] executing read-only batch
    46.736ms      0.633ms                        event:kv/kvserver/replica_evaluate.go:550 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] evaluated Subsume command header:<key:"\354\0220cb12e93699a7938\000\001" > left_desc:<range_id:60 start_key:"\303" end_key:"\354\0220cb12e93699a7938\000\001" internal_replicas:<node_id:3 store_id:3 replica_id:4 type:VOTER_FULL > internal_replicas:<node_id:2 store_id:2 replica_id..., txn=<nil> : resp=header:<> mvcc_stats:<contains_estimates:0 last_update_nanos:1679308673854507556 intent_age:0 gc_bytes_age:0 live_bytes:0 live_count:0 key_bytes:0 key_count:0 val_bytes:0 val_count:0 intent_bytes:0 intent_count:0 separated_intent_count:0 range_key_count..., err=<nil>
    46.775ms      0.039ms                        event:kv/kvserver/replica_read.go:221 [n3,s3,r91/1:/Table/100/"{0cb12e…-2a3131…}] read completed

I'm calling it. I'll submit a PR that computes the range ID-local stats delta during the subsume itself, so that it's consistent with RightMVCCStats.

@erikgrinaker
Contributor Author

I'll submit a PR that computes the range ID-local stats delta during the subsume itself, so that it's consistent with RightMVCCStats.

Submitted #99017, but I don't think it's sufficient to compute this during subsume evaluation, because that can still race with an MVCC stats update being applied.
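
To make the race concern concrete, here is a toy model in Go. It is not CockroachDB code: `rangeStats`, `subsumeSnapshot`, `applyStatsUpdate`, and `mergedSysBytes` are hypothetical names, and the 10-byte update is chosen only to match the size of the observed discrepancy. It sketches how a stats value captured during subsume evaluation could go stale if a stats update is applied to the right-hand range afterwards.

```go
package main

import "fmt"

// rangeStats is a hypothetical, heavily simplified stand-in for a range's
// MVCC stats; only the system-key byte count is modeled.
type rangeStats struct {
	SysBytes int64
}

// subsumeSnapshot models capturing the right-hand range's stats while the
// subsume request is evaluated. It returns a copy, so later changes to the
// live stats are not reflected in it.
func subsumeSnapshot(live *rangeStats) rangeStats {
	return *live
}

// applyStatsUpdate models a stats delta being applied to the live stats,
// possibly after the snapshot was taken.
func applyStatsUpdate(live *rangeStats, sysBytesDelta int64) {
	live.SysBytes += sysBytesDelta
}

// mergedSysBytes models the merge adding the snapshotted right-hand stats
// to the left-hand stats.
func mergedSysBytes(left rangeStats, rightSnapshot rangeStats) int64 {
	return left.SysBytes + rightSnapshot.SysBytes
}

func main() {
	left := rangeStats{SysBytes: 100}
	right := &rangeStats{SysBytes: 50}

	snap := subsumeSnapshot(right)
	applyStatsUpdate(right, 10) // assumed to land after the snapshot

	recorded := mergedSysBytes(left, snap)
	actual := left.SysBytes + right.SysBytes
	fmt.Printf("recorded=%d actual=%d discrepancy=%d\n", recorded, actual, actual-recorded)
}
```

In this model the recorded stats miss whatever is applied between the snapshot and the merge, which is the kind of window the comment above is worried about. Whether the real discrepancy arises this way is not established here.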

craig bot pushed a commit that referenced this issue Mar 21, 2023
98225: ui: show normalized CPU Usage metric on Node Map r=koorosh a=koorosh

Before, Node map (on Overview page) displayed current system and user CPU usage 
that didn't represent the same data as CPU Percent metric on Metrics page.
Now, Node Map displays the same metric to provide users consistent information.

Release note (admin ui, bug fix): show normalized CPU usage on Node Map.
Resolve: #87664

98985: storage: scan local keyspace in `TestMVCCHistories` r=erikgrinaker a=erikgrinaker

This patch processes the local keyspace along with the user keyspace in `TestMVCCHistories`, which is useful for MVCC stats tests of system keys (e.g. `SysBytes`). This was motivated by tracking down an observed discrepancy in `SysBytes`, but that hasn't borne fruit yet.

Touches #93896.

Epic: none
Release note: None

99050: ccl/sqlproxyccl/acl: fix TestParsingErrorHandling flake r=adityamaru,jaylim-crl a=pjtatlow

TestParsingErrorHandling asserts that the error count metric is updated
correctly when reading a file fails or succeeds, and the files are checked
at a regular interval. For the tests that interval is set to 100ms, and we waited
200ms to ensure the metric would be updated, but that seems to not be reliable.
This change increases the wait to 500ms which should ensure the file is
re-read before we check the value of the error metrics.

Fixes #98839

Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: PJ Tatlow <[email protected]>
@craig craig bot closed this as completed in 3c3d2a5 Mar 27, 2023
jbowens added a commit to jbowens/cockroach that referenced this issue May 31, 2023
Previously, the checks=true configuration of the clearrange roachtest would
fatal on stats divergences. Due to cockroachdb#93896, this test can fatal with a SysBytes
divergence. This is fixed on master, but the fix will not be backported to 23.1
or 22.2. Disable the enforcement of consistent stats on this branch.

Fixes cockroachdb#104078.
Informs cockroachdb#104011.
Epic: none
Release note: none
Release justification: non-production code changes
jbowens added a commit to jbowens/cockroach that referenced this issue May 31, 2023
Previously, the checks=true configuration of the clearrange roachtest would
fatal on stats divergences. Due to cockroachdb#93896, this test can fatal with a SysBytes
divergence. This is fixed on master, but the fix will not be backported to 23.1
or 22.2. Disable the enforcement of consistent stats on this branch.

Fixes cockroachdb#104011.
Epic: none
Release note: none
Release justification: non-production code changes
itsbilal added a commit to itsbilal/cockroach that referenced this issue Aug 15, 2023
Previously, the clearrange roachtest was the only place anywhere in
the CockroachDB codebase where we would assert on MVCC stats matching
between replicas. This would trip up and fail the clearrange roachtest
even in known cases of MVCC stats mismatches. This change removes the
code to assert on stats mismatches with consistency checks, but retains
the clearrange roachtest's use of aggressive consistency checks, so
mismatches in checksums computed on data in each replica will
continue to fatal the test.

Related to cockroachdb#93896.

Fixes cockroachdb#108726.

Epic: none

Release note: None
craig bot pushed a commit that referenced this issue Aug 16, 2023
108673: kvstreamer: minor cleanup and unification r=yuzefovich a=yuzefovich

This commit performs some minor cleanup to unify the code a bit between GetResp and ScanResp. The only change that is not a noop is the fact that we're now `nil`ing out `ResumeSpan` on GetResponses as well as on ScanResponses when SingleRowLookup hint is `false`. Originally, we were unsetting it for all ScanResponses but in 6343df3 this was lost making the behavior different based on the hint; GetResponses never had this `nil`ing out in the first place. The rationale for actually unsetting the ResumeSpan for both types of the responses is somewhat weak (not confusing the Streamer's user (which doesn't actually inspect the ResumeSpan field) as well as to allow for GC of the keys sooner), but it's better for this behavior to be unified.

Epic: None

Release note: None

108786: roachtest: allow stats mismatches in clearrange roachtest r=RaduBerinde a=itsbilal

Previously, the clearrange roachtest was the only place anywhere in the CockroachDB codebase where we would assert on MVCC stats matching between replicas. This would trip up and fail the clearrange roachtest even in known cases of MVCC stats mismatches. This change removes the code to assert on stats mismatches with consistency checks, but retains the clearrange roachtest's use of aggressive consistency checks, so mismatches in checksums computed on data in each replica will continue to fatal the test.

Related to #93896.

Fixes #108726.

Epic: none

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
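
The policy described in #108786 (tolerate stats mismatches, still fail on checksum mismatches) can be sketched in Go as follows. This is an illustrative model only: `replicaResult`, `classify`, and `shouldFailTest` are hypothetical names and do not correspond to the roachtest or consistency checker code.

```go
package main

import "fmt"

// replicaResult is a hypothetical summary of one replica's consistency
// check: a checksum over its data plus one stats field.
type replicaResult struct {
	Checksum string
	SysBytes int64
}

type verdict int

const (
	consistent verdict = iota
	statsMismatch
	checksumMismatch
)

// classify compares every replica against the first one. A checksum
// difference takes precedence over a stats-only difference.
func classify(results []replicaResult) verdict {
	if len(results) == 0 {
		return consistent
	}
	first := results[0]
	v := consistent
	for _, r := range results[1:] {
		if r.Checksum != first.Checksum {
			return checksumMismatch
		}
		if r.SysBytes != first.SysBytes {
			v = statsMismatch
		}
	}
	return v
}

// shouldFailTest encodes the policy: only data (checksum) divergence is
// fatal, stats divergence is tolerated.
func shouldFailTest(v verdict) bool {
	return v == checksumMismatch
}

func main() {
	statsOnly := []replicaResult{{"abc", 100}, {"abc", 110}}
	dataDiff := []replicaResult{{"abc", 100}, {"def", 100}}
	fmt.Println(shouldFailTest(classify(statsOnly)))
	fmt.Println(shouldFailTest(classify(dataDiff)))
}
```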
@arulajmani
Collaborator

Re-opening this given the PR description of #99017.

@arulajmani arulajmani reopened this Sep 26, 2024
arulajmani added a commit to arulajmani/cockroach that referenced this issue Sep 26, 2024
We started ignoring these in 398dfc7. However, that commit was
effectively reverted when we addressed cockroachdb#93896 in 2d855d3. However, like
2d855d3 notes in its description, this was best effort. As such, this
can still cause test flakes (e.g. cockroachdb#131187).

This patch effectively resurrects 398dfc7, and additionally
quietens `SysBytes:-10` failure modes.

Epic: none

Release note: None
@pav-kv pav-kv added the T-kv KV Team label Sep 27, 2024
craig bot pushed a commit that referenced this issue Sep 27, 2024
131446: kvnemesis: ignore SysBytes:{,-}10 MVCC stats discrepancy r=pav-kv a=arulajmani

We started ignoring these in 398dfc7. However, that commit was effectively reverted when we addressed #93896 in 2d855d3. However, like 2d855d3 notes in its description, this was best effort. As such, this can still cause test flakes (e.g. #131187).

This patch effectively resurrects 398dfc7, and additionally quietens `SysBytes:-10` failure modes.

Epic: none

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
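
As a rough illustration of what ignoring `SysBytes:{,-}10` could look like, here is a minimal Go sketch. It assumes a hypothetical `statsDelta` struct holding the difference between expected and actual stats; the real kvnemesis check and the real MVCC stats type are not shown in this thread and may look quite different.

```go
package main

import "fmt"

// statsDelta is a hypothetical, trimmed-down difference between two sets
// of MVCC stats. A zero value means the stats agree.
type statsDelta struct {
	LiveBytes int64
	KeyBytes  int64
	ValBytes  int64
	SysBytes  int64
}

// isKnownSysBytesDiscrepancy reports whether d consists solely of a
// SysBytes difference of +10 or -10, with every other field equal to zero.
func isKnownSysBytesDiscrepancy(d statsDelta) bool {
	return d == statsDelta{SysBytes: 10} || d == statsDelta{SysBytes: -10}
}

func main() {
	deltas := []statsDelta{
		{SysBytes: 10},
		{SysBytes: -10},
		{SysBytes: 20},
		{SysBytes: 10, KeyBytes: 1},
	}
	for _, d := range deltas {
		fmt.Printf("%+v ignorable=%t\n", d, isKnownSysBytesDiscrepancy(d))
	}
}
```

Comparing against exact struct values keeps the exception narrow: any other stats field differing, or a `SysBytes` difference of another size, is still reported.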
blathers-crl bot pushed a commit that referenced this issue Sep 27, 2024
We started ignoring these in 398dfc7. However, that commit was
effectively reverted when we addressed #93896 in 2d855d3. However, like
2d855d3 notes in its description, this was best effort. As such, this
can still cause test flakes (e.g. #131187).

This patch effectively resurrects 398dfc7, and additionally
quietens `SysBytes:-10` failure modes.

Epic: none

Release note: None
cthumuluru-crdb pushed a commit to cthumuluru-crdb/cockroach that referenced this issue Oct 1, 2024
We started ignoring these in 398dfc7. However, that commit was
effectively reverted when we addressed cockroachdb#93896 in 2d855d3. However, like
2d855d3 notes in its description, this was best effort. As such, this
can still cause test flakes (e.g. cockroachdb#131187).

This patch effectively resurrects 398dfc7, and additionally
quietens `SysBytes:-10` failure modes.

Epic: none

Release note: None