kv: possible batch mutation during commit [inconsistent batch count] #113268
Comments
I see a few possible explanations here:
@cockroachdb/kv – want to check if code auditing reveals some possibility of mutating a batch while it's being committed in this code path? Feel free to close out otherwise, and Storage will address the two filed issues.
Add a guardrail that ensures a batch isn't in the process of being committed when applying a mutation to the Batch. We've observed panics during batch application that suggest concurrent modification of the batch, and these guardrails will help surface any races more readily. Informs cockroachdb/cockroach#113268 Resolve cockroachdb#3024.
cc @cockroachdb/replication
@pavelkalinnikov Can you have a look at this when you get a chance?
@jbowens The batch construction and commit in this code path are sequential and isolated (not reused by other paths); I couldn't quickly find a way in which the batch could be modified while committing. Basically it's confined within this block: cockroach/pkg/kv/kvserver/apply/task.go Lines 274 to 292 in e23a2ea
@jbowens Does this error happen only at a single-batch scale? Can it happen that some batch somewhere else is modified while committing, but our batch happens to notice this?
Single batch scale, although batches are pooled, so it could conceivably be the result of a batch elsewhere receiving writes after it’s been closed and reused.
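The pooling hazard hypothesized above can be sketched as follows. This illustrates the hypothesis only and is not a diagnosis of this crash: `pooledBatch` and `batchPool` are hypothetical types, and a hand-written free list stands in for the real pool so that reuse is deterministic.

```go
package main

import "fmt"

// pooledBatch is a hypothetical batch type, not Pebble's Batch.
type pooledBatch struct {
	count   int      // number of writes recorded in the batch
	entries []string // keys written to the batch
}

// set records a write in the batch.
func (b *pooledBatch) set(key string) {
	b.entries = append(b.entries, key)
	b.count++
}

// batchPool is a hypothetical LIFO free list of batches.
type batchPool struct{ free []*pooledBatch }

// get returns a recycled batch if one is available, else a new one.
func (p *batchPool) get() *pooledBatch {
	if n := len(p.free); n > 0 {
		b := p.free[n-1]
		p.free = p.free[:n-1]
		return b
	}
	return &pooledBatch{}
}

// put resets the batch and makes it available for reuse.
func (p *batchPool) put(b *pooledBatch) {
	b.count = 0
	b.entries = b.entries[:0]
	p.free = append(p.free, b)
}

func main() {
	pool := &batchPool{}

	stale := pool.get()
	stale.set("x")
	pool.put(stale) // batch is closed, but the caller keeps its pointer (the bug)

	owner := pool.get() // the same object is handed to a new owner
	owner.set("a")
	owner.set("b")

	stale.set("late") // write after close lands in the new owner's batch

	// The new owner performed 2 writes but now observes 3.
	fmt.Println(owner == stale, owner.count, owner.entries)
}
```

In this sequential sketch the new owner merely sees an unexpected extra write. If the stale write instead raced with the owner's commit, the batch's recorded count and its contents could disagree, which is the kind of inconsistency the issue title describes.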
This issue was auto filed by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.
Sentry Link: https://cockroach-labs.sentry.io/issues/4583696564/?referrer=webhooks_plugin
Panic Message:
Stacktrace (expand for inline code snippets):
GOROOT/src/runtime/asm_amd64.s#L1593-L1595
cockroach/pkg/util/stop/stopper.go
Lines 469 to 471 in 810d4f2
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go#L301-L303
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go#L394-L396
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go#L645-L647
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go#L717-L719
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go#L1004-L1006
cockroach/pkg/kv/kvserver/apply/task.go
Lines 250 to 252 in 810d4f2
cockroach/pkg/kv/kvserver/apply/task.go
Lines 289 to 291 in 810d4f2
https://github.com/cockroachdb/cockroach/blob/810d4f27a7f02b9cc2750cab654ed1c62ac3e75a/pkg/kv/kvserver/pkg/kv/kvserver/replica_app_batch.go#L561-L563
cockroach/pkg/storage/pebble_batch.go
Lines 548 to 550 in 810d4f2
github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/batch.go#L1108-L1110
github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go#L745-L747
github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go#L810-L812
Tags
Jira issue: CRDB-32838