Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/concurrency: race around updating txnCache #105244

Closed
yuzefovich opened this issue Jun 21, 2023 · 0 comments · Fixed by #105476
Closed

kv/concurrency: race around updating txnCache #105244

yuzefovich opened this issue Jun 21, 2023 · 0 comments · Fixed by #105476
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Jun 21, 2023

Seen on an unrelated PR here:

=== RUN   TestCancelMultipleQueued
    test_log_scope.go:167: test logs captured to: /artifacts/tmp/_tmp/b287de7194e17a0386ee1775f0d198ba/logTestCancelMultipleQueued2272501544
    test_log_scope.go:81: use -show-logs to present logs inline
==================
WARNING: DATA RACE
Write at 0x00c006b07dc4 by goroutine 1886348:
  github.com/cockroachdb/cockroach/pkg/util/hlc.(*Timestamp).Forward()
      github.com/cockroachdb/cockroach/pkg/util/hlc/pkg/util/hlc/timestamp.go:342 +0x1ed7
  github.com/cockroachdb/cockroach/pkg/roachpb.(*Transaction).Update()
      github.com/cockroachdb/cockroach/pkg/roachpb/pkg/roachpb/data.go:1301 +0x1d83
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*txnCache).add()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go:960 +0x48f
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*txnStatusCache).add()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go:927 +0x6a
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*lockTableImpl).PushedTransactionUpdated()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table.go:3160 +0x3e
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*lockTableWaiterImpl).pushLockTxn()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go:533 +0x836
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*lockTableWaiterImpl).WaitOn.func3()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go:380 +0x564
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*lockTableWaiterImpl).WaitOn()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go:432 +0x62f
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*managerImpl).sequenceReqWithGuard()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/concurrency_manager.go:345 +0xfa1
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*managerImpl).SequenceReq()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/concurrency_manager.go:239 +0x328
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).executeBatchWithConcurrencyRetries()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_send.go:466 +0x476
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).SendWithWriteBytes()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_send.go:183 +0x524
  github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).SendWithWriteBytes()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_send.go:193 +0xa71
...

Previous read at 0x00c006b07dc0 by goroutine 1886303:
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait.createPushTxnResponse()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait/queue.go:146 +0x223b
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait.(*Queue).waitForPush()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait/queue.go:638 +0x22be
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait.(*Queue).MaybeWaitForPush.func2()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait/queue.go:547 +0x90
  runtime/pprof.Do()
      GOROOT/src/runtime/pprof/runtime.go:40 +0x122
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait.(*Queue).MaybeWaitForPush()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/txnwait/queue.go:546 +0x14a4
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*managerImpl).maybeInterceptReq()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/concurrency_manager.go:364 +0xe7
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*managerImpl).sequenceReqWithGuard()
      github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/concurrency_manager.go:267 +0x20f
...

It seems that 23ee0a9 is to blame.

Jira issue: CRDB-28936

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 21, 2023
@blathers-crl blathers-crl bot added the T-kv KV Team label Jun 21, 2023
craig bot pushed a commit that referenced this issue Jun 26, 2023
105391: colinfo: add version gate for storing pg_lsn types r=rafiss a=otan

We don't want mixed version clusters storing pg_lsn types, in case they need to rollback / older versions do not understand the type.

Informs #105130

Release note: None

105444: systemschema: stop running PostDeserializationChanges when building tables r=Xiang-Gu a=rafiss

### systemschema: add JobInfo to MakeSystemTables helper

This table was missing from the list, and the expected count of tables
was off since we weren't accounting for non-system tenant tables.

The function is only used for testing, so there was no impact.

---

### systemschema: stop running PostDeserializationChanges when building tables

We would like to remove old PostDeserializationChanges that are no
longer needed. In order to do so, we need to stop relying on them to
build system tables.

Instead, now we adjust the hard-coded system table descriptors and the
related helpers so that they create valid descriptors. This needed two
changes:
- Update the ConstraintID for check constraints.
- Update the primary index encoding so that it includes stored columns.

---

Epic: None
Release note: None

105476: kv: fix data race when updating pending txn in txnStatusCache r=arulajmani a=nvanbenschoten

Fixes #105244.

This commit avoids a data race by treating *roachpb.Transaction objects as immutable, and simply choosing the right transaction to keep in the cache when there is a choice to be made.

The behavior of this logic is tested by `TestTxnCacheUpdatesTxn`.

Release note: None

105480: kv: fix data race during retry of EndTxn after refresh r=arulajmani a=nvanbenschoten

Fixes #103687.
Fixes #103247.
Fixes #104791.

This commit avoids a data race between `splitEndTxnAndRetrySend` and `raceTransport` by avoiding a mutation of a shared `RequestUnion_EndTxn` object within an unshared `RequestUnion` object. The `raceTransport` makes an effort to copy the `BatchRequest`'s `RequestUnion` slice, but it does not copy the inner interface, so we can't play tricks to avoid a reallocation of the `RequestUnion_EndTxn`.

The commit also addresses a similar problem in
`retryTxnCommitAfterFailedParallelCommit`.

We may be able to fix this in the `raceTransport`, but doing so would require some reflection magic and this is currently failing CI, so we make the easier change.

Release note: None

105515: pgwire: fix race in TestConn r=knz a=rafiss

fixes #105410

A recent refactor introduced this race, since the context is used by two testing goroutines.

Release note: None

Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 6017bce Jun 26, 2023
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 13, 2023
Fixes cockroachdb#105244.

This commit avoids a data race by treating *roachpb.Transaction objects
as immutable, and simply choosing the right transaction to keep in the
cache when there is a choice to be made.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 13, 2023
Fixes cockroachdb#105244.

This commit avoids a data race by treating *roachpb.Transaction objects
as immutable, and simply choosing the right transaction to keep in the
cache when there is a choice to be made.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants