Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/sql/sqlstats/insights: data race when running TestConsumeJoinToken #85988

Closed
renatolabs opened this issue Aug 11, 2022 · 4 comments
Closed
Assignees
Labels
A-sql-observability Related to observability of the SQL layer C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@renatolabs
Copy link
Contributor

renatolabs commented Aug 11, 2022

#85350 introduced a data race observed when running the TestConsumeJoinToken test. The test has been failing pretty consistently since then (see TC history).

Stack:

==================
WARNING: DATA RACE
Read at 0x00c001648f68 by goroutine 451:
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights.(*concurrentBufferIngester).ObserveStatement()
      <autogenerated>:1 +0x48
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/ssmemstorage.(*Container).RecordStatement()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/ssmemstorage/ss_mem_writer.go:178 +0x1608
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats.(*ApplicationStats).RecordStatement.func1()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats/pkg/sql/sqlstats/persistedsqlstats/appStats.go:41 +0x100
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats.(*ApplicationStats).recordStatsOrSendMemoryPressureSignal()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats/pkg/sql/sqlstats/persistedsqlstats/appStats.go:65 +0x34
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats.(*ApplicationStats).RecordStatement()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/persistedsqlstats/pkg/sql/sqlstats/persistedsqlstats/appStats.go:40 +0xa0
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/sslocal.(*StatsCollector).RecordStatement()
      <autogenerated>:1 +0xd8
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).recordStatementSummary()
      github.com/cockroachdb/cockroach/pkg/sql/executor_statement_metrics.go:202 +0x9fc
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1210 +0x1264
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:699 +0x2060
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:148 +0x6b0
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd.func1()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1943 +0x518
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1947 +0xe0c
  github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1869 +0x2d8
  github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx.func1()
      github.com/cockroachdb/cockroach/pkg/sql/internal.go:212 +0xe4

Previous write at 0x00c001648f68 by goroutine 25:
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights.newConcurrentBufferIngester.func2()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights/pkg/sql/sqlstats/insights/ingester.go:161 +0xbc
  github.com/cockroachdb/cockroach/pkg/sql/contention/contentionutils.(*ConcurrentBufferGuard).syncLocked()
      github.com/cockroachdb/cockroach/pkg/sql/contention/contentionutils/concurrent_buffer_guard.go:135 +0x58
  github.com/cockroachdb/cockroach/pkg/sql/contention/contentionutils.(*ConcurrentBufferGuard).ForceSync()
      github.com/cockroachdb/cockroach/pkg/sql/contention/contentionutils/concurrent_buffer_guard.go:120 +0x68
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights.concurrentBufferIngester.Start.func2()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights/pkg/sql/sqlstats/insights/ingester.go:80 +0x58
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:489 +0x150

Goroutine 451 (running) created at:
  github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx()
      github.com/cockroachdb/cockroach/pkg/sql/internal.go:211 +0x838
  github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).execInternal()
      github.com/cockroachdb/cockroach/pkg/sql/internal.go:743 +0x8a0
  github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).ExecEx()
      github.com/cockroachdb/cockroach/pkg/sql/internal.go:540 +0x1e8
  github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).Exec()
      github.com/cockroachdb/cockroach/pkg/sql/internal.go:521 +0x10c
  github.com/cockroachdb/cockroach/pkg/sql/catalog/lease.storage.acquire.func1()
      github.com/cockroachdb/cockroach/pkg/sql/catalog/lease/storage.go:157 +0x87c
  github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1()
      github.com/cockroachdb/cockroach/pkg/kv/db.go:963 +0x50
  github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec()
      github.com/cockroachdb/cockroach/pkg/kv/txn.go:960 +0x74
  github.com/cockroachdb/cockroach/pkg/kv.runTxn()
      github.com/cockroachdb/cockroach/pkg/kv/db.go:962 +0x5c
  github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl()
      github.com/cockroachdb/cockroach/pkg/kv/db.go:925 +0xc0
  github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn()
      github.com/cockroachdb/cockroach/pkg/kv/db.go:904 +0x1ec
  github.com/cockroachdb/cockroach/pkg/sql/catalog/lease.storage.acquire()
      github.com/cockroachdb/cockroach/pkg/sql/catalog/lease/storage.go:170 +0x1d4
  github.com/cockroachdb/cockroach/pkg/sql/catalog/lease.acquireNodeLease.func1()
      github.com/cockroachdb/cockroach/pkg/sql/catalog/lease/lease.go:492 +0x258
  github.com/cockroachdb/cockroach/pkg/util/syncutil/singleflight.(*Group).doCall()
      github.com/cockroachdb/cockroach/pkg/util/syncutil/singleflight/singleflight.go:128 +0x40
  github.com/cockroachdb/cockroach/pkg/util/syncutil/singleflight.(*Group).DoChan.func1()
      github.com/cockroachdb/cockroach/pkg/util/syncutil/singleflight/singleflight.go:121 +0x68

Goroutine 25 (running) created at:
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:480 +0x488
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:351 +0x230
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights.concurrentBufferIngester.Start()
      github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights/pkg/sql/sqlstats/insights/ingester.go:74 +0x13c
  github.com/cockroachdb/cockroach/pkg/sql/sqlstats/insights.(*concurrentBufferIngester).Start()
      <autogenerated>:1 +0x88
  github.com/cockroachdb/cockroach/pkg/sql.(*Server).Start()
      github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:528 +0x150
  github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*Server).Start()
      github.com/cockroachdb/cockroach/pkg/sql/pgwire/server.go:412 +0x238
  github.com/cockroachdb/cockroach/pkg/server.(*SQLServer).preStart()
      github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1238 +0x1fc
  github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart()
      github.com/cockroachdb/cockroach/pkg/server/server.go:1565 +0x3404
  github.com/cockroachdb/cockroach/pkg/server.(*Server).Start()
      github.com/cockroachdb/cockroach/pkg/server/server.go:964 +0x3c
  github.com/cockroachdb/cockroach/pkg/server.(*TestServer).Start()
      github.com/cockroachdb/cockroach/pkg/server/testserver.go:577 +0x54
  github.com/cockroachdb/cockroach/pkg/testutils/serverutils.StartServer()
      github.com/cockroachdb/cockroach/pkg/testutils/serverutils/test_server_shim.go:328 +0x154
  github.com/cockroachdb/cockroach/pkg/server.TestConsumeJoinToken()
      github.com/cockroachdb/cockroach/pkg/server/addjoin_test.go:38 +0x1dc
  testing.tRunner()
      GOROOT/src/testing/testing.go:1439 +0x18c
  testing.(*T).Run.func1()
      GOROOT/src/testing/testing.go:1486 +0x44

Jira issue: CRDB-18514

@renatolabs renatolabs added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 11, 2022
@matthewtodd matthewtodd added A-sql-observability Related to observability of the SQL layer T-sql-observability labels Aug 11, 2022
@matthewtodd
Copy link
Contributor

This may be related to #83080.

renatolabs added a commit to renatolabs/cockroach that referenced this issue Aug 11, 2022
@ajwerner
Copy link
Contributor

I think the bug here is that the methods on concurrentBufferIngester are by value and the fullHandler thus is not modifying the data structure that is in use in the method invocation -- so the flush is not made visible to the ops.

@ajwerner
Copy link
Contributor

Give this a shot:

--- a/pkg/sql/sqlstats/insights/ingester.go
+++ b/pkg/sql/sqlstats/insights/ingester.go
@@ -55,7 +55,7 @@ type event struct {
        statement   *Statement
 }
 
-func (i concurrentBufferIngester) Start(ctx context.Context, stopper *stop.Stopper) {
+func (i *concurrentBufferIngester) Start(ctx context.Context, stopper *stop.Stopper) {
        // This task pulls buffers from the channel and forwards them along to the
        // underlying Registry.
        _ = stopper.RunAsyncTask(ctx, "insights-ingester", func(ctx context.Context) {
@@ -86,7 +86,7 @@ func (i concurrentBufferIngester) Start(ctx context.Context, stopper *stop.Stopp
        })
 }
 
-func (i concurrentBufferIngester) ingest(events *eventBuffer) {
+func (i *concurrentBufferIngester) ingest(events *eventBuffer) {
        for _, e := range events {
                // Because an eventBuffer is a fixed-size array, rather than a slice,
                // we do not know how full it is until we hit a nil entry.
@@ -101,7 +101,7 @@ func (i concurrentBufferIngester) ingest(events *eventBuffer) {
        }
 }
 
-func (i concurrentBufferIngester) ObserveStatement(
+func (i *concurrentBufferIngester) ObserveStatement(
        sessionID clusterunique.ID, statement *Statement,
 ) {
        if !i.enabled() {
@@ -115,7 +115,7 @@ func (i concurrentBufferIngester) ObserveStatement(
        })
 }
 
-func (i concurrentBufferIngester) ObserveTransaction(
+func (i *concurrentBufferIngester) ObserveTransaction(
        sessionID clusterunique.ID, transaction *Transaction,
 ) {
        if !i.enabled() {
@@ -129,13 +129,13 @@ func (i concurrentBufferIngester) ObserveTransaction(
        })
 }
 
-func (i concurrentBufferIngester) IterateInsights(
+func (i *concurrentBufferIngester) IterateInsights(
        ctx context.Context, visitor func(context.Context, *Insight),
 ) {
        i.delegate.IterateInsights(ctx, visitor)
 }
 
-func (i concurrentBufferIngester) enabled() bool {
+func (i *concurrentBufferIngester) enabled() bool {
        return i.delegate.enabled()
 }

craig bot pushed a commit that referenced this issue Aug 16, 2022
86090: CODEOWNERS: stop dev-inf PR review requests for gen file list changes r=rickystewart a=jlinder

Changes to most of the generated pkg/gen/*.bzl files happen for many PRs
and they don't need to be reviewed by dev inf. Let's stop getting PR
review requests for them.

Release note: None
Release justification: non-production code change

86146: insights: call ingester methods by reference r=matthewtodd a=matthewtodd

Fixes the data race observed in #85988. Passing this data structure by
value meant that the `fullHandler` was working with a different object
than the external callers.

Release justification: Category 3: Fixes for high-priority or
high-severity bugs in existing functionality

Release note: None

86191: sql/schemachanger/sctest: run the Backup test over all stages r=ajwerner a=ajwerner

Before this change, we'd only run them over the non-revertible stages.

Release justification: testing only change

Release note: None

Co-authored-by: James H. Linder <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@matthewtodd
Copy link
Contributor

This is fixed by #86146.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-observability Related to observability of the SQL layer C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

No branches or pull requests

3 participants