Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: deadlock on transaction stmt context cancel #40985

Closed
tz70s opened this issue Aug 23, 2020 · 10 comments
Closed

database/sql: deadlock on transaction stmt context cancel #40985

tz70s opened this issue Aug 23, 2020 · 10 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@tz70s
Copy link
Contributor

tz70s commented Aug 23, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.7 darwin/amd64

Does this issue reproduce with the latest release?

Not reproducible before Go 1.14.6.

I believe this issue is introduced by this CL: https://go-review.googlesource.com/c/go/+/216240/
And this issue: #39101

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/tzuchiao.yeh/Library/Caches/go-build"
GOENV="/Users/tzuchiao.yeh/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOOS="darwin"
GOPATH="/Users/tzuchiao.yeh/Workspace/golang/golang"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/tzuchiao.yeh/.gvm/gos/go1.14.7"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/tzuchiao.yeh/.gvm/gos/go1.14.7/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/tzuchiao.yeh/Workspace/golang/sqlctx/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build298045727=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Please try this test case:

func TestSqlStmtDeadlock(t *testing.T) {
	db, err := sql.Open("mysql", "root@tcp(127.0.0.1:3306)/test_db")
	if err != nil {
		t.Fatalf("unable to open database, err: %v", err)
	}
	defer db.Close()

	ctx, cancel := context.WithCancel(context.Background())
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		t.Fatalf("unable to begin a transaction, err: %v", err)
	}

	stmt, err := tx.Prepare("INSERT INTO test_tab (name) VALUES (?)")
	if err != nil {
		t.Fatalf("unable to prepare a stmt, err: %v", err)
	}

	go func() {
		time.Sleep(10 * time.Millisecond)
		log.Print("trigger context cancel")
		cancel()
	}()

	for i := 0; i < 100; i++ {
		_, err = stmt.Exec("Peter")
		if err != nil {
			t.Errorf("failed to execute transaction, err: %v", err)
		}
	}
	tx.Rollback()
}

What did you expect to see?

Should show up failed to execute transactions.

What did you see instead?

Process hang (deadlock).

Some analysis:

In CL https://go-review.googlesource.com/c/go/+/216240/ the tx.closemu.Lock is raised up before tc.closePrepared,
which internally call statements closed, requires stmt.closemu.Lock.

This path is triggered by context cancel.

On the other-hand, there's still an on-fly statement execution:

stmt.Exec calls stmt.closemu.Rlock until whole execution finished, internally it calls tx.grabConn and then acquire tx.closemu.Rlock.

Consider the case:

stmt.Exec acquires the stmt.closemu.Rlock first, but blocked by tx.closemu.Lock due to transaction rollback, and for transaction rollback, after it acquires the tx.closemu, it still can't make any progress due to stmt.closemu is held by stmt.Exec.

Hence cause the deadlock.

@tz70s tz70s changed the title database/sql: deadlock on stmt and context cancel database/sql: deadlock on transaction stmt context cancel Aug 23, 2020
@ianlancetaylor
Copy link
Member

CC @kardianos

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 23, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Aug 23, 2020
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/250178 mentions this issue: database/sql: fix tx stmt deadlock when rollback

@tz70s
Copy link
Contributor Author

tz70s commented Aug 24, 2020

Just had one possible solution by temporarily unlock stmt.closemu to break deadlock, submitted the patch.

@kardianos
Copy link
Contributor

Thank you for the reproduction. While I haven't looked too closely yet, the current CL may not be workable because it releases and re-acquires a lock. I greatly appreciate the test case that can reproduce this.

@tz70s
Copy link
Contributor Author

tz70s commented Aug 27, 2020

Thank you for the reply. Maybe a better way is refactoring to reduce two locks around tx and stmt, though I still think re-check stmt.close after re-acquire lock is sufficient to eliminate race condition on stmt.Exec/Query (R-lock) and stmt.Close (W-lock)? Seems I can add more test cases for this behavior.

@tz70s
Copy link
Contributor Author

tz70s commented Oct 28, 2020

Hi I'm wondering if the fix is possible to backport to 1.14?

@pjm0616
Copy link

pjm0616 commented Nov 30, 2020

@gopherbot @odeke-em Please consider this for backport to 1.15 and 1.14.
This regression, introduced in 1.14.6 and 1.15(by #34775), has a serious impact(deadlock) and no workarounds, other than patching Go manually.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #42883 (for 1.14), #42884 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Dec 30, 2020
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/284512 mentions this issue: [release-branch.go1.14] database/sql: fix tx stmt deadlock when rollback

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/284513 mentions this issue: [release-branch.go1.15] database/sql: fix tx stmt deadlock when rollback

gopherbot pushed a commit that referenced this issue Mar 30, 2021
Tx acquires tx.closemu W-lock and then acquires stmt.closemu.W-lock
to fully close the transaction and associated prepared statement.
Stmt query and execution run in reverse ways - acquires
stmt.closemu.R-lock and then acquires tx.closemu.R-lock to grab tx
connection, which may cause deadlock.

Prevent the lock is held around tx.closePrepared to ensure no
deadlock happens.

Includes a test fix from CL 266097.
Fixes #42884
Updates #40985
Updates #42259

Change-Id: Id52737660ada3cebdfff6efc23366cdc3224b8e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/250178
Run-TryBot: Emmanuel Odeke <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Daniel Theophanes <[email protected]>
Trust: Emmanuel Odeke <[email protected]>
(cherry picked from commit d4c1ad8)
Reviewed-on: https://go-review.googlesource.com/c/go/+/284513
Reviewed-by: Dmitri Shuralyov <[email protected]>
@golang golang locked and limited conversation to collaborators Jan 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants