roachtest: schemachange/index/tpcc/w=1000 failed #68958
Labels
C-test-failure
Broken test (automatically or manually discovered).
O-roachtest
O-robot
Originated from a bot.
release-blocker
Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Comments
cockroach-teamcity added the branch-release-21.1, C-test-failure, O-roachtest, O-robot, and release-blocker labels on Aug 14, 2021
Seems like the same deadlock as #68951.
craig bot pushed a commit that referenced this issue on Aug 19, 2021
69040: sql: fix deadlock when updating backfill progress r=ajwerner a=ajwerner

The root cause here is that we acquired the mutex inside the transaction which also laid down intents. This was not a problem in earlier iterations of this code because of the FOR UPDATE logic which would, generally, in theory, order the transactions such that the first one to acquire the mutex would be the first to lay down an intent, thus avoiding the deadlock by ordering the acquisitions. That was changed in #68244, which removed the FOR UPDATE.

What we see now is that you have a transaction doing the progress update which hits a restart but has laid down an intent. Then we have a transaction which is doing a details update that starts and acquires the mutex but blocks on the intent of the other transaction. That other transaction is now blocked on the mutex and we have a deadlock.

The solution here is to not acquire the mutex inside these transactions. Instead, the code copies out the relevant state prior to issuing the transaction. The cost here should be pretty minimal and the staleness in the face of retries is the least of my concerns.

No release note because the code in #68244 has never been released.

Touches #68951, #68958.

Release note: None

Co-authored-by: Andrew Werner <[email protected]>
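The pattern described in the commit message (copy the state out under the mutex, then run the transaction without holding it) can be sketched roughly as follows. This is a minimal illustration, not the CockroachDB code: `progressTracker`, `runTxn`, `persistProgress`, `errRetry`, and the map standing in for the store are all hypothetical names, and the retry on the first attempt is simulated.

```go
package main

import (
	"errors"
	"sync"
)

// progressTracker is a hypothetical stand-in for in-memory backfill state
// that is guarded by a mutex.
type progressTracker struct {
	mu        sync.Mutex
	completed []string // spans finished so far
}

// addCompleted records a finished span under the mutex.
func (t *progressTracker) addCompleted(span string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.completed = append(t.completed, span)
}

// snapshot copies the state out under the mutex and releases the mutex
// before returning, so callers never hold it across a transaction.
func (t *progressTracker) snapshot() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	return append([]string(nil), t.completed...)
}

// errRetry is a hypothetical marker for a retryable transaction error.
var errRetry = errors.New("retry")

// runTxn is a hypothetical retry loop: it re-invokes fn until fn returns
// something other than errRetry.
func runTxn(fn func(attempt int) error) error {
	for attempt := 1; ; attempt++ {
		if err := fn(attempt); !errors.Is(err, errRetry) {
			return err
		}
	}
}

// persistProgress takes the snapshot BEFORE starting the transaction.
// The closure touches only the copy, so a restarted attempt never waits
// on the mutex while its earlier write is still outstanding.
func persistProgress(t *progressTracker, store map[string][]string) (int, error) {
	snap := t.snapshot() // mutex acquired and released here
	attempts := 0
	err := runTxn(func(attempt int) error {
		attempts = attempt
		if attempt == 1 {
			return errRetry // simulated restart on the first attempt
		}
		store["progress"] = snap // may be stale if state changed meanwhile
		return nil
	})
	return attempts, err
}
```

Calling `persistProgress` from a `main` function after a few `addCompleted` calls writes the copied spans on the second attempt. The trade-off is the one the commit message accepts: a span recorded after the snapshot is not included in that write.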
roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on release-21.1 @ c425111e138297bcacd1370cbe40263dd00e64ac:
Reproduce
To reproduce, try:
# From https://go.crdb.dev/p/roachstress, perhaps edited lightly.
caffeinate ./roachstress.sh schemachange/index/tpcc/w=1000
Same failure on other branches
Should be fixed by #69130.
healthy-pod added the T-sql-foundations label and removed the T-sql-schema-deprecated label on May 17, 2023
roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on release-21.1 @ 22dad757f6f5ba0d0a10ce3ccdf9712e54cf1a56:
Reproduce
To reproduce, try:
# From https://go.crdb.dev/p/roachstress, perhaps edited lightly.
caffeinate ./roachstress.sh schemachange/index/tpcc/w=1000
Same failure on other branches
This test on roachdash | Improve this report!