Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl operation triggers the implicit commit,ddl operation fails occasionally #24591

Closed
vivid392845427 opened this issue May 12, 2021 · 2 comments · Fixed by #24601
Closed

ddl operation triggers the implicit commit,ddl operation fails occasionally #24591

vivid392845427 opened this issue May 12, 2021 · 2 comments · Fixed by #24601
Labels
severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

@vivid392845427
Copy link

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

1、do some transaction operation

mysql> create table t(a int, b varchar(20), primary key(a,b(3)));
ERROR 1105 (HY000): tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))
mysql> create table t(a int, b varchar(20), primary key(a,b(3)));
Query OK, 0 rows affected (0.12 sec)

2. What did you expect to see? (Required)

create table sucessful

3. What did you see instead (Required)

create table fail

mysql> create table t(a int, b varchar(20), primary key(a,b(3)));
ERROR 1105 (HY000): tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))

log:
[2021/05/12 08:46:29.137 +00:00] [INFO] [gc_worker.go:1531] ["[gc worker] sent safe point to PD"] [uuid=5e56bb02384000c] ["safe point"=424884930163769345]
[2021/05/12 08:47:49.771 +00:00] [INFO] [client_batch.go:791] ["recycle idle connection"] [target=172.16.4.202:10162]
[2021/05/12 08:47:49.772 +00:00] [INFO] [client_batch.go:791] ["recycle idle connection"] [target=172.16.4.202:10161]
[2021/05/12 08:54:41.146 +00:00] [INFO] [session.go:2764] ["CRUCIAL OPERATION"] [conn=2933] [schemaVersion=6930] [cur_db=zy] [sql="create table t(a int, b varchar(20), primary key(a,b(3)))"] [[email protected]]
[2021/05/12 08:54:41.160 +00:00] [WARN] [snapshot.go:641] ["2PC failed"] [error="tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))"]
[2021/05/12 08:54:41.161 +00:00] [WARN] [session.go:549] ["can not retry txn"] [conn=2933] [label=general] [error="tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))"] [IsBatchInsert=false] [IsPessimistic=true] [InRestrictedSQL=false] [tidb_retry_limit=10] [tidb_disable_txn_auto_retry=true]
[2021/05/12 08:54:41.161 +00:00] [WARN] [session.go:564] ["commit failed"] [conn=2933] ["finished txn"="Txn{state=invalid}"] [error="tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))"]
[2021/05/12 08:54:41.161 +00:00] [INFO] [tidb.go:242] ["rollbackTxn for ddl/autocommit failed"]
[2021/05/12 08:54:41.161 +00:00] [WARN] [session.go:1429] ["run statement failed"] [conn=2933] [schemaVersion=6930] [error="tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound { start_ts: TimeStamp(424884930163769345), key: [116, 128, 0, 0, 0, 0, 0, 27, 42, 95, 105, 128, 0, 0, 0, 0, 0, 0, 1, 3, 128, 0, 0, 0, 0, 0, 0, 1, 1, 103, 114, 101, 97, 116, 32, 109, 101, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247] }))"] [session="{\n "currDBName": "zy",\n "id": 2933,\n "status": 2,\n "strictMode": true,\n "user": {\n "Username": "root",\n "Hostname": "127.0.0.1",\n "CurrentUser": false,\n "AuthUsername": "root",\n "AuthHostname": "%"\n }\n}"]

4. What is your TiDB version? (Required)

Release Version: v5.0.1
Edition: Community
Git Commit Hash: 1145e34
Git Branch: heads/refs/tags/v5.0.1
UTC Build Time: 2021-04-23 05:51:17
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

@vivid392845427 vivid392845427 added the type/bug The issue is confirmed as a bug. label May 12, 2021
@cfzjywxk
Copy link
Contributor

cfzjywxk commented May 12, 2021

The PessimisticLockNotFound error is unexpected, the process is:
424884930163769345 is pessimistic transaction which is started at 2021-05-12 16:16:21.301 +0800 CST, The max-txn-ttl is default value 1 hour.

[2021/05/12 08:34:42.128 +00:00] [INFO] [2pc.go:763] ["send TxnHeartBeat"] [startTS=424884930163769345] [newTTL=1120800]
[2021/05/12 08:34:48.937 +00:00] [INFO] [gc_worker.go:411] ["[gc worker] gc safepoint blocked by a running session"] [uuid=5e56bb02384000c] [globalMinStartTS=424884930163769345] [safePoint=424885063228325888]
[2021/05/12 08:34:48.970 +00:00] [INFO] [gc_worker.go:312] ["[gc worker] starts the whole job"] [uuid=5e56bb02384000c] [safePoint=424884930163769345] [concurrency=3]
[2021/05/12 08:34:48.974 +00:00] [INFO] [gc_worker.go:1007] ["[gc worker] start resolve locks"] [uuid=5e56bb02384000c] [safePoint=424884930163769345] [concurrency=3]
[2021/05/12 08:34:48.975 +00:00] [INFO] [range_task.go:108] ["range task started"] [name=resolve-locks-runner] [startKey=] [endKey=] [concurrency=3]
[2021/05/12 08:34:49.042 +00:00] [INFO] [lock_resolver.go:274] ["BatchResolveLocks: lookup txn status"] ["cost time"=2.993593ms] ["num of txn"=1]
[2021/05/12 08:34:49.053 +00:00] [INFO] [lock_resolver.go:314] ["BatchResolveLocks: resolve locks in a batch"] ["cost time"=11.341786ms] ["num of locks"=36]
[2021/05/12 08:34:49.172 +00:00] [INFO] [range_task.go:214] ["range task finished"] [name=resolve-locks-runner] [startKey=] [endKey=] ["cost time"=193.846204ms] ["completed regions"=158]
[2021/05/12 08:34:49.172 +00:00] [INFO] [gc_worker.go:1028] ["[gc worker] finish resolve locks"] [uuid=5e56bb02384000c] [safePoint=424884930163769345] [regions=158]
[2021/05/12 08:34:52.128 +00:00] [INFO] [2pc.go:763] ["send TxnHeartBeat"] [startTS=424884930163769345] [newTTL=1130800]
[2021/05/12 08:34:52.130 +00:00] [WARN] [2pc.go:769] ["send TxnHeartBeat failed"] [error="txn 424884930163769345 heartbeat fail, primary key = 748000000000001b2a5f698000000000000001038000000000000021016372617a7920616cff0000000000000000f7, err = txn 424884930163769345 not found"] [txnStartTS=424884930163769345]

From the log, the heartbeat at 2021/05/12 08:34:52.130 +00:00 as the primary key lock is not found. 424884930163769345 is used as the safepoint for gc, the lock resolving phase during this gc round, the primary lock is resolved. This is because the gc lock resolving will collect locks with lock.ts <= max_ts and the max_ts is the safepoint, and check txn status request will use maxUint64 ts as the current_ts.

So the pessimistic transaction which should live up to max-txn-ttl is aborted unexpectedly by the gc process.

To solve this, a simple way is decrement the calculated value by one before setting the real safepoint.

@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants