report "update partition record fails" error when upgrade from v4.0.16 to v5.2.0 #30489
Comments
data collected by clinic: "https://clinic.pingcap.com:4433/diag/files?uuid=9813e1e4294438a6-116e126a69cc4232-45efdaf37f025cbd" |
This TiDB node was scaled out after the upgrade. |
PTAL @mjonss @tiancaiamao |
the testcases report this error: |
@seiya-annie What is the configuration of the cluster? I noticed that 'amend transaction' is enabled but not sure if it is related to this error. It would be helpful if we can have a configuration for the cluster as well. |
This is the same issue as #28292. It was introduced in 5.1 by the cherry-pick PR #21148. It's a critical problem and not easy to fix; see our internal document https://pingcap.feishu.cn/docs/doccnDrJ22mWwkFi9NkUfzwlADc |
dup of #28292 |
Please check whether the issue should be labeled with 'affects-x.y' or 'fixes-x.y.z', and then remove 'needs-more-info' label. |
also exists in v5.3.1 |
--- /tmp/ddl__negative__table_partition__truncate_partition_2.531154131/a 2022-03-22 07:50:37.956115775 +0000 |
--- /tmp/ddl__negative__table_partition__truncate_partition_3.699959318/a 2022-03-22 07:50:38.351116125 +0000 |
--- /tmp/ddl__negative__table_partition__drop_partition_3.222860356/a 2022-03-22 07:50:34.110112366 +0000 |
2022/03/22 07:50:31 [ddl/all.jsonnet#ddl/negative/table_partition__drop_partition_1] passed |
Reopening it because the issue still exists in v6.0.0 |
I'm still analysing this, but I think there are more issues that can trigger it. I moved the check for equal length of row and colIDs into RemoveRecord, which then fails for several other cases, like TestInsertOnDuplicateKey, where the read row is longer than the number of colIDs. That test just does not use the binary log, which is why it has no issues today. |
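To make the check described above concrete, here is a minimal Go sketch. It is not TiDB's code: `checkRowLen` is a hypothetical helper, and only the error wording is borrowed from the log in this report. It shows how a row with one more value than there are column IDs produces the "4 vs 3" message.

```go
package main

import "fmt"

// checkRowLen is a hypothetical helper (not a TiDB function). It rejects a
// row whose value count differs from the number of column IDs, which is the
// condition behind "data and columnID count not match".
func checkRowLen(row []interface{}, colIDs []int64) error {
	if len(row) != len(colIDs) {
		return fmt.Errorf("data and columnID count not match %d vs %d", len(row), len(colIDs))
	}
	return nil
}

func main() {
	colIDs := []int64{1, 2, 3}
	// The row read back holds one value more than there are column IDs.
	row := []interface{}{10, "a", 3.5, int64(42)}
	fmt.Println(checkRowLen(row, colIDs))
	// prints: data and columnID count not match 4 vs 3
}
```

Running such a check earlier in the call path would surface every caller that passes a longer row, which matches the extra test failures mentioned above.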
These tests are also using I have not found an easily reproducible test case; QA can trigger the issue, but the case needs to be reduced for further investigation. So my current conclusion is that binlog has similar issues (data and columnID count not match); see #33608. @bb7133 & @seiya-annie, should we keep this open or close it as a duplicate of one of the above bugs? If we keep it open, can we lower the severity? |
I execute the SQL in table_partition__drop_partition_2.r.sql and table_partition__drop_partition_3.r.sql, |
I am able to reproduce this issue; it is actually not related to the upgrade. Set up a TiDB cluster on 5.4.0 or nightly (I verified on those versions, but I believe it can be reproduced on others) with binlog enabled, for example:
Then:
And the following error is reported:
|
Great! So the row already includes the |
Exactly, it looks like this code expects the data without
@tiancaiamao could you confirm this and see if you have any idea on how to fix it? |
It seems it is corrected for normal deletes like this, so I think we can just use that fix too? |
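One possible shape of such a fix, sketched in Go under loud assumptions: `trimToColIDs` is a hypothetical helper, not the actual TiDB change, and it assumes the surplus values sit at the end of the row. It drops any values beyond the column ID count before the row is handed to the encoder.

```go
package main

import "fmt"

// trimToColIDs is a hypothetical helper (not a TiDB function). It assumes
// that any extra values are trailing and keeps only the leading values that
// have a matching column ID.
func trimToColIDs(row []interface{}, colIDs []int64) ([]interface{}, error) {
	if len(row) < len(colIDs) {
		// A short row cannot be repaired by trimming, so report it.
		return nil, fmt.Errorf("row has %d values, fewer than %d column IDs", len(row), len(colIDs))
	}
	return row[:len(colIDs)], nil
}

func main() {
	colIDs := []int64{1, 2, 3}
	row := []interface{}{10, "a", 3.5, int64(42)} // one surplus trailing value
	trimmed, err := trimToColIDs(row, colIDs)
	fmt.Println(len(trimmed), err)
}
```

Whether trimming is safe depends on where the extra value actually sits in the row, which would need to be confirmed against the real code path.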
Lowered the severity, since binlog is being deprecated and there are other limitations between binlog and other features. |
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
upgrade from v4.0.16 to v5.2.0
After the upgrade, run the stmtflow test ddl/all.jsonnet. The following error was found in the log:
[2021/12/07 17:08:16.565 +08:00] [INFO] [domain.go:129] ["diff load InfoSchema success"] [currentSchemaVersion=4402] [neededSchemaVersion=4403] ["start time"=898.82µs] [phyTblIDs="[]"] [actionTypes="[]"]
[2021/12/07 17:08:16.567 +08:00] [INFO] [schema_validator.go:291] ["the schema validator enqueue, queue is too long"] ["delta max count"=1024] ["remove schema version"=1863]
[2021/12/07 17:08:16.630 +08:00] [INFO] [domain.go:129] ["diff load InfoSchema success"] [currentSchemaVersion=4403] [neededSchemaVersion=4404] ["start time"=602.692µs] [phyTblIDs="[]"] [actionTypes="[]"]
[2021/12/07 17:08:16.886 +08:00] [INFO] [domain.go:129] ["diff load InfoSchema success"] [currentSchemaVersion=4404] [neededSchemaVersion=4405] ["start time"=4.286041ms] [phyTblIDs="[2515,2516,2517]"] [actionTypes="[8,8,8]"]
[2021/12/07 17:08:16.965 +08:00] [ERROR] [partition.go:1218] ["update partition record fails"] [message="new record inserted while old record is not removed"] [error="EncodeRow error: data and columnID count not match 4 vs 3"]
2. What did you expect to see? (Required)
no error
3. What did you see instead (Required)
4. What is your TiDB version? (Required)
v5.2.0