
Data decrease in shard-mode when change auto-increment in downstream #1895

Closed
XuJianxu opened this issue Jul 20, 2021 · 6 comments
Labels: severity/major, type/bug (This issue is a bug report)

Comments

@XuJianxu

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.
    200 upstream MySQL instances, 200 DM-workers, 3 DM-masters, about 200K QPS+TPS in the upstream.
    Upgraded the DM cluster from 2.0.1 to nightly.

  2. What did you expect to see?
    After the upgrade completed, the data would continue to be migrated to the downstream TiDB.

  3. What did you see instead?
    The data in the specified table decreased (see the attached screenshot).

  4. Versions of the cluster

    • DM version (run dmctl -V or dm-worker -V or dm-master -V):

      nightly
      
    • Upstream MySQL/MariaDB server version:

      MySQL 5.7/5.8
      
    • Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

      4.0.10
      
    • How did you deploy DM: DM-Ansible or manually?

      DM-Ansible
      
    • Other interesting information (system version, hardware config, etc):

  5. current status of DM cluster (execute query-status in dmctl)

  6. Operation logs

    • Please upload dm-worker.log for every DM-worker instance if possible
    • Please upload dm-master.log if possible
    • Other interesting logs
    • Output of dmctl's commands with problems
  7. Configuration of the cluster and the task

    • dm-worker.toml for every DM-worker instance if possible
    • dm-master.toml for DM-master if possible
    • task config, like task.yaml if possible
    • inventory.ini if deployed by DM-Ansible
  8. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for DM if possible

XuJianxu added the type/bug (This issue is a bug report) and severity/critical labels on Jul 20, 2021
lance6716 (Collaborator) commented on Jul 21, 2021

This is caused by the user changing the table structure in the downstream without telling DM, so DM's view of the table schema is stale and it generates DELETE DML in safe mode like

DELETE FROM `db_test`.`table_shard` WHERE `primary_id` = 123456 LIMIT 1

which can match many rows in the downstream since it's a shard merging task.
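
For readers unfamiliar with the setup, here is a minimal sketch of how such a DELETE can remove the wrong row. The table and column names come from the statement above; the downstream schema change (dropping the unique constraint on `primary_id`) is an assumption for illustration only.

-- Each upstream shard keeps its own AUTO_INCREMENT sequence, so the same
-- `primary_id` value can be produced by more than one shard, e.g.
--   shard 1: (primary_id = 123456, c = 'from shard 1')
--   shard 2: (primary_id = 123456, c = 'from shard 2')

-- Downstream merged table after the user's manual change (illustrative):
CREATE TABLE `db_test`.`table_shard` (
  `primary_id` BIGINT NOT NULL,
  `c`          VARCHAR(64),
  KEY (`primary_id`)            -- no longer a PRIMARY KEY / UNIQUE index
);

-- DM still believes `primary_id` uniquely identifies a row, so the
-- safe-mode DML it generates filters on that column only:
DELETE FROM `db_test`.`table_shard` WHERE `primary_id` = 123456 LIMIT 1;
-- With LIMIT 1 on a non-unique column this may delete the row that came
-- from the other shard, which is how rows silently disappear downstream.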

GMHDBJD (Collaborator) commented on Jul 21, 2021

In this case, should we support operate-schema before the sync unit starts?

lance6716 (Collaborator) commented on Jul 21, 2021

Currently, we have almost no practical way to use operate-schema to set a correct schema before the sync unit starts, so there's a chance that in this sharding merge the WHERE clause of UPDATE and DELETE will modify unexpected records.

GMHDBJD (Collaborator) commented on Jul 21, 2021

> Currently, we have almost no practical way to use operate-schema to set a correct schema before the sync unit starts, so there's a chance that in this sharding merge the WHERE clause of UPDATE and DELETE will modify unexpected records.

I think if there is no error when the task starts, users usually don't know that they need to set the schema manually.

lance6716 (Collaborator) commented

> Currently, we have almost no practical way to use operate-schema to set a correct schema before the sync unit starts, so there's a chance that in this sharding merge the WHERE clause of UPDATE and DELETE will modify unexpected records.

> I think if there is no error when the task starts, users usually don't know that they need to set the schema manually.

Yes, that proposal can only work with guidance from the documentation.
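
For reference, a sketch of what the documented manual step might look like with dmctl's operate-schema command. The source ID, task name, and schema file below are placeholders, and the exact flags should be verified against the DM documentation for the version in use:

# Check the table schema that DM currently holds (placeholder source/task/table names):
operate-schema get -s mysql-replica-01 my-task -d db_test -t table_shard

# Overwrite it with the schema that actually exists in the downstream:
operate-schema set -s mysql-replica-01 my-task -d db_test -t table_shard table_shard-schema.sql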

GMHDBJD changed the title from "Data decrease in shard table scenario after dm cluster upgrade" to "Data decrease in safemode when change auto-increment in downstream" on Jul 22, 2021
GMHDBJD changed the title from "Data decrease in safemode when change auto-increment in downstream" to "Data decrease in shard-mode when change auto-increment in downstream" on Jul 22, 2021
lance6716 (Collaborator) commented

closed by #1915
