Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sink data to mysql5.7, the data is inconsistent #1585

Closed
minipeanut opened this issue Mar 30, 2021 · 6 comments · Fixed by #1638
Closed

sink data to mysql5.7, the data is inconsistent #1585

minipeanut opened this issue Mar 30, 2021 · 6 comments · Fixed by #1638
Labels
subject/new-feature Denotes an issue or pull request adding a new feature.

Comments

@minipeanut
Copy link

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.

1.1

tiup cluster deploy tidb-38 v4.0.11 topo.yaml --user ubuntu -i cykey.pem
tiup cluster start tidb-38
tiup cluster patch tidb-38 ./cdc-nightly.tar.gz -N xx.xx.xx.xx:8300

1.2

tiup ctl:nightly cdc changefeed create --pd="http://xx.xx.xx.xx:2379" --changefeed-id test-cdc-38 --sink-uri="mysql://[email protected]:3306"

1.3

./runSQL.sh props.mysql sql.mysql/tableCreates.sql && ./runSQL.sh props.mysql
./runLoader.sh props.mysql

1.4

./bin/sync_diff_inspector --config=./config.toml

  1. What did you expect to see?
    the sync_diff_inspector shows that the data is consistent

  2. What did you see instead?
    [2021/03/30 10:57:03.280 +00:00] [WARN] [diff.go:893] ["find different row"] [column=ol_delivery_d] [row1="{ ol_supply_w_id: 1, ol_delivery_d: 2021-03-30 07:02:32, ol_amount: 9283.38, ol_o_id: 2182, ol_number: 7, ol_i_id: 7762, ol_quantity: 5, ol_dist_info: AyhG99XdFJ3HbhRBjtFhM5PU, ol_w_id: 1, ol_d_id: 4, }"] [row2="{ ol_o_id: 2182, ol_i_id: 7762, ol_delivery_d: IsNull, ol_quantity: 5, ol_supply_w_id: 1, ol_dist_info: AyhG99XdFJ3HbhRBjtFhM5PU, ol_w_id: 1, ol_d_id: 4, ol_number: 7, ol_amount: 9283.38, }"]
    [2021/03/30 10:57:03.280 +00:00] [WARN] [diff.go:893] ["find different row"] [column=ol_delivery_d] [row1="{ ol_o_id: 2538, ol_number: 3, ol_i_id: 12445, ol_supply_w_id: 1, ol_quantity: 5, ol_w_id: 1, ol_d_id: 1, ol_delivery_d: 2021-03-30 07:01:53, ol_amount: 512.30, ol_dist_info: lCs1KXiiBwi4SswcrjCkD7tH, }"] [row2="{ ol_number: 3, ol_delivery_d: IsNull
    cdc.log
    , ol_amount: 512.30, ol_d_id: 1, ol_o_id: 2538, ol_supply_w_id: 1, ol_quantity: 5, ol_dist_info: lCs1KXiiBwi4SswcrjCkD7tH, ol_w_id: 1, ol_i_id: 12445, }"]
    [2021/03/30 10:57:03.280 +00:00] [INFO] [diff.go:730] ["[update]"] [sql="REPLACE INTO tpcc.bmsql_order_line(ol_w_id,ol_d_id,ol_o_id,ol_number,ol_i_id,ol_delivery_d,ol_amount,ol_supply_w_id,ol_quantity,ol_dist_info) VALUES (1,1,2538,3,12445,'2021-03-30 07:01:53',512.30,1,5,'lCs1KXiiBwi4SswcrjCkD7tH');"]

  3. Versions of the cluster

    4.0.11

    • TiCDC version (execute cdc version):

      Release Version: v5.0.0-master
      Git Commit Hash: adcc9c6
      Git Branch: heads/refs/tags/v5.0.0-nightly
      UTC Build Time: 2021-03-27 16:50:21

cdc.log
Go Version: go version go1.13 linux/amd64

@minipeanut minipeanut added the type/bug The issue is confirmed as a bug. label Mar 30, 2021
@amyangfei
Copy link
Contributor

  • create table DDL: create table bmsql_order_line (ol_w_id integer not null,ol_d_id integer not null,ol_o_id integer not null,ol_number integer not null,ol_i_id integer not null,ol_delivery_d timestamp,ol_amount decimal(6,2),ol_supply_w_id integer,ol_quantity integer,ol_dist_info char(24),constraint pk_order_line primary key (ol_w_id, ol_d_id, ol_o_id, ol_number));
  • the sql_mode in downstream MySQL: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
  • so the DDL executed result in downstream will be
mysql> show create table bmsql_order_line\G
*************************** 1. row ***************************
       Table: bmsql_order_line
Create Table: CREATE TABLE `bmsql_order_line` (
  `ol_w_id` int(11) NOT NULL,
  `ol_d_id` int(11) NOT NULL,
  `ol_o_id` int(11) NOT NULL,
  `ol_number` int(11) NOT NULL,
  `ol_i_id` int(11) NOT NULL,
  `ol_delivery_d` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `ol_amount` decimal(6,2) DEFAULT NULL,
  `ol_supply_w_id` int(11) DEFAULT NULL,
  `ol_quantity` int(11) DEFAULT NULL,
  `ol_dist_info` char(24) DEFAULT NULL,
  PRIMARY KEY (`ol_w_id`,`ol_d_id`,`ol_o_id`,`ol_number`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.29 sec)
  • The default value for ol_delivery_d is CURRENT_TIMESTAMP, so the NULL data from upstream will become CURRENT_TIMESTAMP

@amyangfei
Copy link
Contributor

amyangfei commented Mar 30, 2021

Since the sql_mode differs between upstream and downstream, the different of table structure and data is as expected. If user wants to keep the same table structure and data, the sql_mode should be the same between upstream and downstream. SO IMHO this is not a critical bug.

Besides TiCDC does not provide a sql-mode option, it uses the default sql-mode in downstream. Maybe we can optimize this.

Such as

  • provide a sql-mode option in sink-uri
  • check the precondition before changefeed starts, like the precheck in DM

We should also add some FAQ about this problem, since user could meet and feel confused.

  • Add FAQ about sql-mode related table structure and data inconsistent

@amyangfei
Copy link
Contributor

@jebter PTAL, IMHO this is not a bug and maybe the feature-request label is more suitable.

@jebter jebter added subject/new-feature Denotes an issue or pull request adding a new feature. and removed severity/critical type/bug The issue is confirmed as a bug. labels Mar 30, 2021
@Tammyxia
Copy link

Tammyxia commented Apr 6, 2021

Failed config:

  1. upstream tidb 4.0.11, downstream mysql 5.7.
  2. upstream tidb 5.0.0, downstream mysql 5.7.

Passed config:

  1. upstream tidb 5.0.5, downstream mysql 8.0.

@Tammyxia
Copy link

The root cause is:
Default value of explicit_defaults_for_timestamp:
The default value in TiDB is ON and currently TiDB only supports ON.
The default value in MySQL:
For MySQL 5.7: OFF.
For MySQL 8.0: ON.
https://docs.pingcap.com/tidb/stable/mysql-compatibility#mysql-compatibility

TiDB timestamp behavior:
image

MySQL 5.7 behavior:
image

@minipeanut
Copy link
Author

i try this in tidb:

insert into t_date values(1, now(), curtime(), '2020-01-01', '2020');
...
insert into t_date values(4, now(), "2021-04-13 09:05:32", '2020-01-01', '2020');
insert into t_date values(5, now(), "2021-04-13 09:05:32", '2020-01-01', '2020');

tidb shows
image

mysql 5.7 shows
image

timezone in tidb
image

timezone in mysql
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
subject/new-feature Denotes an issue or pull request adding a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants