Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction amender will cause failure of drainer #20773

Closed
zyguan opened this issue Nov 2, 2020 · 4 comments
Closed

Transaction amender will cause failure of drainer #20773

zyguan opened this issue Nov 2, 2020 · 4 comments
Assignees
Labels
severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

@zyguan
Copy link
Contributor

zyguan commented Nov 2, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Execute following sql on upstream

/* INIT */ drop table if exists t;
/* INIT */ create table t (id int primary key, c_str varchar(20));
/* INIT */ insert into t values (1, '0001'), (2, '0002'), (3, null), (4, '0003'), (5, null);
/* TXN */ begin;
/* TXN */ insert into t values (6, '0004');
/* TXN */ insert into t values (7, null);
/* DDL */ alter table t add c_str_new varchar(20);
/* TXN */ update t set c_str = '0005' where id = 1;
/* TXN */ update t set c_str = null where id = 2;
/* TXN */ update t set c_str = '0006' where id = 3;
/* TXN */ delete from t where id = 4;
/* TXN */ delete from t where id = 5;
/* TXN */ commit;

DDL should be run in an another session.

2. What did you expect to see? (Required)

No error reported by drainer and data is synced to downstream.

3. What did you see instead (Required)

Drainer exited unexpectedly with error.

drainer log
[2020/11/01 13:46:17.149 +00:00] [INFO] [collector.go:305] ["get ddl job"] [job="ID:110, Type:add column, State:synced, SchemaState:public, SchemaID:1, TableID:108, RowCount:0, ArgLen:0, start time: 2020-11-01 13:46:15.719 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2020/11/01 13:46:17.149 +00:00] [INFO] [syncer.go:442] ["add ddl item to syncer, you can add this commit ts to `ignore-txn-commit-ts` to skip this ddl if needed"] [sql="alter table t add c_str_new varchar(20)"] ["commit ts"=420541464856231937]
[2020/11/01 13:46:17.149 +00:00] [INFO] [load.go:830] ["Loader has been closed. Start quitting txnManager"]
[2020/11/01 13:46:17.149 +00:00] [INFO] [load.go:820] ["run()... in txnManager quit"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [load.go:423] ["exec ddl success"] [sql="alter table t add c_str_new varchar(20)"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [load.go:593] ["Loader closed, quit running"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [syncer.go:257] ["write save point"] [ts=420541464856231937]
[2020/11/01 13:46:17.387 +00:00] [INFO] [load.go:876] ["txnManager has been closed"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [load.go:566] ["{16 20 0xc000109200 0xc00027cb60 false 1 true true false}"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [load.go:567] ["Run()... in Loader quit"]
[2020/11/01 13:46:17.387 +00:00] [INFO] [mysql.go:233] ["Successes chan quit"]
[2020/11/01 13:46:17.390 +00:00] [INFO] [syncer.go:249] ["handleSuccess quit"]
[2020/11/01 13:46:17.390 +00:00] [ERROR] [server.go:289] ["syncer exited abnormal"] [error="failed to add item: row data is corrupted []"] [errorVerbose="row data is corrupted []\ngithub.com/pingcap/tidb-binlog/drainer/translator.DecodeOldAndNewRow\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/translator.go:165\ngithub.com/pingcap/tidb-binlog/drainer/translator.updateDecoder.decode\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/translator.go:209\ngithub.com/pingcap/tidb-binlog/drainer/translator.updateRowToRow\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/kafka.go:193\ngithub.com/pingcap/tidb-binlog/drainer/translator.createTableMutation\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/kafka.go:302\ngithub.com/pingcap/tidb-binlog/drainer/translator.nextRow\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/kafka.go:324\ngithub.com/pingcap/tidb-binlog/drainer/translator.TiBinlogToSecondaryBinlog\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/translator/kafka.go:79\ngithub.com/pingcap/tidb-binlog/drainer/relay.(*relayer).WriteBinlog\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/relay/relayer.go:66\ngithub.com/pingcap/tidb-binlog/drainer/sync.(*MysqlSyncer).Sync\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/sync/mysql.go:179\ngithub.com/pingcap/tidb-binlog/drainer.(*Syncer).run\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:378\ngithub.com/pingcap/tidb-binlog/drainer.(*Syncer).Start\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:132\ngithub.com/pingcap/tidb-binlog/drainer.(*Server).Start.func4\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/server.go:288\ngithub.com/pingcap/tidb-binlog/drainer.(*taskGroup).start.func1\n\t/home/jenkins/agent/workspace/build_tidb_binlog_master/go/src/github.com/pingcap/tidb-binlog/drainer/util.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357\nfailed to add item"]
[2020/11/01 13:46:17.390 +00:00] [INFO] [util.go:72] [Exit] [name=syncer]
[2020/11/01 13:46:17.390 +00:00] [INFO] [server.go:451] ["begin to close drainer server"]
[2020/11/01 13:46:17.392 +00:00] [INFO] [server.go:416] ["has already update status"] [id=binlog-drainer:8249]
[2020/11/01 13:46:17.392 +00:00] [INFO] [server.go:455] ["commit status done"]
[2020/11/01 13:46:17.392 +00:00] [INFO] [pump.go:77] ["pump is closing"] [id=binlog-upstream-pump-0:8250]
[2020/11/01 13:46:17.392 +00:00] [INFO] [util.go:72] [Exit] [name=heartbeat]
[2020/11/01 13:46:17.392 +00:00] [INFO] [collector.go:135] ["publishBinlogs quit"]
[2020/11/01 13:46:17.392 +00:00] [INFO] [util.go:72] [Exit] [name=collect]
[2020/11/01 13:46:17.392 +00:00] [INFO] [server.go:470] ["drainer exit"]
[2020/11/01 13:46:17.392 +00:00] [INFO] [main.go:73] ["drainer exit"]
[2020/11/01 13:46:17.392 +00:00] [INFO] [server.go:322] ["drainer http server stopped"] [error="mux: listener closed"]

4. What is your TiDB version? (Required)

v4.0.8

@zyguan zyguan added the type/bug The issue is confirmed as a bug. label Nov 2, 2020
@wjhuang2016
Copy link
Member

@cfzjywxk PTAL

@cfzjywxk
Copy link
Contributor

cfzjywxk commented Nov 3, 2020

It's being solved in pingcap/tidb-binlog#1006.

@cfzjywxk
Copy link
Contributor

It should be solved in pingcap/tidb-binlog#1006.

@ti-srebot
Copy link
Contributor

ti-srebot commented Nov 19, 2020

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

The transaction amend may cause the drainer failure.

2. Symptom (optional)

Error log in drainer.

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

v4.0.7, v4.0.8, master

6. Fixed versions

v4.0.9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

5 participants