Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Found empty schema diff in the non-latest schema version #8578

Closed
JaySon-Huang opened this issue Dec 23, 2023 · 3 comments · Fixed by #8623
Closed

Found empty schema diff in the non-latest schema version #8578

JaySon-Huang opened this issue Dec 23, 2023 · 3 comments · Fixed by #8623
Labels
affects-7.5 This bug affects the 7.5.x(LTS) versions. component/storage report/customer Customers have encountered this bug. severity/major type/bug The issue is confirmed as a bug.

Comments

@JaySon-Huang
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

fullstack-test2-logs.tar.gz

Related logging in TiFlash:

  1. TiFlash try to sync the schema diff in (258, 266]
  2. SchemaDiff with 259 is update tiflash replica status, apply done
  3. SchemaDiff with 260 is empty, but it is not expected.
[2023/12/22 21:42:37.322 +08:00] [INFO] [TiDBSchemaSyncer.cpp:73] ["Start to sync schemas. current version is: 258 and try to sync schema version to: 266"] [source="keyspace=4294967295"] [thread_id=302]
[2023/12/22 21:42:37.322 +08:00] [TRACE] [SchemaGetter.cpp:256] ["Get SchemaDiff from TiKV, schema_version=259 data={\"version\":259,\"type\":31,\"schema_id\":2,\"table_id\":295,\"old_table_id\":0,\"old_schema_id\":0,\"regenerate_schema_map\":false,\"affected_options\":null}"] [thread_id=302]
[2023/12/22 21:42:37.323 +08:00] [TRACE] [SchemaBuilder.cpp:246] ["applyDiff accept type=UpdateTiFlashReplicaStatus"] [source="keyspace=4294967295"] [thread_id=302]
[2023/12/22 21:42:37.323 +08:00] [WARN] [SchemaGetter.cpp:300] ["The table is dropped in TiKV, try to get the latest table_info, table_id=295"] [thread_id=302]
[2023/12/22 21:42:37.324 +08:00] [DEBUG] [SchemaGetter.cpp:313] ["Get Table Info from TiKV, table_id=295 {...}"] [thread_id=302]
[2023/12/22 21:42:37.333 +08:00] [DEBUG] [StorageDeltaMerge.cpp:1523] ["Update table_info: {...}"] [source=db_2.t_296] [thread_id=302]
[2023/12/22 21:42:37.340 +08:00] [INFO] [SchemaBuilder.cpp:458] ["Updating replica info, replica count old=1 new=1 available=optional(true) physical_table_id=296 logical_table_id=295"] [source="keyspace=4294967295"] [thread_id=302]
[2023/12/22 21:42:37.344 +08:00] [DEBUG] [StorageDeltaMerge.cpp:1523] ["Update table_info: {...}"] [source=db_2.t_297] [thread_id=302]
[2023/12/22 21:42:37.351 +08:00] [INFO] [SchemaBuilder.cpp:458] ["Updating replica info, replica count old=1 new=1 available=optional(true) physical_table_id=297 logical_table_id=295"] [source="keyspace=4294967295"] [thread_id=302]
[2023/12/22 21:42:37.357 +08:00] [DEBUG] [StorageDeltaMerge.cpp:1523] ["Update table_info: {...}"] [source=db_2.t_295] [thread_id=302]
[2023/12/22 21:42:37.357 +08:00] [INFO] [StorageDeltaMerge.cpp:128] ["updateTableColumnInfo, table_name=t_295 ordinary=..."] [source=db_2.t_295] [thread_id=302]
[2023/12/22 21:42:37.364 +08:00] [INFO] [SchemaBuilder.cpp:484] ["Updating replica info, replica count old=1 new=1 available=optional(true) physical_table_id=295 logical_table_id=295"] [source="keyspace=4294967295"] [thread_id=302]
[2023/12/22 21:42:37.365 +08:00] [WARN] [SchemaGetter.cpp:253] ["The schema diff is empty, schema_version=260 key=Diff:260"] [thread_id=302]
[2023/12/22 21:42:37.367 +08:00] [ERROR] [BaseDaemon.cpp:367] [########################################] [source=BaseDaemon] [thread_id=1157]
[2023/12/22 21:42:37.367 +08:00] [ERROR] [BaseDaemon.cpp:368] ["(from thread 302) Received signal Aborted(6)."] [source=BaseDaemon] [thread_id=1157]
[2023/12/22 21:42:37.460 +08:00] [ERROR] [BaseDaemon.cpp:560] ["\n       0xc6fd95b\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+208656731]\n  0x7fbca4ee9630\t<unknown symbol> [libpthread.so.0+63024]\n  0x7fbca4728387\tgsignal [libc.so.6+222087]\n  0x7fbca4729a78\t__GI_abort [libc.so.6+227960]\n  0x7fbcb188566f\tstd::__1::__libcpp_abort_debug_function(std::__1::__libcpp_debug_info const&) [libc++.so.1+464495]\n       0xdf05366\tstd::__1::optional<DB::SchemaDiff>::operator->() [tiflash+233853798]\n       0xdf05025\tDB::TiDBSchemaSyncer<false, false>::syncSchemaDiffs(DB::Context&, DB::SchemaGetter&, long) [tiflash+233852965]\n       0xdf03709\tDB::TiDBSchemaSyncer<false, false>::syncSchemasByGetter(DB::Context&, DB::SchemaGetter&) [tiflash+233846537]\n       0xdf043b1\tDB::TiDBSchemaSyncer<false, false>::syncTableSchema(DB::Context&, long) [tiflash+233849777]\n       0xc9f1fb8\tDB::TiDBSchemaSyncerManager::syncTableSchema(DB::Context&, unsigned int, long) [tiflash+211754936]\n       0xe9bc27b\tDB::AtomicGetStorageSchema(std::__1::shared_ptr<DB::Region> const&, DB::TMTContext&) [tiflash+245088891]\n       0xe8b8d20\tDB::KVStore::preHandleSSTsToDTFiles(std::__1::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, DB::DM::FileConvertJobType, DB::TMTContext&) [tiflash+244026656]\n       0xe8b85f3\tDB::KVStore::preHandleSnapshotToFiles(std::__1::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, std::__1::optional<unsigned long>, DB::TMTContext&) [tiflash+244024819]\n       0xe8898d1\tPreHandleSnapshot [tiflash+243833041]\n  0x7fbcae58e736\tproxy_ffi::engine_store_helper_impls::_$LT$impl$u20$proxy_ffi..interfaces..root..DB..EngineStoreServerHelper$GT$::pre_handle_snapshot::h840d9ed14f09a65e [libtiflash_proxy.so+150529846]\n  0x7fbcae502dfd\tengine_store_ffi::core::forward_raft::snapshot::pre_handle_snapshot_impl::h61aa39310c31a62e [libtiflash_proxy.so+149958141]\n  0x7fbcab8c0dc0\tengine_store_ffi::core::forward_raft::snapshot::_$LT$impl$u20$engine_store_ffi..core..forwarder..ProxyForwarder$LT$T$C$ER$GT$$GT$::pre_apply_snapshot::_$u7b$$u7b$closure$u7d$$u7d$::h2f5ffc9e49c879e8 [libtiflash_proxy.so+103550400]\n  0x7fbcaa772ab2\t_$LT$core..future..from_generator..GenFuture$LT$T$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h09a365f0be04fce9 [libtiflash_proxy.so+85404338]\n  0x7fbcaab1a639\tyatp::task::future::RawTask$LT$F$GT$::poll::h86e08f0661bfeb04 [libtiflash_proxy.so+89237049]\n  0x7fbcaf99307e\tyatp::task::future::TaskCell::poll::hbbb766a97468e2df [libtiflash_proxy.so+171520126]\n  0x7fbcaf993e34\t_$LT$yatp..task..future..Runner$u20$as$u20$yatp..pool..runner..Runner$GT$::handle::h3c329648fd40d6ed [libtiflash_proxy.so+171523636]\n  0x7fbcaf98e503\tyatp::pool::worker::WorkerThread$LT$T$C$R$GT$::run::hd8dbb26e2d7bf247 [libtiflash_proxy.so+171500803]\n  0x7fbcaf95521d\tyatp::pool::builder::LazyBuilder$LT$T$GT$::build::_$u7b$$u7b$closure$u7d$$u7d$::h514ea08a6145fb22 [libtiflash_proxy.so+171266589]\n  0x7fbcaf9499ac\tstd::sys_common::backtrace::__rust_begin_short_backtrace::h077f138196384345 [libtiflash_proxy.so+171219372]\n  0x7fbcaf9813dd\tstd::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::hb9a4f6f6f0cd6ab0 [libtiflash_proxy.so+171447261]\n  0x7fbcaf958371\t_$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h5b4b03ee84d96aa4 [libtiflash_proxy.so+171279217]\n  0x7fbcaf981606\tstd::panicking::try::do_call::h825e38935db99a94 [libtiflash_proxy.so+171447814]\n  0x7fbcaf9948cb\t__rust_try [libtiflash_proxy.so+171526347]\n  0x7fbcaf98152f\tstd::panicking::try::hcf8a41405db4db12 [libtiflash_proxy.so+171447599]\n  0x7fbcaf97f9e1\tstd::panic::catch_unwind::h96e08252e1e218de [libtiflash_proxy.so+171440609]\n  0x7fbcaf9811aa\tstd::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::h1aaaf3ff102ab795 [libtiflash_proxy.so+171446698]\n  0x7fbcaf982f77\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h9122f2f9f9bffe80 [libtiflash_proxy.so+171454327]\n  0x7fbcb0eebf93\tstd::sys::unix::thread::Thread::new::thread_start::hd2791a9cabec1fda [libtiflash_proxy.so+193904531]\n                \t/rustc/96ddd32c4bfb1d78f0cd03eb068b1710a8cebeef/library/std/src/sys/unix/thread.rs:108\n  0x7fbca4ee1ea5\tstart_thread [libpthread.so.0+32421]"] [source=BaseDaemon] [thread_id=1157]

4. What is your TiFlash version? (Required)

master

@JaySon-Huang JaySon-Huang added the type/bug The issue is confirmed as a bug. label Dec 23, 2023
@JaySon-Huang
Copy link
Contributor Author

Seem TiDB break this assumption:

// After the feature concurrent DDL, TiDB does `update schema version` before `set schema diff`, and they are done in separate transactions.
// So TiFlash may see a schema version X but no schema diff X, meaning that the transaction of schema diff X has not been committed or has
// been aborted.
// However, TiDB makes sure that if we get a schema version X, then the schema diff X-1 must exist. Otherwise the transaction of schema diff
// X-1 is aborted and we can safely ignore it.
// Since TiDB can not make sure the schema diff of the latest schema version X is not empty, under this situation we should set the `cur_version`
// to X-1 and try to fetch the schema diff X next time.

@JaySon-Huang
Copy link
Contributor Author

Related logging of TiDB.
Seems there are some "resolveLock rollback" and "handle ddl job failed" in TiDB DDL

[2023/12/22 13:42:18.088 +00:00] [INFO] [ddl_worker.go:253] ["add DDL jobs"] [category=ddl] ["batch count"=1] [jobs="ID:300, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:18.071 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0; "] [table=true]
[2023/12/22 13:42:18.088 +00:00] [INFO] [ddl.go:1068] ["start DDL job"] [category=ddl] [job="ID:300, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:18.071 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query=]
[2023/12/22 13:42:18.098 +00:00] [INFO] [ddl_worker.go:1000] ["run DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=300] [category=ddl] [job="ID:300, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:18.071 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:18.120 +00:00] [INFO] [domain.go:272] ["diff load InfoSchema success"] [currentSchemaVersion=258] [neededSchemaVersion=259] ["start time"=2.234146ms] [gotSchemaVersion=259] [phyTblIDs="[]"] [actionTypes="[]"] [diffTypes="[]"]
[2023/12/22 13:42:18.123 +00:00] [INFO] [domain.go:873] ["mdl gets lock, update to owner"] [jobID=300] [version=259]
[2023/12/22 13:42:18.168 +00:00] [INFO] [ddl_worker.go:1234] ["wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [category=ddl] [ver=259] ["take time"=53.192986ms] [job="ID:300, Type:update tiflash replica status, State:done, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:18.071 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:18.179 +00:00] [INFO] [ddl_worker.go:615] ["finish DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=300] [job="ID:300, Type:update tiflash replica status, State:synced, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:18.071 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:18.186 +00:00] [INFO] [ddl.go:1170] ["DDL job is finished"] [category=ddl] [jobID=300]
[2023/12/22 13:42:18.187 +00:00] [INFO] [callback.go:139] ["performing DDL change, must reload"]
[2023/12/22 13:42:19.085 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:20.109 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:20.189 +00:00] [INFO] [ddl_tiflash_api.go:495] ["Escape checking available status due to backoff"] [tableId=296]
[2023/12/22 13:42:21.132 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:22.155 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:22.193 +00:00] [INFO] [ddl_tiflash_api.go:528] ["Tiflash replica is available"] [tableID=296] [progress=1]
[2023/12/22 13:42:22.203 +00:00] [INFO] [ddl_worker.go:253] ["add DDL jobs"] [category=ddl] ["batch count"=1] [jobs="ID:301, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0; "] [table=true]
[2023/12/22 13:42:22.203 +00:00] [INFO] [ddl.go:1068] ["start DDL job"] [category=ddl] [job="ID:301, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query=]
[2023/12/22 13:42:22.214 +00:00] [INFO] [ddl_worker.go:1000] ["run DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=301] [category=ddl] [job="ID:301, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:22.216 +00:00] [INFO] [table.go:1474] ["TiFlash replica available"] [tableID=295]
[2023/12/22 13:42:23.173 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:24.196 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:25.246 +00:00] [INFO] [lock_resolver.go:1132] ["resolveLock rollback"] [lock="key: 748000fffffffffffe5f72800000000000012d, primary: 6d44423a3200000000fb00000000000000685461626c653a3239ff3500000000000000f8, txnStartTS: 446497434415136779, lockForUpdateTS:0, minCommitTs:446497434415136780, ttl: 3009, type: Put, UseAsyncCommit: false, txnSize: 1"]
[2023/12/22 13:42:25.257 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:25.527 +00:00] [INFO] [lock_resolver.go:1132] ["resolveLock rollback"] [lock="key: 6d446966663a323630ff0000000000000000f70000000000000073, primary: 6d44423a3200000000fb00000000000000685461626c653a3239ff3500000000000000f8, txnStartTS: 446497434415136779, lockForUpdateTS:0, minCommitTs:446497434415136780, ttl: 3009, type: Put, UseAsyncCommit: false, txnSize: 2"]
[2023/12/22 13:42:26.345 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:1.935521808s txnStartTS:446497434991853571 region_id:107 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:1934 kv_read_ms:0 processed_versions:0 total_versions:1 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:0 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:26.910 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:27.721 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:615.391868ms txnStartTS:446497435699380225 region_id:107 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:0 kv_read_ms:0 processed_versions:0 total_versions:1 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:0 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:27.722 +00:00] [WARN] [session.go:900] ["can not retry txn"] [label=internal] [error="[kv:8022]Error: KV error safe to retry Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(446497434415136779), commit_ts: TimeStamp(446497435856666626), key: [109, 68, 66, 58, 50, 0, 0, 0, 0, 251, 0, 0, 0, 0, 0, 0, 0, 104, 84, 97, 98, 108, 101, 58, 50, 57, 255, 53, 0, 0, 0, 0, 0, 0, 0, 248] }))))) {metaKey=true, key=DB:2, field=Table:295} [try again later]"] [IsBatchInsert=false] [IsPessimistic=false] [InRestrictedSQL=true] [tidb_retry_limit=10] [tidb_disable_txn_auto_retry=true]
[2023/12/22 13:42:27.722 +00:00] [WARN] [session.go:916] ["commit failed"] ["finished txn"="Txn{state=invalid}"] [error="[kv:8022]Error: KV error safe to retry Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(446497434415136779), commit_ts: TimeStamp(446497435856666626), key: [109, 68, 66, 58, 50, 0, 0, 0, 0, 251, 0, 0, 0, 0, 0, 0, 0, 104, 84, 97, 98, 108, 101, 58, 50, 57, 255, 53, 0, 0, 0, 0, 0, 0, 0, 248] }))))) {metaKey=true, key=DB:2, field=Table:295} [try again later]"]
[2023/12/22 13:42:27.722 +00:00] [INFO] [job_table.go:428] ["handle ddl job failed"] [category=ddl] [error="[kv:8022]Error: KV error safe to retry Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(446497434415136779), commit_ts: TimeStamp(446497435856666626), key: [109, 68, 66, 58, 50, 0, 0, 0, 0, 251, 0, 0, 0, 0, 0, 0, 0, 104, 84, 97, 98, 108, 101, 58, 50, 57, 255, 53, 0, 0, 0, 0, 0, 0, 0, 248] }))))) {metaKey=true, key=DB:2, field=Table:295} [try again later]"] [job="ID:301, Type:update tiflash replica status, State:done, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:27.726 +00:00] [INFO] [ddl_worker.go:1000] ["run DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=301] [category=ddl] [job="ID:301, Type:update tiflash replica status, State:queueing, SchemaState:none, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:27.727 +00:00] [INFO] [table.go:1474] ["TiFlash replica available"] [tableID=295]
[2023/12/22 13:42:27.738 +00:00] [INFO] [domain.go:272] ["diff load InfoSchema success"] [currentSchemaVersion=259] [neededSchemaVersion=261] ["start time"=1.740914ms] [gotSchemaVersion=261] [phyTblIDs="[]"] [actionTypes="[]"] [diffTypes="[]"]
[2023/12/22 13:42:27.740 +00:00] [INFO] [domain.go:873] ["mdl gets lock, update to owner"] [jobID=301] [version=261]
[2023/12/22 13:42:27.787 +00:00] [INFO] [ddl_worker.go:1234] ["wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [category=ddl] [ver=261] ["take time"=52.325222ms] [job="ID:301, Type:update tiflash replica status, State:done, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:27.795 +00:00] [INFO] [ddl_worker.go:615] ["finish DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=301] [job="ID:301, Type:update tiflash replica status, State:synced, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:22.172 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:27.801 +00:00] [INFO] [ddl.go:1170] ["DDL job is finished"] [category=ddl] [jobID=301]
[2023/12/22 13:42:27.802 +00:00] [INFO] [callback.go:139] ["performing DDL change, must reload"]
[2023/12/22 13:42:29.930 +00:00] [INFO] [manager.go:353] ["get owner"] ["owner info"="[ddl] /tidb/ddl/fg/owner ownerManager 4e36f0a2-1a11-4c0a-be02-157235b7c63a"] ["owner key"=/tidb/ddl/fg/owner/4cf68c91bdcafc02] [ownerID=4e36f0a2-1a11-4c0a-be02-157235b7c63a] [op=none]
[2023/12/22 13:42:29.952 +00:00] [INFO] [batch_coprocessor.go:571] ["detecting available mpp stores"] [total=1] [alive=0]
[2023/12/22 13:42:29.952 +00:00] [INFO] [batch_coprocessor.go:940] ["retry for TiFlash peer with region missing"] ["region id"=419]
[2023/12/22 13:42:30.018 +00:00] [WARN] [batch_coprocessor.go:341] ["Meet regions that don't have an available store. Give up balancing"]
[2023/12/22 13:42:30.018 +00:00] [INFO] [local_mpp_coordinator.go:216] ["Dispatch mpp task"] [timestamp=446497436446752775] [ID=1] [QueryTs=1703252549952755642] [LocalQueryId=330] [ServerID=1885] [address=tiflash0:3930] [plan="Table(t)->HashAgg->Send(-1, )"] [mpp-version=2] [exchange-compression-mode=NONE] [GatherID=1] [resource_group=default]
[2023/12/22 13:42:31.924 +00:00] [WARN] [mpp_probe.go:246] ["Store is not ready"] ["store address"=tiflash0:3930] ["err message"="rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2023/12/22 13:42:32.611 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:1.005105057s txnStartTS:446497436879028226 region_id:107 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:0 kv_read_ms:0 processed_versions:0 total_versions:0 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:0 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:32.612 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:583.874322ms txnStartTS:446497436996993026 region_id:47 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:0 kv_read_ms:0 processed_versions:0 total_versions:850 rocksdb_delete_skipped_count:196 rocksdb_key_skipped_count:2287 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:33.595 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:1.689924377s txnStartTS:446497436957671425 region_id:107 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:1689 kv_read_ms:0 processed_versions:0 total_versions:1 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:0 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:33.595 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:789.826305ms txnStartTS:446497437193863169 region_id:107 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:0 kv_read_ms:0 processed_versions:0 total_versions:0 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:0 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:33.595 +00:00] [INFO] [coprocessor.go:1332] ["[TIME_COP_WAIT] resp_time:1.567733186s txnStartTS:446497436996993026 region_id:47 store_addr:tikv0:20160 kv_process_ms:0 kv_wait_ms:0 kv_read_ms:0 processed_versions:0 total_versions:850 rocksdb_delete_skipped_count:196 rocksdb_key_skipped_count:2287 rocksdb_cache_hit_count:0 rocksdb_read_count:0 rocksdb_read_byte:0"]
[2023/12/22 13:42:33.750 +00:00] [INFO] [batch_coprocessor.go:571] ["detecting available mpp stores"] [total=1] [alive=0]
[2023/12/22 13:42:33.750 +00:00] [INFO] [batch_coprocessor.go:940] ["retry for TiFlash peer with region missing"] ["region id"=14]
[2023/12/22 13:42:33.837 +00:00] [WARN] [batch_coprocessor.go:341] ["Meet regions that don't have an available store. Give up balancing"]
[2023/12/22 13:42:33.838 +00:00] [INFO] [local_mpp_coordinator.go:216] ["Dispatch mpp task"] [timestamp=446497437442637826] [ID=1] [QueryTs=1703252553750138678] [LocalQueryId=331] [ServerID=1885] [address=tiflash0:3930] [plan="Table(t)->HashAgg->Send(-1, )"] [mpp-version=2] [exchange-compression-mode=NONE] [GatherID=1] [resource_group=default]
[2023/12/22 13:42:33.986 +00:00] [INFO] [save.go:200] ["incrementally update modifyCount"] [category=stats] [tableID=297] [curModifyCnt=0] [results.BaseModifyCnt=0] [modifyCount=0]
[2023/12/22 13:42:33.986 +00:00] [INFO] [save.go:222] ["directly update count"] [category=stats] [tableID=297] [results.Count=1] [count=1]
[2023/12/22 13:42:34.047 +00:00] [INFO] [analyze.go:745] ["analyze table `test`.`t` has finished"] [partition=p1] ["job info"="analyze table all columns with 256 buckets, 500 topn, 1 samplerate"] ["start time"=2023/12/22 13:42:33.978 +00:00] ["end time"=2023/12/22 13:42:34.044 +00:00] [cost=66.110784ms] ["sample rate reason"="Row count in stats_meta is much smaller compared with the row count got by PD, use min(1, 15000/1) as the sample-rate=1"]
[2023/12/22 13:42:34.110 +00:00] [INFO] [save.go:200] ["incrementally update modifyCount"] [category=stats] [tableID=296] [curModifyCnt=0] [results.BaseModifyCnt=0] [modifyCount=0]
[2023/12/22 13:42:34.110 +00:00] [INFO] [save.go:222] ["directly update count"] [category=stats] [tableID=296] [results.Count=16384] [count=16384]
[2023/12/22 13:42:34.165 +00:00] [INFO] [analyze.go:745] ["analyze table `test`.`t` has finished"] [partition=p0] ["job info"="analyze table all columns with 256 buckets, 500 topn, 1 samplerate"] ["start time"=2023/12/22 13:42:33.978 +00:00] ["end time"=2023/12/22 13:42:34.162 +00:00] [cost=183.824226ms] ["sample rate reason"="Row count in stats_meta is much smaller compared with the row count got by PD, use min(1, 15000/16384) as the sample-rate=1"]
[2023/12/22 13:42:35.305 +00:00] [INFO] [batch_coprocessor.go:571] ["detecting available mpp stores"] [total=1] [alive=0]
[2023/12/22 13:42:35.305 +00:00] [WARN] [batch_coprocessor.go:341] ["Meet regions that don't have an available store. Give up balancing"]
[2023/12/22 13:42:35.305 +00:00] [INFO] [local_mpp_coordinator.go:216] ["Dispatch mpp task"] [timestamp=446497437848961027] [ID=1] [QueryTs=1703252555305385808] [LocalQueryId=332] [ServerID=1885] [address=tiflash0:3930] [plan="Table(t)->HashAgg->Send(-1, )"] [mpp-version=2] [exchange-compression-mode=NONE] [GatherID=1] [resource_group=default]
[2023/12/22 13:42:35.382 +00:00] [WARN] [local_mpp_coordinator.go:639] ["other error"] [txnStartTS=446497437848961027] [storeAddr=tiflash0:3930] [mpp-version=2] [task-id=1] [error="other error for mpp stream: Code: 0, e.displayText() = DB::Exception: code: 10007\nmsg: \"Fail point FailPoints::cop_send_failure is triggered.\"\n, e.what() = DB::Exception,"]
[2023/12/22 13:42:35.385 +00:00] [INFO] [conn.go:1132] ["command dispatched failed"] [conn=3953132560] [session_alias=] [connInfo="id:3953132560, addr:172.19.0.5:47146 status:10, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="select count(*) from test.t"] [txn_mode=PESSIMISTIC] [timestamp=446497437848961027] [err="other error for mpp stream: Code: 0, e.displayText() = DB::Exception: code: 10007\nmsg: \"Fail point FailPoints::cop_send_failure is triggered.\"\n, e.what() = DB::Exception,\ngithub.com/pingcap/tidb/pkg/executor/internal/mpp.(*localMppCoordinator).handleMPPStreamResponse\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/mpp/local_mpp_coordinator.go:635\ngithub.com/pingcap/tidb/pkg/executor/internal/mpp.(*localMppCoordinator).receiveResults\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/mpp/local_mpp_coordinator.go:513\ngithub.com/pingcap/tidb/pkg/executor/internal/mpp.(*localMppCoordinator).handleDispatchReq\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/mpp/local_mpp_coordinator.go:466\ngithub.com/pingcap/tidb/pkg/executor/internal/mpp.(*localMppCoordinator).dispatchAll.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/mpp/local_mpp_coordinator.go:371\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"]
[2023/12/22 13:42:36.307 +00:00] [INFO] [session.go:3890] ["CRUCIAL OPERATION"] [conn=3953132562] [schemaVersion=261] [cur_db=] [sql="drop table if exists test.t"] [user=root@%]
[2023/12/22 13:42:36.317 +00:00] [INFO] [ddl_worker.go:253] ["add DDL jobs"] [category=ddl] ["batch count"=1] [jobs="ID:302, Type:drop table, State:queueing, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:36.272 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0; "] [table=true]
[2023/12/22 13:42:36.317 +00:00] [INFO] [ddl.go:1068] ["start DDL job"] [category=ddl] [job="ID:302, Type:drop table, State:queueing, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:36.272 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query="drop table if exists test.t"]
[2023/12/22 13:42:36.326 +00:00] [INFO] [ddl_worker.go:1000] ["run DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=302] [conn=3953132562] [category=ddl] [job="ID:302, Type:drop table, State:queueing, SchemaState:public, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:36.272 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:36.337 +00:00] [INFO] [domain.go:272] ["diff load InfoSchema success"] [currentSchemaVersion=261] [neededSchemaVersion=262] ["start time"=532.533µs] [gotSchemaVersion=262] [phyTblIDs="[295,296,297]"] [actionTypes="[4,4,4]"] [diffTypes="[\"drop table\"]"]
[2023/12/22 13:42:36.340 +00:00] [INFO] [domain.go:873] ["mdl gets lock, update to owner"] [jobID=302] [version=262]
[2023/12/22 13:42:36.386 +00:00] [INFO] [ddl_worker.go:1234] ["wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [category=ddl] [ver=262] ["take time"=51.974298ms] [job="ID:302, Type:drop table, State:running, SchemaState:write only, SchemaID:2, TableID:295, RowCount:0, ArgLen:2, start time: 2023-12-22 13:42:36.272 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:36.395 +00:00] [INFO] [ddl_worker.go:1000] ["run DDL job"] [worker="worker 12, tp general"] [category=ddl] [jobID=302] [conn=3953132562] [category=ddl] [job="ID:302, Type:drop table, State:running, SchemaState:write only, SchemaID:2, TableID:295, RowCount:0, ArgLen:0, start time: 2023-12-22 13:42:36.272 +0000 UTC, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/12/22 13:42:36.406 +00:00] [INFO] [domain.go:272] ["diff load InfoSchema success"] [currentSchemaVersion=262] [neededSchemaVersion=263] ["start time"=476.122µs] [gotSchemaVersion=263] [phyTblIDs="[]"] [actionTypes="[]"] [diffTypes="[\"drop table\"]"]

@JaySon-Huang
Copy link
Contributor Author

The logic in release-7.1 is skip the non-latest empty SchemaDiff.

SchemaBuilder<Getter, NameMapper> builder(getter, context, databases, used_version);
try
{
for (size_t diff_index = 0; diff_index < diffs.size(); ++diff_index)
{
const auto & schema_diff = diffs[diff_index];
if (!schema_diff)
{
// If `schema diff` got empty `schema diff`(it's not the latest one, due to we check it before), we should just skip it.
//
// example:
// - `cur_version` is 1, `latest_version` is 10
// - The schema diff of schema version [2,4,6] is empty, Then we just skip it.
// - The schema diff of schema version 10 is empty, Then we should just apply version into 9(which we check it before)
LOG_WARNING(log, "Skip the schema diff from version {}. ", cur_version + diff_index + 1);
continue;
}
if (schema_diff->regenerate_schema_map)
{
// If `schema_diff.regenerate_schema_map` == true, return `-1` direclty, let TiFlash reload schema info from TiKV.
LOG_INFO(log, "Meets a schema diff with regenerate_schema_map flag");
return -1;
}
builder.applyDiff(*schema_diff);
}
}

So this bug was introduced by #7437 since 7.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-7.5 This bug affects the 7.5.x(LTS) versions. component/storage report/customer Customers have encountered this bug. severity/major type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant