add index task always running after tidb rolling restart #50073
Labels
affects-7.5
This bug affects the 7.5.x(LTS) versions.
component/ddl
This issue is related to DDL of TiDB.
feature/developing
the related feature is in development
severity/major
type/bug
The issue is confirmed as a bug.
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
1、tidb_enable_dist_task='on' and enable global sort
2、run workload
3、add index
4、tidb rolling restart(rolling restart start time:2024-01-04 11:17:04)
2. What did you expect to see? (Required)
add index can success
3. What did you see instead (Required)
add index always running after tidb rolling restart
the status of ddl job is not synced after 1h0m0s (now: 2024-01-04 12:16:09, jobId: 572, job type: add index /* ingest cloud /, state: running)
operatorLogs:
[2024-01-04 11:16:04] ###### start adding index
alter table sbtest1 add index index_test_1704338164805 (c)
[2024-01-04 11:16:04] ###### wait for ddl job finish
[2024-01-04 12:16:09] ###### wait for ddl job finish timeout(1h0m0s)
select job_id, job_type, state from information_schema.ddl_jobs where query = 'alter table sbtest1 add index index_test_1704338164805 (c)'
jobId: 572, job type: add index / ingest cloud */, state: running
tidb logs:
endless-ha-test-add-index-tps-5401458-1-842.tar.gz
[2024/01/04 11:19:48.644 +08:00] [WARN] [scheduler.go:564] ["generate part of subtasks failed"] [task-id=210070] [task-type=backfill] [error="backend context not found"]
[2024/01/04 11:19:48.644 +08:00] [WARN] [scheduler.go:639] ["generate plan failed"] [task-id=210070] [task-type=backfill] [error="backend context not found"] [state=running]
[2024/01/04 11:19:48.644 +08:00] [INFO] [scheduler.go:216] ["schedule task meet err, reschedule it"] [task-id=210070] [task-type=backfill] [error="backend context not found"]
[2024/01/04 11:19:49.137 +08:00] [INFO] [scheduler.go:540] ["on next step"] [task-id=210070] [task-type=backfill] [current-step=2] [next-step=3]
[2024/01/04 11:19:49.137 +08:00] [INFO] [scheduler.go:557] ["eligible instances"] [task-id=210070] [task-type=backfill] [num=1]
[2024/01/04 11:19:49.138 +08:00] [INFO] [backfilling_dist_scheduler.go:90] ["on next subtasks batch"] [type=backfill] [task-id=210070] [curr-step=merge-sort] [next-step=write&ingest]
[2024/01/04 11:19:49.141 +08:00] [INFO] [s3.go:407] ["succeed to get bucket region from s3"] ["bucket region"=us-east-1]
[2024/01/04 11:19:49.141 +08:00] [WARN] [scheduler.go:564] ["generate part of subtasks failed"] [task-id=210070] [task-type=backfill] [error="backend context not found"]
[2024/01/04 11:19:49.141 +08:00] [WARN] [scheduler.go:639] ["generate plan failed"] [task-id=210070] [task-type=backfill] [error="backend context not found"] [state=running]
[2024/01/04 11:19:49.141 +08:00] [INFO] [scheduler.go:216] ["schedule task meet err, reschedule it"] [task-id=210070] [task-type=backfill] [error="backend context not found"]
[2024/01/04 11:19:49.637 +08:00] [INFO] [scheduler.go:540] ["on next step"] [task-id=210070] [task-type=backfill] [current-step=2] [next-step=3]
[2024/01/04 11:19:49.637 +08:00] [INFO] [scheduler.go:557] ["eligible instances"] [task-id=210070] [task-type=backfill] [num=1]
[2024/01/04 11:19:49.638 +08:00] [INFO] [backfilling_dist_scheduler.go:90] ["on next subtasks batch"] [type=backfill] [task-id=210070] [curr-step=merge-sort] [next-step=write&ingest]
[2024/01/04 11:19:49.642 +08:00] [INFO] [s3.go:407] ["succeed to get bucket region from s3"] ["bucket region"=us-east-1]
[2024/01/04 11:19:49.642 +08:00] [WARN] [scheduler.go:564] ["generate part of subtasks failed"] [task-id=210070] [task-type=backfill] [error="backend context not found"]
4. What is your TiDB version? (Required)
./tidb-server -V
Release Version: v7.6.0-alpha
Edition: Community
Git Commit Hash: 1f49f2d
Git Branch: heads/refs/tags/v7.6.0-alpha
UTC Build Time: 2024-01-03 11:44:08
GoVersion: go1.21.5
Race Enabled: false
Check Table Before Drop: false
Store: unistore
2024-01-04T11:15:57.092+0800
The text was updated successfully, but these errors were encountered: