One TTL task losing heartbeat will block other tasks from getting heartbeat #57915

Closed
YangKeao opened this issue Dec 3, 2024 · 1 comment · Fixed by #57919
Labels
affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. severity/moderate sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.

Comments

YangKeao commented Dec 3, 2024

See the following code:

// updateHeartBeat updates the heartbeat for all tasks with current instance as owner
func (m *taskManager) updateHeartBeat(ctx context.Context, se session.Session, now time.Time) error {
	for _, task := range m.runningTasks {
		state := &cache.TTLTaskState{
			TotalRows:   task.statistics.TotalRows.Load(),
			SuccessRows: task.statistics.SuccessRows.Load(),
			ErrorRows:   task.statistics.ErrorRows.Load(),
		}
		if task.result != nil && task.result.err != nil {
			state.ScanTaskErr = task.result.err.Error()
		}

		intest.Assert(se.GetSessionVars().Location().String() == now.Location().String())
		sql, args, err := updateTTLTaskHeartBeatSQL(task.JobID, task.ScanID, now, state, m.id)
		if err != nil {
			return err
		}
		_, err = se.ExecuteSQL(ctx, sql, args...)
		if err != nil {
			return errors.Wrapf(err, "execute sql: %s", sql)
		}

		if se.GetSessionVars().StmtCtx.AffectedRows() != 1 {
			return errors.Errorf("fail to update task status, maybe the owner is not myself (%s), affected rows: %d",
				m.id, se.GetSessionVars().StmtCtx.AffectedRows())
		}
	}
	return nil
}

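updateHeartBeat returns an error as soon as one task fails to update, so the remaining tasks in m.runningTasks never get their heartbeat refreshed in that round. Instead, we should log the error and continue with the next task.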
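The "log the error and continue" behavior could look roughly like the sketch below. It is not the actual patch from #57919: ttlTask, errNotOwner, and the update callback are hypothetical stand-ins for the real task type, the "affected rows != 1" error, and the SQL update, and the standard log package is used in place of TiDB's logger.

```go
package main

import (
	"errors"
	"log"
)

// errNotOwner is a hypothetical sentinel standing in for the
// "affected rows != 1, maybe the owner is not myself" error.
var errNotOwner = errors.New("fail to update task status, maybe the owner is not myself")

// ttlTask is a simplified, hypothetical stand-in for a running TTL scan task.
type ttlTask struct {
	JobID  string
	ScanID int64
}

// updateHeartBeats calls update for every task. A failure is logged and
// recorded, but it does not stop the loop, so one task that lost its
// heartbeat cannot block the others. It returns the tasks that failed.
func updateHeartBeats(tasks []ttlTask, update func(ttlTask) error) []ttlTask {
	var failed []ttlTask
	for _, t := range tasks {
		if err := update(t); err != nil {
			log.Printf("fail to update task heartbeat: jobID=%s scanID=%d err=%v",
				t.JobID, t.ScanID, err)
			failed = append(failed, t)
			continue
		}
	}
	return failed
}

func main() {
	tasks := []ttlTask{
		{JobID: "job-1", ScanID: 1},
		{JobID: "job-1", ScanID: 2},
		{JobID: "job-2", ScanID: 1},
	}
	// Simulate the second task no longer being owned by this instance.
	failed := updateHeartBeats(tasks, func(t ttlTask) error {
		if t.JobID == "job-1" && t.ScanID == 2 {
			return errNotOwner
		}
		return nil
	})
	log.Printf("%d of %d heartbeat updates failed", len(failed), len(tasks))
}
```

Returning the failed tasks instead of the first error lets the caller decide what to do with them, while every task still gets one heartbeat attempt per round.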

@YangKeao YangKeao added the type/bug The issue is confirmed as a bug. label Dec 3, 2024
YangKeao commented Dec 3, 2024

The job manager has a similar issue, but it does not occur in normal cases, so it is not a big problem there. I will fix it as well.

@jebter jebter added the sig/sql-infra SIG: SQL Infra label Dec 4, 2024
@ti-chi-bot ti-chi-bot bot closed this as completed in 0392cdd Dec 4, 2024
@YangKeao YangKeao added affects-8.5 This bug affects the 8.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. labels Jan 6, 2025