-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spool: fix data race when to exit #42129
Conversation
Signed-off-by: Weizhen Wang <[email protected]>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/check-issue-triage-complete |
Signed-off-by: Weizhen Wang <[email protected]>
resourcemanager/pool/spool/spool.go
Outdated
@@ -142,7 +145,12 @@ func (p *Pool) RunWithConcurrency(fns chan func(), concurrency uint32) error { | |||
|
|||
// checkAndAddRunning is to check if a task can run. If can, add the running number. | |||
func (p *Pool) checkAndAddRunning(concurrency uint32) (conc int32, run bool) { | |||
p.waiting.Add(1) | |||
defer p.waiting.Add(-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the waiting variable modification into Run? Because it may call p.wg.Add(1)
after this checking and the check(for p.waiting.Load() > 0 {...}
) in ReleaseAndWait
may pass after checkAndAddRunning
and before p.wg.Add(1)
, then there is still possible race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right.
resourcemanager/pool/spool/spool.go
Outdated
@@ -173,6 +181,10 @@ func (p *Pool) checkAndAddRunningInternal(concurrency int32) (conc int32, run bo | |||
// ReleaseAndWait releases the pool and waits for all tasks to be completed. | |||
func (p *Pool) ReleaseAndWait() { | |||
p.isStop.Store(true) | |||
// wait for all the task in the pending to exit | |||
for p.waiting.Load() > 0 { | |||
time.Sleep(waitInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to confirm the usage of ReleaseAndWait
. If it may be called in Executor.Close
, 5ms
sleep is unacceptable. If it's called when TiDB exiting or kvstore closing, this LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove 5ms
sleep first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A spinlock is risky because it may loop here while the Run
or RunWithConcurrency
threads are scheduled out, why not use a condition variable here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 6277e79
|
What problem does this PR solve?
Issue Number: close #42130
Problem Summary:
when we are in exit mode, Some tasks still want to start a new goroutine and call
waitgroup.Add
. but at the same time, we are calling thewaitgroup.Wait
.the data race will happen.
What is changed and how it works?
Adding the logic of checking the exit mode when to check the running exit and wait to make all tasks called
waitgroup.Add
.Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.