domain: close slow query channel after closing session pool #7847
If the slow query channel is closed before the session pool, some sessions' goroutines may still be writing to the channel. Writing to a closed channel would cause TiDB to panic.
PTAL @crazycs520 @winkyao @coocood
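To make the failure mode concrete, here is a minimal standalone sketch (not TiDB code) showing that a send on a closed channel panics in Go. The function name `sendAfterClose` is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// sendAfterClose is a hypothetical helper: it closes a channel, then
// tries to send on it, and reports whether the send panicked.
func sendAfterClose() (panicked bool) {
	ch := make(chan int, 1)
	close(ch)

	defer func() {
		// A send on a closed channel raises a runtime panic; recover
		// here so the demo can report it instead of crashing.
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	ch <- 1 // panics: send on closed channel
	return false
}

func main() {
	fmt.Println("send after close panicked:", sendAfterClose())
}
```

In a server there is normally no recover around such a send, which is why the order of the two Close calls matters.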
do.sysSessionPool.Close()
do.slowQuery.Close()
Is it enough to put "do.slowQuery.Close()" after line 462?
do.slowQuery.Close() will make the goroutine exit and decrement the WaitGroup count by 1.
If this line is moved to line 462, do.wg.Wait() would wait forever.
Ignore it.
But I think it's useless. After this function is executed, "updateStatsWorker" may still be called by another goroutine. This problem still exists.
Maybe we can replace "ctx"(the updateStatsWorker's argument) with the session created in the sysSessionPool.
Do you mean that after the session pool is closed, updateStatsWorker will still be executing? @zimulala
Yes. Because the session used in statistics bootstrap is not obtained from the session pool. @tiancaiamao
The current session pool implementation blocks while waiting for resources to be put back, so after the pool is closed, there won't be any system session running. @crazycs520 @zimulala
domain/topn_slow_query.go
Outdated
done := false
for !done {
	select {
	case <-q.ch:
Why do we need to drain the channel?
Removed.
LGTM
mu struct {
	sync.RWMutex
	closed bool
How about using atomic instead of a mutex?
atomic can't protect multiple operations. @crazycs520
Closing side:
- modify atomic flag
- close(ch)
Sending side:
- check atomic flag
- ch <-
Take this order for example:
- check atomic flag = success
- modify atomic flag = closed
- close(ch)
- ch <- panic!
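The interleaving above is avoided when the flag check and the send happen under the same lock that Close takes exclusively. Below is a minimal sketch of that idea, assuming a `mu` struct like the one in the diff; the `guardedQueue` type and its `Append` and `Close` methods are hypothetical names, not the code in domain/topn_slow_query.go. The send is non-blocking here so that a sender holding the read lock can never stall Close; that is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedQueue is a hypothetical type for illustration only.
type guardedQueue struct {
	mu struct {
		sync.RWMutex
		closed bool
	}
	ch chan string
}

func newGuardedQueue(size int) *guardedQueue {
	return &guardedQueue{ch: make(chan string, size)}
}

// Append checks the flag and sends while holding the read lock, so
// Close cannot run between the check and the send. It returns false
// if the queue is closed or the buffer is full.
func (q *guardedQueue) Append(s string) bool {
	q.mu.RLock()
	defer q.mu.RUnlock()
	if q.mu.closed {
		return false
	}
	select {
	case q.ch <- s:
		return true
	default:
		return false
	}
}

// Close sets the flag and closes the channel under the write lock,
// which waits for all in-flight Append calls to finish first.
func (q *guardedQueue) Close() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if !q.mu.closed {
		q.mu.closed = true
		close(q.ch)
	}
}

func main() {
	q := newGuardedQueue(1)
	fmt.Println(q.Append("slow query 1")) // accepted while open
	q.Close()
	fmt.Println(q.Append("slow query 2")) // rejected, no panic
}
```

With an atomic flag alone, the check and the send remain two separate steps, so a concurrent Close can still slip in between them.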
great~
LGTM
@zimulala PTAL
/run-all-tests
LGTM
What problem does this PR solve?
If the slow query channel is closed before the session pool, some sessions' goroutines may still be writing to the channel. Writing to a closed channel would cause TiDB to panic.
What is changed and how it works?
After domain.Close() is called, there will be no more sessions executing.