-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tikv: log slow coprocessor task in detail #6344
Conversation
Add transaction startTS, storeAddr, ExecDetail to the log to make debug easier.
store/tikv/coprocessor.go
Outdated
) | ||
|
||
func (worker *copIteratorWorker) logTimeCopTask(costTime time.Duration, task *copTask, bo *Backoffer, resp *tikvrpc.Response) { | ||
logStr := fmt.Sprintf("[TIME_COP_TASK] cop_resp_time:%s, cop_start_ts:%d, cop_region_id:%d, cop_store_addr:%s", costTime, worker.req.StartTs, task.region.id, task.storeAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add time unit for the cop_resp_time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's time.Duration
type, which unit is included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/cop_start_ts/txnID/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to print the 'cop_' prefix
@AndreMouche PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
store/tikv/coprocessor.go
Outdated
) | ||
|
||
func (worker *copIteratorWorker) logTimeCopTask(costTime time.Duration, task *copTask, bo *Backoffer, resp *tikvrpc.Response) { | ||
logStr := fmt.Sprintf("[TIME_COP_TASK] cop_resp_time:%s, cop_start_ts:%d, cop_region_id:%d, cop_store_addr:%s", costTime, worker.req.StartTs, task.region.id, task.storeAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/cop_start_ts/txnID/ ?
store/tikv/coprocessor.go
Outdated
) | ||
|
||
func (worker *copIteratorWorker) logTimeCopTask(costTime time.Duration, task *copTask, bo *Backoffer, resp *tikvrpc.Response) { | ||
logStr := fmt.Sprintf("[TIME_COP_TASK] cop_resp_time:%s, cop_start_ts:%d, cop_region_id:%d, cop_store_addr:%s", costTime, worker.req.StartTs, task.region.id, task.storeAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to print the 'cop_' prefix
if detail.HandleTime != nil { | ||
processMs := detail.HandleTime.ProcessMs | ||
waitMs := detail.HandleTime.WaitMs | ||
if processMs > minLogKVProcessTime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndreMouche Does coprocessor also has a threshold to fill HandleTime.ProcessMs
and HandleTime.WaitMs
for slow cop requests ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coocood Maybe we just need to print HandleTime.ProcessMs
and HandleTime.WaitMs
once they are set, no need to use another thresholds like minLogKVProcessTime
and minLogKVWaitTime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we set the HandleTime
and ScanDetial
to true
, every request will return them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When will we set ScanDetail
to true
? @coocood
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScanDetail is not set to true now. it's set only when the duration exceeds the threshold(1s) in TiKV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should remove these minXXXX
thresholds defined in this file and do not check them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TiDB should control what to log, without depending on TiKV's implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zz-jason If we log everything, the most important ones will be obscure.
@zz-jason PTAL |
/run-all-tests |
@zz-jason @AndreMouche PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
} else { | ||
detail = resp.Cop.ExecDetails | ||
} | ||
if detail != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are so many if
in if
.. Could we log.Info
through defer
function, and return directly if detail==nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think defer is used for release resource.
It looks weird if the normal execution code is done in defer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Yuxing.
@shenli PTAL |
@@ -604,21 +605,22 @@ func (worker *copIteratorWorker) handleTaskOnce(bo *Backoffer, task *copTask, ch | |||
IsolationLevel: pbIsolationLevel(worker.req.IsolationLevel), | |||
Priority: kvPriorityToCommandPri(worker.req.Priority), | |||
NotFillCache: worker.req.NotFillCache, | |||
HandleTime: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like "EnableHandleTime" is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is defined in protobuf, hard to change.
store/tikv/coprocessor.go
Outdated
|
||
func appendScanDetail(logStr string, columnFamily string, scanInfo *kvrpcpb.ScanInfo) string { | ||
if scanInfo != nil { | ||
if scanInfo.Total > minLogScanTotalInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to check this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not log messages that is not interesting.
LGTM |
Add transaction startTS, storeAddr, ExecDetail to the log to make debug easier.