[Bug]: [1215 main tke regression] sysbench update report 'FATAL: Worker threads failed to initialize within 30 seconds'. #20765
Comments
Same issue as #18725 |
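The initialization timeout is not necessarily related to the excessive S3 connection count problem. The connection count exceeded 10,000 at two points in time, UTC 12-17 17:00 and UTC 12-18 06:00: https://grafana.ci.matrixorigin.cn/goto/JxW1H1INR?orgId=1 The tests running in those time windows were all TPCH, and none of them failed. The logs also contain no "cannot assign requested address" errors: https://grafana.ci.matrixorigin.cn/goto/_5zUH1IHR?orgId=1 Conclusion: "Worker threads failed to initialize within 30 seconds", an error reported on the test tool side, is not equivalent to the excessive S3 connection count problem. |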
One fix for the excessive S3 connection count problem was not included in the runs above: 3c10894 |
The following uses mo-main-nightly-e791f9335-20241217 as a case study for the "Worker threads failed to initialize within 30 seconds" problem: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/12375243410/job/34540441574 The error was first reported at UTC 12-17 22:17:32: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/12375243410/job/34560148100 Logs for the corresponding time window: https://grafana.ci.matrixorigin.cn/goto/_ZFSD1SHR?orgId=1 After filtering out "INFO" and "use of closed network connection": https://grafana.ci.matrixorigin.cn/goto/EKncvJIHR?orgId=1 The remaining errors include the following:
Searching for HAKeeper errors in other time periods as well: https://grafana.ci.matrixorigin.cn/goto/QbO2d1INR?orgId=1 They appear to occur frequently. In summary, the logs show nothing abnormal. |
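The log filtering described in this comment (dropping lines that contain "INFO" or "use of closed network connection" and reading what is left) can be sketched offline as follows. This is only an illustration and not the Grafana/Loki query that was actually used; the function name and the sample log lines are invented for the example.

```python
# Substrings treated as noise, taken from the comment above.
NOISE_MARKERS = ("INFO", "use of closed network connection")


def remaining_errors(lines):
    """Return the log lines that contain none of the noise markers."""
    return [
        line
        for line in lines
        if not any(marker in line for marker in NOISE_MARKERS)
    ]


# Hypothetical log lines, made up purely for illustration.
sample = [
    "2024-12-17 22:17:30 INFO heartbeat sent",
    "2024-12-17 22:17:31 ERROR read tcp: use of closed network connection",
    "2024-12-17 22:17:32 ERROR example failure",
]

for line in remaining_errors(sample):
    print(line)
```

Plain substring matching mirrors the coarse filter described in the comment. It also drops any non-INFO line that merely mentions "INFO", which is acceptable for a first pass but worth keeping in mind.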
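My guess is that the load simply increased, the server could not respond in time, and the client timed out. The client timeout needs to be changed to verify whether this guess is correct. |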
The client timeout has been increased; continuing to observe. |
Continuing to observe. |
Continuing to observe. |
https://github.com/matrixorigin/mo-nightly-regression/actions/workflows/nightly-regression-tke-new.yaml |
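Added --thread-init-timeout=180 to the sysbench invocation, raising the default of 30 seconds to 180 seconds. After two more weeks of observation the problem has not recurred; closed. |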
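As a rough sketch of the change described in the last comment, the snippet below assembles a sysbench command line that passes --thread-init-timeout. The test name oltp_update_non_index, the thread count, and the helper name are assumptions for illustration; the real nightly workflow's arguments, including its database connection options, are not shown in this thread.

```python
import shlex


def build_sysbench_cmd(threads, init_timeout_s=180):
    """Build a sysbench argument list with a longer worker-thread init timeout."""
    return [
        "sysbench",
        "oltp_update_non_index",  # assumed test name, not taken from the thread
        f"--threads={threads}",
        # sysbench's default is 30 seconds; the thread reports raising it to 180.
        f"--thread-init-timeout={init_timeout_s}",
        "run",
    ]


cmd = build_sysbench_cmd(threads=100)  # thread count is hypothetical
print(shlex.join(cmd))
```

Keeping the timeout as a parameter makes it easy to try other values if 180 seconds later proves too short or needlessly long.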
Is there an existing issue for the same bug?
Branch Name
main
Commit ID
0d92298
Other Environment Information
Actual Behavior
job url: https://github.com/matrixorigin/mo-nightly-regression/actions/runs/12330700463/job/34421856358
log: https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22ONj%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-main-nightly-0d9229833-20241214%5C%22%7D%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221734214924000%22,%22to%22:%221734215289000%22%7D%7D%7D&schemaVersion=1&orgId=1
Expected Behavior
No response
Steps to Reproduce
Additional information
No response