-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix](scanner) Fix incorrect _max_thread_num in scanner context #40569
Conversation
zhiqiang-hhhh
commented
Sep 9, 2024
•
edited
Loading
edited
- Minor refactor for scanner constructor, calculation of _max_thread_num is moved to init method
- The expected value of _max_thread_num is changed. There is no need to submit too many scan task to scan scheduler, since thread num is limited.
- Calculation of _max_bytes_in_queue is changed. _max_bytes_in_queue for each scan instance is limited to 100MB by default.
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
run buildall |
run buildall |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 37813 ms
|
TPC-DS: Total hot run time: 192190 ms
|
ClickBench: Total hot run time: 31.31 s
|
run cloud_p0 |
run buildall |
// scan_queue_mem_limit on FE is 100MB by default, on backend we will make sure its actual value | ||
// is larger than 10MB. | ||
_max_bytes_in_queue = | ||
std::max(_state->scan_queue_mem_limit(), (int64_t)1024 * 1024 * 1024 * 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是10MB, 不是10gb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
TeamCity be ut coverage result: |
@@ -117,7 +117,7 @@ Status ScanLocalState<Derived>::open(RuntimeState* state) { | |||
RETURN_IF_ERROR(status); | |||
if (_scanner_ctx) { | |||
DCHECK(!_eos && _num_scanners->value() > 0); | |||
RETURN_IF_ERROR(_scanner_ctx->init()); | |||
RETURN_IF_ERROR(_scanner_ctx->init(p.ignore_data_distribution())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可能把,这个参数,作为scanner context的构造函数的参数,传递给scanner context,这样你往2.1 pick的时候会比较方便。否则2.1 在多种pipeline 模型下,代码不好pick
run buildall |
run buildall |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 39300 ms
|
TPC-DS: Total hot run time: 197893 ms
|
ClickBench: Total hot run time: 31.46 s
|
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
run performance |
TPC-H: Total hot run time: 39569 ms
|
TPC-DS: Total hot run time: 193174 ms
|
ClickBench: Total hot run time: 30.65 s
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…he#40569) 1. Minor refactor for scanner constructor, calculation of _max_thread_num is moved to init method 2. The expected value of _max_thread_num is changed. There is no need to submit too many scan task to scan scheduler, since thread num is limited. 3. Calculation of _max_bytes_in_queue is changed. _max_bytes_in_queue for each scan instance is limited to 100MB by default.
…xt (apache#40569)" This reverts commit a912f77.
…xt (apache#40569)" This reverts commit a912f77.
1. Minor refactor for scanner constructor, calculation of _max_thread_num is moved to init method 2. The expected value of _max_thread_num is changed. There is no need to submit too many scan task to scan scheduler, since thread num is limited. 3. Calculation of _max_bytes_in_queue is changed. _max_bytes_in_queue for each scan instance is limited to 100MB by default.