Allow configuring driver threads based on the number of cores #22809
long threads;
if (value.endsWith(PER_CORE_SUFFIX)) {
    long multiplier = parseLong(value.substring(0, value.length() - PER_CORE_SUFFIX.length()).trim());
Do you think there's any benefit in allowing floating-point values, e.g. 1.25C, 1.5C, etc.? It may introduce some additional complexity, in that we'd need to round or floor non-integer results. But if you have a large machine, say 48 cores, you would be stuck with only 48, 96, or even larger multiples, with no in-between. That seems like a pretty big jump, and I think allowing a floating-point value would give some additional flexibility for those cases.
I think there is. If IO takes on average less than 50% of scheduled time, then setting it to 2x the cores might be too much. Let me add support for fractions.
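For readers following the thread, here is a minimal sketch of how a fractional `<multiplier>C` value could be resolved. It is not the code from this PR: the class name `ThreadCountSketch`, the explicit `availableCores` parameter, and the choice to round down with a floor of one thread are all assumptions made for illustration. Only `PER_CORE_SUFFIX` mirrors a name visible in the diff excerpt above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration; not the ThreadCountParser from the PR.
public final class ThreadCountSketch {
    private static final String PER_CORE_SUFFIX = "C";

    private ThreadCountSketch() {}

    // availableCores is passed in explicitly so the sketch does not
    // depend on how the real code detects the number of cores.
    public static int parse(String value, int availableCores) {
        String trimmed = value.trim();
        long threads;
        if (trimmed.endsWith(PER_CORE_SUFFIX)) {
            String multiplierText = trimmed
                    .substring(0, trimmed.length() - PER_CORE_SUFFIX.length())
                    .trim();
            // BigDecimal avoids binary floating-point surprises for inputs like "0.1C".
            BigDecimal multiplier = new BigDecimal(multiplierText);
            if (multiplier.signum() <= 0) {
                throw new IllegalArgumentException("Multiplier must be positive: " + value);
            }
            // Assumption: round down, but never resolve to fewer than one thread.
            threads = multiplier
                    .multiply(BigDecimal.valueOf(availableCores))
                    .setScale(0, RoundingMode.FLOOR)
                    .longValueExact();
            threads = Math.max(1, threads);
        }
        else {
            // A plain number is taken as an absolute thread count.
            threads = Long.parseLong(trimmed);
        }
        if (threads < 1 || threads > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("Thread count out of range: " + value);
        }
        return (int) threads;
    }
}
```

Under these assumptions, on the 48-core machine from the comment above, `parse("1.25C", 48)` resolves to 60 and `parse("1.5C", 48)` to 72, which fills the gap between 48 and 96.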
@ZacBlanco I saw a similar comment on the original PR, and I think he made a valid point. Also, when we do perf testing, the number of threads is usually a multiple of the number of cores, and that multiple is usually 1 or 2. 2 is mostly used on Intel CPUs, which support hyper-threading. I think this change is fine, and the main use case is to allow 1C instead of only 2C. On a system like Presto, where the CPU was never saturated, a higher number might help; we can revisit this in the future if needed.
presto-main/src/main/java/com/facebook/presto/execution/ThreadCountParser.java
I agree that on most systems you either have hyperthreading or not, which is why you might choose to have only 1C or 2C available, so it makes sense. However, given that Presto has many other configurations (exchange client/server threads, etc.), I can imagine a case where you may not want to always allocate 100% or 200% of the available physical cores, in order to reserve some cores for IO. As an example, on a 32-core machine you might want to allocate 0.9C for task execution and leave the remaining cores available for exchange clients, configurable as 0.1C or something. This is one particular use case. I'm not sure what other thread pools the worker might use, but if you want to preserve some core allocation for background tasks, I think fractional values have a use case. On the Java side I traced this particular configuration to see if the …
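The 32-core example above can be worked through in a few lines. This is only a sketch: `CoreBudgetExample` and `threadsFor` are hypothetical names, rounding down is an assumption, and the `0.1C` setting for exchange clients is the commenter's hypothetical rather than an existing option.

```java
// Hypothetical illustration of the 0.9C / 0.1C split on a 32-core machine.
public final class CoreBudgetExample {
    private CoreBudgetExample() {}

    // Assumption: a fractional result is rounded down, with a minimum of one thread.
    public static int threadsFor(double multiplier, int cores) {
        return Math.max(1, (int) Math.floor(multiplier * cores));
    }

    public static void main(String[] args) {
        int cores = 32;
        int taskThreads = threadsFor(0.9, cores);     // 0.9 * 32 = 28.8, rounded down
        int exchangeThreads = threadsFor(0.1, cores); // 0.1 * 32 = 3.2, rounded down
        System.out.println("task threads:     " + taskThreads);
        System.out.println("exchange threads: " + exchangeThreads);
        System.out.println("unassigned cores: " + (cores - taskThreads - exchangeThreads));
    }
}
```

With flooring, the split comes to 28 task threads and 3 exchange threads, so one core's worth of the budget is lost to rounding. That is the kind of detail the rounding rule would need to pin down.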
da9d164 to 53bfe50
Thanks for the review, everyone. Added support for fractional values, as I realized it may actually be useful for our deployment. Please take another look.
53bfe50 to 9d87ffc
Yeah, these make sense, but Java Presto does not do async IO, and as you pointed out, exchange clients are a separate thread pool, so this config is only for the worker threads. But I agree that allowing fractions gives us more flexibility.
Thanks for asking my opinion! Made a suggestion how to address the issue, and a couple nits about style. Looks good.
9d87ffc to de38234
Thanks for taking a look @steveburnett, updated.
LGTM! (docs)
Pull updated branch, new local docs build, everything looks good. Thanks!
Description
Allow configuring task.max-worker-threads as <multiplier>C, where C will be automatically resolved to the number of physical cores.
Based on trinodb/trino#20772
Motivation and Context
It is not uncommon for clusters to consist of machines from different CPU generations with slightly different core counts. For example, Intel Skylake processors usually come with a smaller number of slightly faster cores, while comparable Cooper Lake CPUs come with a higher number of slightly slower cores. Configuring a cluster for a fixed number of cores may underutilize one type of machine or the other.
Test Plan
Unit test