-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20483] Mesos Coarse mode may starve other Mesos frameworks #17786
Conversation
Set max cores to a multiple of the smallestExecutorSize
Jenkins, add to whitelist. |
Jenkins, ok to test. |
Test build #76234 has finished for PR 17786 at commit
|
LGTM. Merged into master and branch 2.2. Thanks. |
## What changes were proposed in this pull request? Set maxCores to be a multiple of the smallest executor that can be launched. This ensures that we correctly detect the condition where no more executors will be launched when spark.cores.max is not a multiple of spark.executor.cores ## How was this patch tested? This was manually tested with other sample frameworks measuring their incoming offers to determine if starvation would occur. dbtsai mgummelt Author: Davis Shepherd <[email protected]> Closes #17786 from dgshep/fix_mesos_max_cores. (cherry picked from commit 7633933) Signed-off-by: DB Tsai <[email protected]>
…s frameworks ## What changes were proposed in this pull request? Add test case for scenarios where executor.cores is set as a (non)divisor of spark.cores.max This tests the change in #17786 ## How was this patch tested? Ran the existing test suite with the new tests dbtsai Author: Davis Shepherd <[email protected]> Closes #17788 from dgshep/add_mesos_test. (cherry picked from commit 039e32c) Signed-off-by: DB Tsai <[email protected]>
…s frameworks ## What changes were proposed in this pull request? Add test case for scenarios where executor.cores is set as a (non)divisor of spark.cores.max This tests the change in apache#17786 ## How was this patch tested? Ran the existing test suite with the new tests dbtsai Author: Davis Shepherd <[email protected]> Closes apache#17788 from dgshep/add_mesos_test.
Tests are added in a followup PR. #17788 |
private val maxCores = maxCoresOption.getOrElse(Int.MaxValue) | ||
private val maxCores = { | ||
val cores = maxCoresOption.getOrElse(Int.MaxValue) | ||
// Set maxCores to a multiple of smallest executor we can launch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a log here to let users know?
I agree with @mgummelt, maxCores should have reflected user config option. |
Fair point. This felt like a succinct way to handle this corner case, but if it makes sense to harden the offer refusal code instead, I can update. |
@mgummelt We tested this in our production env, and it solves our issue. Since it seems to be a trivial change, I made my judgement. Gonna wait for more feedback. Thanks. |
Set maxCores to be a multiple of the smallest executor that can be launched. This ensures that we correctly detect the condition where no more executors will be launched when spark.cores.max is not a multiple of spark.executor.cores This was manually tested with other sample frameworks measuring their incoming offers to determine if starvation would occur. dbtsai mgummelt Author: Davis Shepherd <[email protected]> Closes apache#17786 from dgshep/fix_mesos_max_cores.
What changes were proposed in this pull request?
Set maxCores to be a multiple of the smallest executor that can be launched. This ensures that we correctly detect the condition where no more executors will be launched when spark.cores.max is not a multiple of spark.executor.cores
How was this patch tested?
This was manually tested with other sample frameworks measuring their incoming offers to determine if starvation would occur.
@dbtsai @mgummelt