[SPARK-17759] [CORE] Avoid adding duplicate schedulables #15326

erenavsarogullari · 2016-10-02T12:17:14Z

## What changes were proposed in this pull request?

If the file referenced by spark.scheduler.allocation.file defines duplicate pools, all of them are created when SparkContext is initialized, but only one pool per name is used and the others are redundant. This redundant pool creation needs to be fixed.

Code to Reproduce :

val conf = new SparkConf().setAppName("spark-fairscheduler").setMaster("local")
conf.set("spark.scheduler.mode", "FAIR")
conf.set("spark.scheduler.allocation.file", "src/main/resources/fairscheduler-duplicate-pools.xml")
val sc = new SparkContext(conf)

fairscheduler-duplicate-pools.xml :

The following sample shows just two default pools and two duplicate_pool1 pools, but the fix also needs to handle N default and/or other duplicate pools.

<allocations>
    <pool name="default">
        <minShare>0</minShare>
        <weight>1</weight>
        <schedulingMode>FIFO</schedulingMode>
    </pool>
    <pool name="default">
        <minShare>0</minShare>
        <weight>1</weight>
        <schedulingMode>FIFO</schedulingMode>
    </pool>
    <pool name="duplicate_pool1">
        <minShare>1</minShare>
        <weight>1</weight>
        <schedulingMode>FAIR</schedulingMode>
    </pool>
    <pool name="duplicate_pool1">
        <minShare>2</minShare>
        <weight>2</weight>
        <schedulingMode>FAIR</schedulingMode>
    </pool>
</allocations>
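A quick way to spot such duplicates before starting a SparkContext is to scan the allocation file for repeated pool names. The following is a minimal sketch using only the JDK XML parser; `DuplicatePoolCheck` and `duplicatePoolNames` are hypothetical names chosen for illustration and are not part of Spark.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

// Hypothetical helper, not part of Spark: reports pool names that occur
// more than once in a fair scheduler allocation XML document.
object DuplicatePoolCheck {
  def duplicatePoolNames(xml: String): Seq[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new InputSource(new StringReader(xml)))
    val pools = doc.getElementsByTagName("pool")
    // Collect the name attribute of every <pool> element; pools without a name are skipped.
    val names = (0 until pools.getLength).flatMap { i =>
      Option(pools.item(i).getAttributes.getNamedItem("name")).map(_.getNodeValue)
    }
    // Keep each name that appears at least twice, in first-seen order.
    names.distinct.filter(n => names.count(_ == n) > 1)
  }
}
```

Applied to the sample file above, this sketch would report both default and duplicate_pool1.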

Debug Screenshot :
Debugging shows that Pool.schedulableQueue (ConcurrentLinkedQueue[Schedulable]) holds 4 pools:

default, default, duplicate_pool1 and duplicate_pool1

but Pool.schedulableNameToSchedulable (ConcurrentHashMap[String, Schedulable]) holds only

default and duplicate_pool1

because the pool name is the key. One default and one duplicate_pool1 are therefore redundant and live only in Pool.schedulableQueue.
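The mismatch between the two structures can be reproduced outside Spark. This is a minimal sketch, assuming a simplified stand-in type (`FakePool`) in place of Spark's Schedulable; `DuplicateBookkeeping` and `addAll` are hypothetical names for illustration only.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}

// Simplified stand-in for Pool's bookkeeping; not Spark code.
object DuplicateBookkeeping {
  final case class FakePool(name: String, minShare: Int, weight: Int)

  // Adds every pool to both structures without any duplicate check
  // and returns (queue size, map size).
  def addAll(pools: Seq[FakePool]): (Int, Int) = {
    val queue = new ConcurrentLinkedQueue[FakePool]()
    val byName = new ConcurrentHashMap[String, FakePool]()
    pools.foreach { p =>
      queue.add(p)          // the queue keeps every element, duplicates included
      byName.put(p.name, p) // a later pool with the same name overwrites the entry
    }
    (queue.size, byName.size)
  }
}
```

With the four pools from the sample file, the queue ends up with four entries while the map keeps two.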

## How was this patch tested?

Added a new unit test case.

markhamstra · 2016-10-03T17:55:34Z

I've got some issues with this PR.

You've unnecessarily changed the prior behavior where the last added pool wins. We should preserve that behavior in case anyone was relying upon it to handle their broken configuration.
Adding duplicate TaskSetManagers (or other Schedulable types in the future) would also be an error, so I'd rather do the checking and de-duplication in addSchedulable instead of in buildFairSchedulerPool. Like this in Pool:

  override def addSchedulable(schedulable: Schedulable): Unit = {
    require(schedulable != null)
    val name = schedulable.name
    // put returns the previously mapped Schedulable, or null if there was none
    val previous = schedulableNameToSchedulable.put(name, schedulable)
    if (previous == null) {
      schedulableQueue.add(schedulable)
    } else {
      logWarning(s"Duplicate Schedulable added: $name")
      // remove previously enqueued schedulable with same name
      schedulableQueue.remove(previous)
      if (!schedulableQueue.contains(schedulable)) {
        schedulableQueue.add(schedulable)
      }
    }
    schedulable.parent = this
  }

The only routes to this code are from the one-time initialization of the backend on the creation of a SparkContext and from schedulableBuilder.addTaskSetManager in TaskSchedulerImpl#submitTasks, which is synchronized -- so the above should be fine in terms of thread safety.

This is also the only route by which a Schedulable can be added to schedulableQueue, so with this code there should never be more than one prior instance of a queued Schedulable with the same name that needs to be removed before replacing it with the duplicate (which matches the schedulableNameToSchedulable entry).

This should pass all the existing tests; but if you do something more severe than just logging the warning (such as adding a System.exit to that branch), you'll find that we are actually doing some abusive adding of duplicate schedulables within some tests.

@kayousterhout

kayousterhout · 2016-10-03T21:28:16Z

+1 to Mark's suggestion about putting the behavior in Pool::addSchedulable, so that it works across all Schedulables.

I'm less convinced about maintaining the existing behavior where the last added pool wins. I think people usually expect first-wins behavior for these things (e.g., that's the case for Java classpath issues), and the code is also simpler for that approach. Given that this should be affecting pretty few people (and there's now a warning logged), I think this would be fine to change.
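For comparison, a first-added-wins variant can be sketched as follows. This is an illustration under simplifying assumptions, not the code that was committed: `FirstWinsPool`, `SimplePool` and `Entry` are hypothetical stand-ins for Pool and Schedulable, and the warning is only indicated by a comment.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}

// Hypothetical stand-ins for Pool and Schedulable; not Spark code.
object FirstWinsPool {
  final class Entry(val name: String, val weight: Int)

  final class SimplePool {
    private val queue = new ConcurrentLinkedQueue[Entry]()
    private val byName = new ConcurrentHashMap[String, Entry]()

    // Returns true if the entry was added, false if an entry with the
    // same name was already registered (the first one is kept).
    def add(entry: Entry): Boolean = {
      require(entry != null)
      if (byName.putIfAbsent(entry.name, entry) == null) {
        queue.add(entry)
        true
      } else {
        false // a real implementation would log a warning here
      }
    }

    // Weights of the queued entries, in insertion order.
    def queuedWeights: Seq[Int] = {
      val it = queue.iterator()
      val buf = Vector.newBuilder[Int]
      while (it.hasNext) buf += it.next().weight
      buf.result()
    }

    def lookup(name: String): Option[Entry] = Option(byName.get(name))
  }
}
```

Because `putIfAbsent` never replaces an existing mapping, nothing has to be removed from the queue, which is the sense in which the first-wins code is simpler.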

markhamstra · 2016-10-03T21:33:13Z

@kayousterhout @rxin Ok, but if we're going to change the behavior, then we need to be sure that change at least makes it into the release notes.

kayousterhout · 2016-10-03T22:40:25Z

@markhamstra if we do change it, we should prob merge only into master and not 2.0.1.

markhamstra · 2016-10-03T22:42:07Z

Yes, that is what I was thinking regardless.

markhamstra · 2016-10-03T22:43:12Z

...and it would be 2.0.2 at this point. :)

erenavsarogullari · 2016-10-03T23:07:32Z

Firstly, thanks @markhamstra and @kayousterhout for the quick feedback.

I was aware of putting the check in Pool.addSchedulable to cover for all Schedulables (Pool and TaskSetManager for now). My concern was that TaskSetManager uses the following naming-pattern:

var name = "TaskSet_" + taskSet.stageId.toString

and if a stage can have multiple TaskSets/TaskSetManagers, this could be a problem. However, your comments have already clarified this, so I agree with moving the check to Pool.addSchedulable and will address it in the next commit ;)

markhamstra · 2016-10-03T23:23:25Z

@erenavsarogullari Your concern was entirely legitimate, and is also why I called in @kayousterhout to double-check my claim that other duplicate Schedulables would also be a problem.

kayousterhout · 2016-10-03T23:36:18Z

@erenavsarogullari @markhamstra I just looked at this further and I actually think this could be an issue:

(1) The first attempt for a stage fails with a fetch failure. The associated TaskSetManager is marked as a zombie but some tasks are still running, so removeSchedulable isn't called yet (it gets called in TaskSchedulerImpl only after all running tasks in the stage have finished).

(2) The map stage re-runs and a new attempt for the stage begins. This attempt has the same stage ID, so will have the same schedulable name.

I remember having a long discussion about (1) a while ago (and when we should call removeSchedulable) and decided it was most "fair" to call it only after all running tasks complete, because running tasks should be counted towards the pool's share even when they're for zombie task sets.

I think in this case, the last-schedulable-wins policy that currently exists seems better, although still wrong. I'd argue we should first fix this bug (by giving each TaskSetManager a unique name), in a separate PR, and then do the fix to this PR that you suggested, Mark. The other fix seems like it should be relatively simple. @markhamstra thoughts? Does that seem reasonable and do you buy the bug description above?

erenavsarogullari · 2016-10-04T00:21:36Z

Thanks @kayousterhout and @markhamstra.
In light of this decision, I can happily work on the new ticket (uniqueness of TaskSetManager name) as well ;)

erenavsarogullari · 2016-10-12T22:09:37Z

Hi @kayousterhout and @markhamstra,

In light of our previous discussion, I committed a second patch as e126cd8. This PR's changeset looks OK, but it still needs unique TaskSetManager names to cover all duplicate Schedulable cases, so I raised SPARK-17894 for uniqueness of the TaskSetManager name. It proposes a new TaskSetManager name pattern for review, and I will publish a separate PR ;)

`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** : There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** : `var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** : `val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari

Closes apache#15463 from erenavsarogullari/SPARK-17894.
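The naming collision and the proposed remedy can be illustrated with a small sketch. `TaskSetNames` and `FakeTaskSet` are hypothetical stand-ins, not Spark's TaskSet class; the only thing taken from the discussion is the name format, with the id assumed to be the stage id and the stage attempt id joined by a dot.

```scala
// Hypothetical stand-ins for illustration; not Spark code.
object TaskSetNames {
  final case class FakeTaskSet(stageId: Int, stageAttemptId: Int) {
    // Assumed id format: stageId + "." + stageAttemptId
    val id: String = s"$stageId.$stageAttemptId"
  }

  // Name based on the stage id only: two attempts of one stage collide.
  def oldName(ts: FakeTaskSet): String = "TaskSet_" + ts.stageId.toString

  // Name based on the full id: each stage attempt gets its own name.
  def proposedName(ts: FakeTaskSet): String = "TaskSet_" + ts.id
}
```

Under the old pattern, a zombie first attempt and its retry for stage 0 share one name; under the proposed pattern they get distinct names.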

erenavsarogullari · 2016-10-25T22:47:32Z

Hi @kayousterhout and @markhamstra,

Related PR #15463 is merged and this PR is ready for review. Also previous comments have been addressed with e126cd8ec51b11fed5ffaab376c6d4a451086cac.

Thanks in advance.

erenavsarogullari · 2016-10-27T17:28:18Z

Kind reminder :)
cc @kayousterhout @markhamstra


erenavsarogullari · 2016-11-03T21:54:11Z

Hi @kayousterhout and @markhamstra,
This PR is ready for review whenever possible.
All feedback is welcome, and thanks in advance ;)

kayousterhout · 2016-12-16T01:31:20Z

@erenavsarogullari I looked at this again (sorry for the long delay), and it looks like you maintained the old behavior of last-added-wins. I thought from the discussion above, @markhamstra was ok with first-added-wins, which I think is more intuitive and consistent with other locations in the codebase. Did I miss something from above?


kayousterhout · 2017-02-07T00:17:08Z

@erenavsarogullari what's the status of this PR?

kayousterhout · 2017-03-24T00:17:05Z

@erenavsarogullari is this ready to be updated now that #16813 has been merged?

erenavsarogullari · 2017-03-24T00:30:07Z

@kayousterhout, thanks for querying this PR.
It is not ready yet and needs a small change but it will be ready in a couple of days. Will let you know ;)

erenavsarogullari · 2017-03-29T22:41:59Z

Jenkins test this please

erenavsarogullari · 2017-03-30T08:31:27Z

Hi @kayousterhout and @markhamstra,

This PR is ready for review, now following the first-added-wins pattern for Schedulables (Pool / TaskSetManager). All feedback is welcome.
Two TaskSchedulerImpl unit test cases failed due to duplicate TaskSet submission, so they have also been fixed in the latest commit: dd302ffb44e01280b42d130bee3ede9c81fd4839
Also, Jenkins needs to be triggered.

Thanks.

kayousterhout · 2017-04-02T23:00:12Z

Jenkins this is OK to test

kayousterhout · 2017-04-02T23:02:42Z

core/src/test/scala/org/apache/spark/scheduler/PoolSuite.scala

+    val taskSetManager0 = createTaskSetManager(0, 2, taskScheduler)
+    val taskSetManager1 = createTaskSetManager(1, 2, taskScheduler)
+    schedulableBuilder.addTaskSetManager(taskSetManager0, null)
+    schedulableBuilder.addTaskSetManager(taskSetManager0, null)


When would this happen? (the same TSM getting added twice)

erenavsarogullari · 2017-06-08

The related fix aims to avoid adding duplicate schedulables (Pool and TSM). However, a new TSM is created for each submitted TaskSet, and duplicate TSM submission is not expected behaviour (although the logic is robust enough for this case), so these test cases can be removed for clarity.


kayousterhout · 2017-04-02T23:04:27Z

core/src/test/scala/org/apache/spark/scheduler/PoolSuite.scala

+    val taskSetManager0 = createTaskSetManager(0, 2, taskScheduler)
+    val taskSetManager1 = createTaskSetManager(1, 2, taskScheduler)
+    schedulableBuilder.addTaskSetManager(taskSetManager0, null)
+    schedulableBuilder.addTaskSetManager(taskSetManager0, null)


similar to the above -- when could this happen?

HyukjinKwon · 2017-05-11T13:36:00Z

Hi @erenavsarogullari, is this still active? If so, could you address the comments above? Otherwise, I would like to propose closing this.

erenavsarogullari · 2017-05-11T17:58:47Z

Hi @HyukjinKwon, thanks for following this PR. I will be busy for a while. The implementation is done and only the last comments need to be addressed.

I will address them as soon as possible. Thanks ;)

erenavsarogullari · 2017-06-08T01:47:48Z

Hi @HyukjinKwon, thanks for following up on this PR again. This fix still looks necessary, but I am too busy at the moment. The fix is ready and I will address the last comments as soon as possible. Sorry again for the delay ;)

HyukjinKwon · 2017-06-08T02:07:39Z

I will take this off the list. Thanks for your input.

HyukjinKwon · 2017-06-08T02:11:17Z

Gentle ping @kayousterhout.

kayousterhout · 2017-06-08T05:03:49Z

@HyukjinKwon what's the ping here for? It looks like I left some comments that @erenavsarogullari will address when he has time.

HyukjinKwon · 2017-06-08T05:26:39Z

I am sorry, I misunderstood and thought it was almost (or already) ready. I will read the comments more carefully next time.

erenavsarogullari · 2017-06-08T06:25:09Z

Hi @kayousterhout,
Many thanks again for your review, and sorry about the delay.
The last comments have just been addressed and the patch is ready for re-review.

erenavsarogullari · 2017-09-11T20:58:45Z

Hi @kayousterhout,
Many thanks again for your review.
The patch is ready for re-review.


erenavsarogullari · 2019-06-01T10:25:08Z

Jenkins test this please

erenavsarogullari · 2019-06-01T10:38:40Z

Hi All,
Sorry for the delay on this PR. The last comments have been addressed and the merge conflict has been fixed.
The patch is ready for re-review. Please let me know if you need further details. Thanks in advance 👍
cc @kayousterhout

erenavsarogullari · 2019-06-02T08:55:15Z

retest this please

github-actions · 2020-01-17T00:13:57Z

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

erenavsarogullari changed the title ~~[SPARK-17759] [CORE] SchedulableBuilder should avoid to create duplicate fair scheduler-pools~~ [SPARK-17759] [CORE] FairSchedulableBuilder should avoid to create duplicate fair scheduler-pools Oct 2, 2016

erenavsarogullari changed the title ~~[SPARK-17759] [CORE] FairSchedulableBuilder should avoid to create duplicate fair scheduler-pools~~ [SPARK-17759] [CORE] Avoid adding of duplicate schedulables Oct 13, 2016

erenavsarogullari mentioned this pull request Oct 13, 2016

[SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager name. #15463

Closed

erenavsarogullari changed the title ~~[SPARK-17759] [CORE] Avoid adding of duplicate schedulables~~ [SPARK-17759] [CORE] Avoid adding duplicate schedulables Oct 13, 2016

erenavsarogullari mentioned this pull request Feb 8, 2017

[SPARK-18066] [CORE] [TESTS] Add Pool usage policies test coverage for FIFO & FAIR Schedulers #15604

Closed

kayousterhout suggested changes Apr 2, 2017


HyukjinKwon mentioned this pull request Jun 7, 2017

[INFRA] Close stale PRs #18223

Closed

erenavsarogullari and others added 5 commits June 1, 2019 11:08

SchedulableBuilder should avoid to create duplicate fair scheduler-pools.

8685ae9

FairSchedulableBuilder should avoid to create duplicate schedulable.

8dc728f

Review comments are addressed.

04e06fc

Review comments are addressed.

7db6825

Review comments are addressed and merge conflict has been fixed.

5d44ed8

dongjoon-hyun added the SCHEDULER label Jun 14, 2019

github-actions bot added the Stale label Jan 17, 2020

github-actions bot closed this Jan 18, 2020
