Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17759] [CORE] Avoid adding duplicate schedulables #15326

Closed
wants to merge 5 commits into from
Closed

[SPARK-17759] [CORE] Avoid adding duplicate schedulables #15326

wants to merge 5 commits into from

Conversation

erenavsarogullari
Copy link
Member

@erenavsarogullari erenavsarogullari commented Oct 2, 2016

What changes were proposed in this pull request?

If spark.scheduler.allocation.file has duplicate pools, all of them are created when SparkContext is initialized but just one of them is used and the other ones look redundant. This causes redundant pool creation and needs to be fixed.

Code to Reproduce :

val conf = new SparkConf().setAppName("spark-fairscheduler").setMaster("local")
conf.set("spark.scheduler.mode", "FAIR")
conf.set("spark.scheduler.allocation.file", "src/main/resources/fairscheduler-duplicate-pools.xml")
val sc = new SparkContext(conf)

fairscheduler-duplicate-pools.xml :

The following sample just shows two default and duplicate_pool1 but this also needs to be thought for N default and/or other duplicate pools.

<allocations>
    <pool name="default">
        <minShare>0</minShare>
        <weight>1</weight>
        <schedulingMode>FIFO</schedulingMode>
    </pool>
    <pool name="default">
        <minShare>0</minShare>
        <weight>1</weight>
        <schedulingMode>FIFO</schedulingMode>
    </pool>
    <pool name="duplicate_pool1">
        <minShare>1</minShare>
        <weight>1</weight>
        <schedulingMode>FAIR</schedulingMode>
    </pool>
    <pool name="duplicate_pool1">
        <minShare>2</minShare>
        <weight>2</weight>
        <schedulingMode>FAIR</schedulingMode>
    </pool>
</allocations>

Debug Screenshot :
The following screenshots show Pool.schedulableQueue(ConcurrentLinkedQueue[Schedulable]) has 4 pools as

default, default, duplicate_pool1 and duplicate_pool1

but Pool.schedulableNameToSchedulable(ConcurrentHashMap[String, Schedulable]) has

default and duplicate_pool1

due to pool name as key so one of default and duplicate_pool1 look redundant and live in Pool.schedulableQueue.

duplicate_pools

duplicate_pools2

## How was this patch tested?

Added new Unit Test case.

@erenavsarogullari erenavsarogullari changed the title [SPARK-17759] [CORE] SchedulableBuilder should avoid to create duplicate fair scheduler-pools [SPARK-17759] [CORE] FairSchedulableBuilder should avoid to create duplicate fair scheduler-pools Oct 2, 2016
@markhamstra
Copy link
Contributor

I've got some issues with this PR.

  • You've unnecessarily changed the prior behavior where the last added pool wins. We should preserve that behavior in case anyone was relying upon it to handle their broken configuration.
  • Adding duplicate TaskSetManagers (or other Schedulable types in the future) would also be an error, so I'd rather do the checking and de-duplication in addSchedulable instead of in buildFairSchedulerPool. Like this in Pool:
  override def addSchedulable(schedulable: Schedulable) {
    require(schedulable != null)
    val name = schedulable.name
    if (null == schedulableNameToSchedulable.put(name, schedulable)) {
      schedulableQueue.add(schedulable)
    } else {
      logWarning(s"Duplicate Schedulable added: $name")
      // remove previously enqueued schedulable with same name
      schedulableQueue.remove(getSchedulableByName(name))
      if (!schedulableQueue.contains(schedulable)) {
        schedulableQueue.add(schedulable)
      }
    }
    schedulable.parent = this
  }

The only routes to this code are from the one-time initialization of the backend on the creation of a SparkContext and from schedulableBuilder.addTaskSetManager in TaskSchedulerImpl#submitTasks, which is synchronized -- so the above should be fine in terms of thread safety.

This is also the only route by which a Schedulable can be added to schedulableQueue, so with this code there should never be more than one prior instance of a queued Schedulable with the same name that needs to be removed before replacing it with the duplicate (which matches the schedulableNameToSchedulable entry).

This should pass all the existing tests; but if you do something more severe than just logging the warning (such as adding a System.exit to that branch), you'll find that we are actually doing some abusive adding of duplicate schedulables within some tests.

@kayousterhout

@kayousterhout
Copy link
Contributor

+1 to Mark's suggestion about putting the behavior in Pool::addSchedulable, so that it works across all Schedulables.

I'm less convinced about maintaining the existing behavior where the last added pool wins. I think people usually expect first-wins behavior for these things (e.g., that's the case for Java classpath issues), and the code is also simpler for that approach. Given that this should be affecting pretty few people (and there's now a warning logged), I think this would be fine to change.

@markhamstra
Copy link
Contributor

@kayousterhout @rxin Ok, but if we're going to change the behavior, then we need to be sure that change at least makes it into the release notes.

@kayousterhout
Copy link
Contributor

@markhamstra if we do change it, we should prob merge only into master and not 2.0.1.

@markhamstra
Copy link
Contributor

Yes, that is what I was thinking regardless.

@markhamstra
Copy link
Contributor

...and it would be 2.0.2 at this point. :)

@erenavsarogullari
Copy link
Member Author

erenavsarogullari commented Oct 3, 2016

Firstly, thanks @markhamstra and @kayousterhout for quick feedbacks.

I was aware of putting the check in Pool.addSchedulable to cover for all Schedulables (Pool and TaskSetManager for now). My concern was that TaskSetManager uses the following naming-pattern:

var name = "TaskSet_" + taskSet.stageId.toString

and if there is a case that a stage having multiple TaskSet/TaskSetManager, this could be a problem. However, your comments already clarified this so i agree moving to Pool.addSchedulable and address with next commit ;)

@markhamstra
Copy link
Contributor

@erenavsarogullari Your concern was entirely legitimate, and is also why I called in @kayousterhout to double-check my claim that other duplicate Schedulables would also be a problem.

@kayousterhout
Copy link
Contributor

kayousterhout commented Oct 3, 2016

@erenavsarogullari @markhamstra I just looked at this further and I actually think this could be an issue:

(1) The first attempt for a stage fails with a fetch failure. The associated TaskSetManager is marked as a zombie but some tasks are still running, so removeSchedulable isn't called yet (it gets called in TaskSchedulerImpl only after all running tasks in the stage have finished).

(2) The map stage re-runs and a new attempt for the stage begins. This attempt has the same stage ID, so will have the same schedulable name.

I remember having a long discussion about (1) a while ago (and when we should call removeSchedulable) and decided it was most "fair" to call it only after all running tasks complete, because running tasks should be counted towards the pools share even when they're for zombie task sets.

I think in this case, the last-schedulable-wins policy that currently exists seems better, although still wrong. I'd argue we should first fix this bug (by giving each TaskSetManager a unique name), in a separate PR, and then do the fix to this PR that you suggested, Mark. The other fix seems like it should be relatively simple. @markhamstra thoughts? Does that seem reasonable and do you buy the bug description above?

@erenavsarogullari
Copy link
Member Author

Thanks @kayousterhout and @markhamstra.
In the light of the decision, i can happily work on the new ticket(Uniqueness of TaskSetManager name) as well ;)

@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout and @markhamstra,

In the light of our previous discussion, i committed second patch as e126cd8. This PR' s changeset looks ok but it still needs unique TaskSetManager name support to cover all duplicate Schedulable cases so i raised SPARK-17894 for uniqueness of TaskSetManager name. It proposes a new TaskSetManager name pattern to be reviewed and i will publish separate PR ;)

@erenavsarogullari erenavsarogullari changed the title [SPARK-17759] [CORE] FairSchedulableBuilder should avoid to create duplicate fair scheduler-pools [SPARK-17759] [CORE] Avoid adding of duplicate schedulables Oct 13, 2016
@erenavsarogullari erenavsarogullari changed the title [SPARK-17759] [CORE] Avoid adding of duplicate schedulables [SPARK-17759] [CORE] Avoid adding duplicate schedulables Oct 13, 2016
ghost pushed a commit to dbtsai/spark that referenced this pull request Oct 24, 2016
`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** :
There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari <[email protected]>

Closes apache#15463 from erenavsarogullari/SPARK-17894.
@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout and @markhamstra,

Related PR #15463 is merged and this PR is ready for review. Also previous comments have been addressed with e126cd8ec51b11fed5ffaab376c6d4a451086cac.

Thanks in advance.

@erenavsarogullari
Copy link
Member Author

Kindly reminder :)
cc @kayousterhout @markhamstra

robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** :
There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari <[email protected]>

Closes apache#15463 from erenavsarogullari/SPARK-17894.
@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout and @markhamstra,
This PR is ready for review if it is possible.
All feedbacks are welcome and thanks in advance ;)

@kayousterhout
Copy link
Contributor

@erenavsarogullari I looked at this again (sorry for the long delay), and it looks like you maintained the old behavior of last-added-wins. I thought from the discussion above, @markhamstra was ok with first-added-wins, which I think is more intuitive and consistent with other locations in the codebase. Did I miss something from above?

uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
`TaskSetManager` should have unique name to avoid adding duplicate ones to parent `Pool` via `SchedulableBuilder`. This problem has been surfaced with following discussion: [[PR: Avoid adding duplicate schedulables]](apache#15326)

**Proposal** :
There is 1x1 relationship between `stageAttemptId` and `TaskSetManager` so `taskSet.Id` covering both `stageId` and `stageAttemptId` looks to be used for uniqueness of `TaskSetManager` name instead of just `stageId`.

**Current TaskSetManager Name** :
`var name = "TaskSet_" + taskSet.stageId.toString`
**Sample**: TaskSet_0

**Proposed TaskSetManager Name** :
`val name = "TaskSet_" + taskSet.Id ` `// taskSet.Id = (stageId + "." + stageAttemptId)`
**Sample** : TaskSet_0.0

Added new Unit Test.

Author: erenavsarogullari <[email protected]>

Closes apache#15463 from erenavsarogullari/SPARK-17894.
@kayousterhout
Copy link
Contributor

@erenavsarogullari what's the status of this PR?

@kayousterhout
Copy link
Contributor

@erenavsarogullari is this ready to be updated now that #16813 has been merged?

@erenavsarogullari
Copy link
Member Author

@kayousterhout, thanks for querying this PR.
It is not ready yet and needs a small change but it will be ready in a couple of days. Will let you know ;)

@erenavsarogullari
Copy link
Member Author

Jenkins test this please

@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout and @markhamstra,

  • This PR is ready for review in the light of first-added-wins (Schedulables: Pool / TaskSetManager) pattern. All feedback are welcome in advance.

  • Two TaskSchedulerImpl UT cases have been failed due to duplicate TaskSet submission so they have also been fixed via latest commit: dd302ffb44e01280b42d130bee3ede9c81fd4839

  • Also jenkins needs to be triggered.

Thanks.

@kayousterhout
Copy link
Contributor

Jenkins this is OK to test

val taskSetManager0 = createTaskSetManager(0, 2, taskScheduler)
val taskSetManager1 = createTaskSetManager(1, 2, taskScheduler)
schedulableBuilder.addTaskSetManager(taskSetManager0, null)
schedulableBuilder.addTaskSetManager(taskSetManager0, null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this happen? (the same TSM getting added twice)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related fix aims to avoid adding duplicate schedulables (Pool and TSM). However, a new TSM is created for submitted TaskSet and duplicate TSM submission does not look an expected behaviour(although logic is robust enough for this case) so these test cases can be removed for clearer perspective.

val taskSetManager0 = createTaskSetManager(0, 2, taskScheduler)
val taskSetManager1 = createTaskSetManager(1, 2, taskScheduler)
schedulableBuilder.addTaskSetManager(taskSetManager0, null)
schedulableBuilder.addTaskSetManager(taskSetManager0, null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the above -- when could this happen?

@HyukjinKwon
Copy link
Member

Hi @erenavsarogullari, is it still active? if so could you address the comments above? Otherwise, I would like to propose to close this.

@erenavsarogullari
Copy link
Member Author

Hi @HyukjinKwon, thanks to follow this PR. I am busy for a while. Implementation is done and just last comments need to be addressed.

I will address them as well asap. Thanks ;)

@erenavsarogullari
Copy link
Member Author

Hi @HyukjinKwon, thanks for the following this PR again. This looks required but i am too busy for a while. Fix is already ready and will address the last comments asap. Sorry for delay again ;)

@HyukjinKwon
Copy link
Member

I will take this out in the list. Thanks for your input.

@HyukjinKwon
Copy link
Member

Gentle ping @kayousterhout.

@kayousterhout
Copy link
Contributor

@HyukjinKwon what's the ping here for? It looks like I left some comments that @erenavsarogullari will address when he has time.

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Jun 8, 2017

I am sorry I misunderstood and thought it is almost (or already) ready. Will read the comments carefully next time.

@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout,
Many thanks again for your review and sorry about delay.
Last comments have just been addressed and patch is ready for re-review.

@erenavsarogullari
Copy link
Member Author

Hi @kayousterhout,
Many thanks again for your review.
Patch is ready to re-review.

@erenavsarogullari
Copy link
Member Author

Jenkins test this please

@erenavsarogullari
Copy link
Member Author

Hi All,
Sorry for delay on this PR. Last comments have been addressed and merge-conflict has been fixed.
Patch is ready to re-review. Please let me know if you need further details. Thanks in advance 👍
cc @kayousterhout

@erenavsarogullari
Copy link
Member Author

retest this please

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 17, 2020
@github-actions github-actions bot closed this Jan 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants