Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7795] [Core] Speed up task scheduling in standalone mode by reusing serializer #6323

Closed
wants to merge 4 commits into from

Conversation

coolfrood
Copy link
Contributor

My experiments with scheduling very short tasks in standalone cluster mode indicated that a significant amount of time was being spent in scheduling the tasks (>500ms for 256 tasks). I found that most of the time was being spent in creating a new instance of serializer for each task. Changing this to just one serializer brought down the scheduling time to 8ms.

instead of creating a new one for each task.
@srowen
Copy link
Member

srowen commented May 21, 2015

@coolfrood have a look at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark -- you missed a step or two. It's not 100% clear you can reuse the serializer but it is a good idea to float.

@coolfrood
Copy link
Contributor Author

Argh. Sorry. I'll create a JIRA for this and update the PR.

@coolfrood coolfrood changed the title Speed up task scheduling in standalone mode by reusing serializer [SPARK-7795] [Core] Speed up task scheduling in standalone mode by reusing serializer May 21, 2015
@JoshRosen
Copy link
Contributor

Serializer instances aren't thread-safe, but it looks like this single-threaded use should be fine. I implemented some similar optimizations in #5606 and am in the process of fixing a rare corner-case related to serializer re-use with uncommon Kryo configurations (#6293). Based on discussion in those patches, this change looks good to me.

We might actually be able to go a bit further and lift the serializer instance into a field of CoarseGrainedSchedulerBackend itself. That would be a bit riskier in terms of code review burden, since we'd have to double-check that CoarseGrainedSchedulerBackend is only ever used from a single thread (I think that this is actually the case, but would be good to double-check).

@JoshRosen
Copy link
Contributor

Jenkins, this is ok to test.

@coolfrood
Copy link
Contributor Author

@JoshRosen: it looks like launchTasks() is only called from makeOffers(), which in turn is only called from the Actor's receive, so it should be safe to move the serializer into a member field. Also, both launchTasks() and makeOffers() can be made private. What do you think?

@JoshRosen
Copy link
Contributor

Yep, looks like both launchTasks() and makeOffers() can be private. In general, I prefer to minimize the visibility of fields / methods as much as possible, so let's make these private.

My PR made a similar change to CoarseGrainedExecutorBackend. Over there, I added a comment to explain that the shared serializer code would need to be revisited if we ever switched to a parallel / non-thread-safe RPC endpoint for the actor: https://github.com/apache/spark/pull/5606/files#diff-79391110e9f26657e415aa169a004998R51. It would be good to make a similar comment here so that we don't forget to update this if DriverEndpoint is ever parallelized.

@SparkQA
Copy link

SparkQA commented May 21, 2015

Test build #33263 has finished for PR 6323 at commit 0b8ca93.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 21, 2015

Test build #33259 has finished for PR 6323 at commit fe530cd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 21, 2015

Test build #33266 has finished for PR 6323 at commit bd4a5dd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor

This looks good to me. I'm going to loop in @zsxwing or @rxin for another pair of eyes / final sanity-check.

@zsxwing
Copy link
Member

zsxwing commented May 21, 2015

LGTM

@JoshRosen
Copy link
Contributor

Actually, one minor comment: ser currently has wider visibility than is strictly necessary: it's only used inside of the DriverEndpoint inner class, so it should be a field of that class rather than a field of CoarseGrainedSchedulerBackend.

@SparkQA
Copy link

SparkQA commented May 22, 2015

Test build #33348 has finished for PR 6323 at commit 12d8c9e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor

LGTM, so I'm going to merge this to master. Although I think this is probably safe for 1.4, I don't want to risk conflicting with any attempts to cut another RC tonight, so I'm not going to pick it there for now. Thanks!

@asfgit asfgit closed this in a163574 May 23, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
…using serializer

My experiments with scheduling very short tasks in standalone cluster mode indicated that a significant amount of time was being spent in scheduling the tasks (>500ms for 256 tasks).  I found that most of the time was being spent in creating a new instance of serializer for each task.  Changing this to just one serializer brought down the scheduling time to 8ms.

Author: Akshat Aranya <[email protected]>

Closes apache#6323 from coolfrood/master and squashes the following commits:

12d8c9e [Akshat Aranya] Reduce visibility of serializer
bd4a5dd [Akshat Aranya] Style fix
0b8ca93 [Akshat Aranya] Incorporate review comments
fe530cd [Akshat Aranya] Speed up task scheduling in standalone mode by reusing serializer instead of creating a new one for each task.
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
…using serializer

My experiments with scheduling very short tasks in standalone cluster mode indicated that a significant amount of time was being spent in scheduling the tasks (>500ms for 256 tasks).  I found that most of the time was being spent in creating a new instance of serializer for each task.  Changing this to just one serializer brought down the scheduling time to 8ms.

Author: Akshat Aranya <[email protected]>

Closes apache#6323 from coolfrood/master and squashes the following commits:

12d8c9e [Akshat Aranya] Reduce visibility of serializer
bd4a5dd [Akshat Aranya] Style fix
0b8ca93 [Akshat Aranya] Incorporate review comments
fe530cd [Akshat Aranya] Speed up task scheduling in standalone mode by reusing serializer instead of creating a new one for each task.
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…using serializer

My experiments with scheduling very short tasks in standalone cluster mode indicated that a significant amount of time was being spent in scheduling the tasks (>500ms for 256 tasks).  I found that most of the time was being spent in creating a new instance of serializer for each task.  Changing this to just one serializer brought down the scheduling time to 8ms.

Author: Akshat Aranya <[email protected]>

Closes apache#6323 from coolfrood/master and squashes the following commits:

12d8c9e [Akshat Aranya] Reduce visibility of serializer
bd4a5dd [Akshat Aranya] Style fix
0b8ca93 [Akshat Aranya] Incorporate review comments
fe530cd [Akshat Aranya] Speed up task scheduling in standalone mode by reusing serializer instead of creating a new one for each task.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants