[SPARK-17556] [CORE] [SQL] Executor side broadcast for broadcast joins #15240

Closed
scwf wants to merge 13 commits

scwf (Contributor) commented Sep 26, 2016

What changes were proposed in this pull request?

JIRA: https://issues.apache.org/jira/browse/SPARK-17556
Design doc: https://issues.apache.org/jira/secure/attachment/12830668/executor%20broadcast.pdf

Adds two RDD APIs for executor-side broadcast and applies them to Spark SQL's broadcast join.

[1]. def broadcast[U: ClassTag](f: Iterator[T] => U): Broadcast[U]

The user only needs to pass a function that describes how to turn all the elements of the RDD into the value to broadcast.

[2]. def broadcast[U: ClassTag](transFunc: TransFunc[T, U]): Broadcast[U]

This variant is used only inside Spark SQL (Spark internal). TransFunc is an interface that describes how to translate all the elements of the RDD into a single value.
In Spark SQL, BroadcastMode extends TransFunc.
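
As a rough illustration of the two signatures above, here is a minimal sketch in plain Scala with no Spark dependency. `MiniRDD` and `MiniBroadcast` are hypothetical stand-ins for `RDD` and `Broadcast`, and the method name `transform` on `TransFunc` is an assumption, since the description only names the trait. It shows the shape of the two overloads, not how the patch implements them.

```scala
import scala.reflect.ClassTag

object ExecutorBroadcastSketch {
  // Hypothetical stand-in for Spark's Broadcast[U]; it just holds the value.
  final case class MiniBroadcast[U](value: U)

  // Assumed shape of TransFunc; the method name `transform` is a guess.
  trait TransFunc[T, U] extends Serializable {
    def transform(rows: Iterator[T]): U
  }

  // Hypothetical stand-in for an RDD: a sequence of partitions.
  final class MiniRDD[T](partitions: Seq[Seq[T]]) {
    private def allRows: Iterator[T] = partitions.iterator.flatMap(_.iterator)

    // [1] The caller passes a plain function over all elements.
    def broadcast[U: ClassTag](f: Iterator[T] => U): MiniBroadcast[U] =
      MiniBroadcast(f(allRows))

    // [2] The internal variant takes a TransFunc instead.
    def broadcast[U: ClassTag](transFunc: TransFunc[T, U]): MiniBroadcast[U] =
      MiniBroadcast(transFunc.transform(allRows))
  }

  def main(args: Array[String]): Unit = {
    val rdd = new MiniRDD(Seq(Seq(1, 2), Seq(3, 4)))

    // Variant [1]: an explicitly typed function value selects the first overload.
    val sumAll: Iterator[Int] => Int = _.sum
    println(rdd.broadcast(sumAll).value)

    // Variant [2]: a TransFunc instance selects the second overload.
    val distinct = new TransFunc[Int, Set[Int]] {
      def transform(rows: Iterator[Int]): Set[Int] = rows.toSet
    }
    println(rdd.broadcast(distinct).value)
  }
}
```

The function is given an explicit type here so that overload resolution is unambiguous, because a trait with a single abstract method can also be a target for lambda conversion.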

When constructing a broadcast, the executors first write the blocks to the block manager, and then the driver creates the Broadcast without writing any blocks itself.
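
The two-phase construction described here can be sketched roughly as follows, again in plain Scala without Spark. Everything in it is hypothetical: `blockStore` stands in for the block manager, `writeBlock` for the executor-side step, and `Handle` for the driver-side Broadcast. How the actual patch lays out blocks or assembles the final value is not stated in this description, so treat this only as a model of the ordering.

```scala
import scala.collection.concurrent.TrieMap

object TwoPhaseBroadcastSketch {
  // Hypothetical stand-in for the block manager, keyed by (broadcast id, piece index).
  val blockStore = TrieMap.empty[(Long, Int), Seq[Int]]

  // Phase 1 (executor side): each task stores its own partition as a block.
  def writeBlock(broadcastId: Long, pieceIndex: Int, rows: Seq[Int]): Unit =
    blockStore.put((broadcastId, pieceIndex), rows)

  // Phase 2 (driver side): the handle records only the id and the piece count;
  // creating it writes no blocks.
  final case class Handle(broadcastId: Long, numPieces: Int) {
    // Readers assemble the value from the stored blocks on demand.
    def value: Seq[Int] =
      (0 until numPieces).flatMap(i => blockStore((broadcastId, i)))
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(1, 2), Seq(3), Seq(4, 5))
    // Simulated executors write their partitions first.
    partitions.zipWithIndex.foreach { case (rows, i) => writeBlock(7L, i, rows) }
    // The simulated driver then creates the handle without touching the data.
    val handle = Handle(7L, partitions.size)
    println(handle.value)
  }
}
```

The point of this ordering is that the data never has to be collected on the driver before it is broadcast.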

How was this patch tested?

Added unit tests and ran manual tests.

SparkQA commented Sep 26, 2016

Test build #65907 has finished for PR 15240 at commit 89336e7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65909 has finished for PR 15240 at commit f0a26c3.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65913 has finished for PR 15240 at commit 9a334df.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65916 has finished for PR 15240 at commit 90217c0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65914 has finished for PR 15240 at commit 5e2ee76.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65918 has finished for PR 15240 at commit 5db6a3c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 26, 2016

Test build #65925 has finished for PR 15240 at commit 4714cae.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 28, 2016

Test build #66030 has finished for PR 15240 at commit 0b80837.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

scwf changed the title from "[SPARK-17556] Executor side broadcast for broadcast joins" to "[SPARK-17556] [CORE] [SQL] Executor side broadcast for broadcast joins" on Sep 29, 2016

SparkQA commented Sep 29, 2016

Test build #66088 has finished for PR 15240 at commit 3e77821.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 29, 2016

Test build #66095 has finished for PR 15240 at commit 147b053.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

scwf (Contributor, Author) commented Oct 8, 2016

/cc @rxin can you help review this?

SparkQA commented Dec 30, 2016

Test build #70745 has finished for PR 15240 at commit cdab885.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 3, 2017

Test build #70804 has started for PR 15240 at commit cdab885.

SparkQA commented Jan 3, 2017

Test build #70807 has finished for PR 15240 at commit 26b03b2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 3, 2017

Test build #70818 has finished for PR 15240 at commit 26b03b2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

scwf (Contributor, Author) commented Jan 3, 2017

retest this please

SparkQA commented Jan 4, 2017

Test build #70867 has finished for PR 15240 at commit 4a547a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

jiangxb1987 (Contributor) commented

Are you still working on this? @scwf

viirya (Member) commented Jun 13, 2017

@jiangxb1987 there's another solution to this JIRA at #15178. We take different approaches. You can also take a look.
