Skip to content
This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Add a strategy to fall back to Vanilla Spark shuffle manager #1047

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

lviiii
Copy link
Contributor

@lviiii lviiii commented Jul 25, 2022

What changes were proposed in this pull request?

Add the strategy to fallback to Vanilla Spark shuffle manager.
o Enable fallback shuffle configuration and reuse the ColumnarShuffleExchangeExec
o Initiate the splitter iterator in Shuffle Dependency, and transform to the RDD: Produce2[Int, ColumnarBatch]
o Serialize the record batch to Shuffle Writer of Vanilla Spark.

How does this patch work?

When submit an application, we use native SQL engine with default ColumnarShuffleManager configuration,
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager

However, we want to specify the custom or other shuffle manager for some situations, to enable Vanilla Spark shuffle manager,
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager --conf spark.oap.sql.columnar.enableFallbackShuffle=true

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Hong and others added 3 commits June 16, 2022 17:06
…ect#978)

* [NSE-927] Add macro __AVX512BW__ check for different CPU architecture (oap-project#975)

* Add __AVX512BW__ check

* Fix cFormat

* [NSE-126] set default codegen opt to O1 for branch-1.4
@github-actions
Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/native-sql-engine/issues

Then could you also rename commit message and pull request title in the following format?

[NSE-${ISSUES_ID}] ${detailed message}

See also:

@PHILO-HE
Copy link
Collaborator

Assuming the first two commits are not relevant to your patch, please do NOT include them. If your work depends on these commits, it would be better to open a dedicate PR to port them to main branch.

@PHILO-HE PHILO-HE changed the title Add the strategy to fallback to Vanilla Spark shuffle manager. Add a strategy to fall back to Vanilla Spark shuffle manager Jul 26, 2022
# Conflicts:
#	native-sql-engine/core/src/main/scala/com/intel/oap/expression/ConverterUtils.scala
@@ -83,6 +83,11 @@ class GazellePluginConfig(conf: SQLConf) extends Logging {
val enableColumnarShuffledHashJoin: Boolean =
conf.getConfString("spark.oap.sql.columnar.shuffledhashjoin", "true").toBoolean && enableCpu

// enable or disable fallback shuffle manager
val enableFallbackShuffle: Boolean = conf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please also add a short note on how to use this feature? and also make this default to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added that in the description dialog

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants