Set spark.executor.uri from environment variable (needed by Mesos) #311
Conversation
Can one of the admins verify this patch?
Jenkins, test this please. Good catch!
Jenkins, test this please
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13785/
@ivanwick what is the symptom when this is not set correctly? If there is an exception or stacktrace it would be helpful to know what it looks like, so that other people who run into this problem can figure out that this is the fix for it.
This patch fixes a bug with the PySpark shell running on Mesos. Without the spark.executor.uri property, PySpark reports lost tasks because the slave looks for the Spark executor in the wrong path and can never start it. It logs several "Lost TID" and "Executor lost" messages while the scheduler re-queues the lost tasks; they fail again for the same reason, finally ending with:
The stderr of each slave in the Mesos framework reports:
because this path doesn't exist on the slave nodes (it happens to be the path where Spark is installed on the head node). When spark.executor.uri is set, as it is with the Scala repl, Mesos is able to download the Spark dist package and run it from the framework temp directory on the slave.
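For context, the fix amounts to forwarding the executor URI from the environment into Spark's configuration before the shell's SparkContext starts, just as the Scala repl does. Below is a minimal sketch of that idea, assuming the SPARK_EXECUTOR_URI environment variable used in the Mesos documentation; the exact variable name and where the real patch places this call may differ.

```python
import os

from pyspark import SparkContext

# Sketch only: expose the executor URI to the Mesos scheduler backend
# before the SparkContext is constructed. SPARK_EXECUTOR_URI is assumed
# here; the actual patch may read a different variable or set the
# property elsewhere.
if os.environ.get("SPARK_EXECUTOR_URI"):
    SparkContext.setSystemProperty("spark.executor.uri",
                                   os.environ["SPARK_EXECUTOR_URI"])

sc = SparkContext(appName="PySparkShell")
```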
SPARK-991: Report information gleaned from a Python stacktrace in the UI

Scala:
- Added setCallSite/clearCallSite to SparkContext and JavaSparkContext. These functions mutate a LocalProperty called "externalCallSite."
- Add a wrapper, getCallSite, that checks for an externalCallSite and, if none is found, calls the usual Utils.formatSparkCallSite.
- Change everything that calls Utils.formatSparkCallSite to call getCallSite instead, except getCallSite itself.
- Add setCallSite/clearCallSite wrappers to JavaSparkContext.

Python:
- Add a gruesome hack to rdd.py that inspects the traceback and guesses what you want to see in the UI.
- Add a RAII wrapper around said gruesome hack that calls setCallSite/clearCallSite as appropriate.
- Wire said RAII wrapper up around three calls into the Scala code.

I'm not sure that I hit all the spots with the RAII wrapper. I'm also not sure that my gruesome hack does exactly what we want. One could also approach this change by refactoring runJob/submitJob/runApproximateJob to take a call site, then threading that parameter through everything that needs to know it.

One might object to the pointless-looking wrappers in JavaSparkContext. Unfortunately, I can't directly access the SparkContext from Python (or, if I can, I don't know how), so I need to wrap everything that matters in JavaSparkContext.

Conflicts:
core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
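To make the Python side of that description concrete, here is an illustrative sketch (not the actual rdd.py code) of a context manager that inspects the Python stack, builds a call-site string, and brackets a job with the setCallSite/clearCallSite wrappers mentioned in the commit message. The helper name and the frame-selection heuristic are assumptions made for illustration.

```python
import traceback
from contextlib import contextmanager

@contextmanager
def python_call_site(sc):
    """Illustrative only: derive a call site from the Python stack, publish
    it to the JVM for the duration of a job, then clear it again."""
    # Pick the innermost frame that is not inside pyspark itself; that is
    # usually the user code which triggered the job. This heuristic is a
    # guess, not the exact logic of the real "gruesome hack".
    stack = traceback.extract_stack()
    frame = next((f for f in reversed(stack) if "pyspark" not in f.filename),
                 stack[-1])
    call_site = "%s at %s:%d" % (frame.name, frame.filename, frame.lineno)
    sc._jsc.setCallSite(call_site)   # JavaSparkContext wrapper from the commit
    try:
        yield
    finally:
        sc._jsc.clearCallSite()

# Usage around a call into the Scala code:
# with python_call_site(sc):
#     count = rdd.count()
```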
Thanks Ivan, I've merged this in.
The Mesos backend uses this property when setting up a slave process. It is similarly set in the Scala repl (org.apache.spark.repl.SparkILoop), but I couldn't find any analogue for pyspark.

Author: Ivan Wick <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Matei Zaharia <[email protected]>

Closes #311 from ivanwick/master and squashes the following commits:

da0c3e4 [Ivan Wick] Set spark.executor.uri from environment variable (needed by Mesos)

(cherry picked from commit 5cd11d5)
Signed-off-by: Matei Zaharia <[email protected]>
## What changes were proposed in this pull request?

Redshift has no unsigned types (http://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html), so they can map to Long and Integer without loss of precision. This integrates community PR apache#311 (github.com/databricks/spark-redshift/pull/311/), where the user claims that this change fixed the scenarios in which he would get a Decimal instead of a Long. I was not able to reproduce the user's problem, but the change nevertheless removes code handling unsigned types that is actually dead, so it is not harmful.

## How was this patch tested?

Added tests from the community PR #311.

Author: Juliusz Sompolski <[email protected]>

Closes apache#180 from juliuszsompolski/SC-5620.