pyspark needs Py2 to work; fail gracefully with a helpful message if the system default is Py3 #392
Conversation
Can one of the admins verify this patch?
Is there any way to do this test in Python instead of in bash? It looks like complicated and potentially brittle bash code.
@mateiz Handling it from Python would only error out after Python had already started and then exited, so I didn't solve it that way... though that style is also available as a pull request, and either of the pull requests would do the job. As for the brittle bash code, feel free to have any bash veteran you know take a look...
Thanks @abhishekkr. I am going to merge #399. Do you mind closing this one?
pyspark require Python2, failing if system default is Py3 from shell.py

Python alternative for #392; managed from shell.py

Author: AbhishekKr <[email protected]>

Closes #399 from abhishekkr/pyspark_shell and squashes the following commits:

134bdc9 [AbhishekKr] pyspark require Python2, failing if system default is Py3 from shell.py

The same commit, cherry-picked from commit bb76eae.
Signed-off-by: Reynold Xin <[email protected]>
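For context, a minimal sketch of the kind of interpreter guard #399 adds in shell.py; the exact message and exit path in the merged patch may differ:

```python
import sys

# Hypothetical sketch, not the merged patch verbatim: abort early with a
# clear message when the running interpreter is Python 3, instead of
# failing later with an opaque syntax error.
if sys.version_info[0] != 2:
    print("Error: PySpark requires Python 2; "
          "found Python %d.%d" % sys.version_info[:2])
    sys.exit(1)
```

Note that the guard itself has to be valid syntax under both interpreters, which is why `print` is written in call form with a single argument.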
Stop the SparkListenerBus daemon thread when the DAGScheduler is stopped. Otherwise this leads to hundreds of SparkListenerBus daemon threads in our unit tests (and is also problematic if a user application launches multiple SparkContexts).
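The shape of the fix, sketched generically in Python (Spark's actual implementation is Scala; all names here are illustrative): a consumer thread that drains an event queue and exits on a sentinel, so stopping the owner also stops the thread.

```python
import queue
import threading

_SHUTDOWN = object()  # sentinel that tells the bus thread to exit

class ListenerBus:
    def __init__(self):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, event):
        self._events.put(event)

    def _run(self):
        while True:
            event = self._events.get()
            if event is _SHUTDOWN:
                return  # leave the loop so the thread can terminate
            # ... deliver event to listeners ...

    def stop(self):
        # Called when the owning scheduler stops; without this the
        # daemon thread would linger for the life of the process.
        self._events.put(_SHUTDOWN)
        self._thread.join()

bus = ListenerBus()
bus.post("jobStart")
bus.stop()  # the thread exits instead of leaking
```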
Better error handling in Spark Streaming and more API cleanup. Earlier, errors in jobs generated by Spark Streaming (or in the generation of jobs) could not be caught from the main driver thread (i.e. the thread that called StreamingContext.start()), as they would be thrown in other threads. With this change, after `ssc.start` one can call `ssc.awaitTermination()`, which will block until the ssc is closed or there is an exception. This makes it easier to debug. This change also adds ssc.stop(<stop-spark-context>), which lets you stop the StreamingContext without stopping the SparkContext. Also fixes the bug that came up with PRs apache#393 and apache#381. The MetadataCleaner default value has been changed from 3500 to -1 for a normal SparkContext, and to 3600 when creating a StreamingContext. Also updated StreamingListenerBus with changes similar to SparkListenerBus in apache#392, and changed a lot of protected[streaming] to private[streaming].
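A generic sketch of the awaitTermination contract described above (illustrative Python with hypothetical names, not Spark's real API surface): the driver thread blocks until the context stops or a background thread reports an error, which is then re-raised on the driver.

```python
import threading

class StreamingContextSketch:
    def __init__(self):
        self._stopped = threading.Event()
        self._error = None

    def report_error(self, exc):
        # Called from job-generation threads; wakes await_termination.
        self._error = exc
        self._stopped.set()

    def stop(self, stop_spark_context=True):
        # stop_spark_context=False mirrors ssc.stop(<stop-spark-context>):
        # shut down streaming without tearing down the shared SparkContext.
        self._stopped.set()

    def await_termination(self):
        # Blocks the driver until stop() or report_error() fires, then
        # surfaces any background failure on the calling thread.
        self._stopped.wait()
        if self._error is not None:
            raise self._error
```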
Replaced explicit synchronized access to hashmap with a concurrent map. (#392)

* Replaced explicit synchronized access to hashmap with a concurrent map
* Removed usages of scala.collection.concurrent.Map
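An illustrative Python sketch of the refactor's shape (the real change is JVM code; a ConcurrentHashMap synchronizes internally rather than via one lock, and all names here are hypothetical): move locking out of every call site and into a map type that owns its own synchronization.

```python
import threading

class ConcurrentDict:
    """A dict that owns its lock, so callers need no explicit synchronization."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put_if_absent(self, key, value):
        # Atomic check-then-insert: the operation that is racy when done
        # as separate get/put calls on a plain dict guarded at call sites.
        with self._lock:
            return self._data.setdefault(key, value)

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

registry = ConcurrentDict()
registry.put_if_absent("stage-1", "running")  # no external lock at the call site
```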
Add domain name mapping for terraform fusioncloud job
It failed on my machine, which is how I noticed.