-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-23754][BRANCH-2.3][PYTHON] Re-raising StopIteration in client code #21463
Conversation
Make sure that `StopIteration`s raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped in `safe_iter` (in `shuffle.py`), which re-raises `StopIteration`s as `RuntimeError`s Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a `Py4JJavaError` contains my exception, so I simply looked for the exception message in the java exception's `toString`. Can you propose a better way? This is my original work, licensed in the same way as spark Author: e-dorigatti <[email protected]> Closes apache#21383 from e-dorigatti/fix_spark_23754. (cherry picked from commit 0ebb0c0)
Test build #91306 has finished for PR 21463 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merged to branch-2.3. @e-dorigatti, in case of other branches, it should be manually closed by you. |
…code ## What changes are proposed Make sure that `StopIteration`s raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped in `safe_iter` (in `shuffle.py`), which re-raises `StopIteration`s as `RuntimeError`s ## How were the changes tested Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a `Py4JJavaError` contains my exception, so I simply looked for the exception message in the java exception's `toString`. Can you propose a better way? This is my original work, licensed in the same way as spark --- Author: e-dorigatti <emilio.dorigattigmail.com> Closes #21383 from e-dorigatti/fix_spark_23754. (cherry picked from commit 0ebb0c0) Author: e-dorigatti <[email protected]> Closes #21463 from e-dorigatti/branch-2.3.
@e-dorigatti, I think this should really be cleaned up in the master branch. Mind if I ask to give a try with fixing it in worker side and revert 0ebb0c0 in the master branch? |
@HyukjinKwon go ahead, no problem |
oh, I meant I was wondering if you could have some time to give another try by fixing it in worker side. You could revert the current approach in your PR and try the fix in worker side. |
Let me leave cc for you @icexelloss or @viirya. Seems we should really clean up in the master (and probably backport it again to branch-2.3) and try it in worker side - I need to take a closer look too. If you guys all happened to be busy, will take a look by myself. I merged this to unblock the release. |
@HyukjinKwon I'll spend some time to look at it. But not guarantee how much time. You can work on it if I don't come out a work for it. :) |
@HyukjinKwon haha no probs :) I have to open a new PR tho as yesterday I messed up my repos. |
What changes are proposed
Make sure that
StopIteration
s raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped insafe_iter
(inshuffle.py
), which re-raisesStopIteration
s asRuntimeError
sHow were the changes tested
Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a
Py4JJavaError
contains my exception, so I simply looked for the exception message in the java exception'stoString
. Can you propose a better way?This is my original work, licensed in the same way as spark
Author: e-dorigatti [email protected]
Closes #21383 from e-dorigatti/fix_spark_23754.
(cherry picked from commit 0ebb0c0)