Clarify docstring for Pyspark's foreachPartition #2895
Due to the underlying use of `mapPartitions`, which requires a function that maps partitions to partitions, `foreachPartition` requires the function passed to be a generator function or to return an iterable (although these results are discarded). This is currently not stated in the documentation except through the unexplained example. Documenting it would help users understand that example and not waste time on this error:
```
TypeError: 'NoneType' object is not iterable
```
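To illustrate the failure mode without a Spark cluster, here is a minimal pure-Python sketch. `run_over_partitions` is a hypothetical stand-in, not PySpark code: it only assumes, as described above, that the return value of the user function is iterated over and the results are discarded. `record_plain` and `record_generator` are likewise made-up names for illustration.

```python
def run_over_partitions(partitions, f):
    """Hypothetical stand-in for the behaviour described above:
    call f on each partition iterator, then iterate over whatever
    f returns and throw the results away."""
    for part in partitions:
        for _ in f(iter(part)):
            pass


seen = []


def record_plain(iterator):
    # Ordinary function: does its side effect and implicitly returns None.
    for x in iterator:
        seen.append(x)


def record_generator(iterator):
    # Generator function: same side effect, but calling it returns an iterable.
    for x in iterator:
        seen.append(x)
    yield None


partitions = [[1, 2], [3, 4]]

# The plain function fails as soon as its None result is iterated.
try:
    run_over_partitions(partitions, record_plain)
    error = None
except TypeError as exc:
    error = str(exc)

# The generator version runs over every partition without error.
del seen[:]
run_over_partitions(partitions, record_generator)
```

In this sketch the plain function still performs its side effect on the first partition before the `TypeError` is raised, which is part of what makes the error confusing. Ending the function with `yield None` (or returning an empty iterator such as `iter([])`) avoids it.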
Can one of the admins verify this patch?
Actually, we might want to just fix this and allow
Oh. Now that I look at master, @JoshRosen, I see that it's already been fixed by @davis here. The fix just isn't in 1.1. I guess we should close this?
Maybe we can backport SPARK-2871 to 1.1, since it looks like it also fixes a bunch of preservesPartitioning bugs.
@JoshRosen It will be better if we could easily backport them.
I'd love to see this happen.
Ah - tdhopper, i think you meant @davies :)
(Imagine what
If you don't mind, could you close this PR since it has been subsumed by another commit? If we want to track the progress / backport status of a different fix, then we should do that in JIRA.
@JoshRosen: Yup. Thanks.