Changes CoGroupByKey typehint from List to Iterable #22984
Conversation
R: tvalentyn I want to talk about this before I submit. The CoGroupByKey implementation does build a dictionary that maps each key to a list. However, the documentation is very explicit that users should expect an iterable. I want to double-check one last time that we are not making a breaking change unless we have to.
Codecov Report
@@ Coverage Diff @@
## master #22984 +/- ##
==========================================
- Coverage 73.58% 73.50% -0.09%
==========================================
Files 716 718 +2
Lines 95311 95438 +127
==========================================
+ Hits 70138 70148 +10
- Misses 23877 23989 +112
- Partials 1296 1301 +5
Assigning reviewers. If you would like to opt out of this review, comment R: @TheNeuralBit for label python. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
R: @tvalentyn |
@ryanthompson591 I can't comment on your doc so I'll mention here - it encourages users to materialize the iterable result to a list, which can lead to OOMs. I think it's ok to mention casting to a list as a last resort, but the doc should also encourage users to iterate over the iterable without materializing it if at all possible. |
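To illustrate the point above about iterating instead of materializing, here is a minimal sketch in plain Python. It does not call Beam: `grouped_values` is a hypothetical generator standing in for the values that a CoGroupByKey result holds for one tag, and `summarize` and `summarize_materialized` are illustrative names, not part of any SDK.

```python
from typing import Iterable, Iterator, Tuple


def grouped_values() -> Iterator[int]:
    """Hypothetical stand-in for one tag's values in a grouped result."""
    for value in range(1, 6):
        yield value


def summarize(values: Iterable[int]) -> Tuple[int, int]:
    """Return (count, total) in one pass, holding one element at a time."""
    count = 0
    total = 0
    for value in values:
        count += 1
        total += value
    return count, total


def summarize_materialized(values: Iterable[int]) -> Tuple[int, int]:
    """Last resort: copies every element into memory before processing."""
    as_list = list(values)
    return len(as_list), sum(as_list)


streamed = summarize(grouped_values())
materialized = summarize_materialized(grouped_values())
```

Both functions return the same answer; the difference is that `summarize` never holds more than one element, which matters when a key has a very large number of values.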
Run Python PreCommit |
@TheNeuralBit Thanks I'll update the doc to reflect that, discouraging changing the iterable to a list. |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
Can we update CHANGES.md about this linking to public documentation stating how people are to fix their usage?
Fixes issue #21556 |
CHANGES.md
Outdated
@@ -66,7 +66,7 @@

## Breaking Changes

* X behavior was changed ([#X](https://github.com/apache/beam/issues/X)).
* Python SDK CoGroupByKey now outputs an iterable instead of a list. [#21556](https://github.com/apache/beam/issues/21556)
It would be nice to link to your docs on how to fix upgrade issues here as well (https://docs.google.com/document/d/1RIzm8-g-0CyVsPb6yasjwokJQFoKHG4NjRUcKHKINu0/edit)
done.
I think for this to take effect it has to be in the PR description (which you can edit to include it) - https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword |
@lukecwik I updated the CHANGES.md file. I think that since the 2.42 release is cut, it is now a good time to add this change. |
Run Python 3.9 PostCommit |
There are still a bunch of open comments on the doc that need resolution.
Co-authored-by: Lukasz Cwik
Thanks for the comments. I think I addressed the major concerns and added some new sample code. Feel free to request edit access to that doc.
Run Python 3.9 PostCommit |
Adds information about what exactly is broken as well as how to fix.
Run Python PreCommit |
Simplified changes.md
Run Portable_Python PreCommit |
This LGTM - thanks for working through all the changes/doc stuff.
@tvalentyn looks like you still have a couple open comments in https://docs.google.com/document/d/1RIzm8-g-0CyVsPb6yasjwokJQFoKHG4NjRUcKHKINu0/edit#heading=h.1thvihn3k3oi - none of them look blocking, but giving you a chance to disagree before merging this :)
LGTM |
Changes CoGroupByKey to output an Iterable instead of a List.
This will be a breaking change and I put together a document to describe it.
The documentation of CoGroupByKey is very specific that it returns an iterable and not a list.
fixes #21556
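As a rough sketch of why this can break existing code, the plain-Python example below shows list-only operations failing on a generic iterable, along with iterable-friendly replacements. It makes assumptions: `values_for_tag` is a hypothetical generator used as a worst-case stand-in, and the PR does not say which operations the actual returned object supports. The helper names are illustrative only.

```python
from typing import Iterable, Iterator, Optional


def values_for_tag() -> Iterator[str]:
    """Hypothetical stand-in: a plain iterator, not Beam's actual result type."""
    yield "a@example.com"
    yield "b@example.com"


def list_style_access_fails(values: Iterable[str]) -> bool:
    """Return True if len(), a list-only operation here, is rejected."""
    try:
        len(values)  # type: ignore[arg-type]
    except TypeError:
        return True
    return False


def first_or_none(values: Iterable[str]) -> Optional[str]:
    """Replacement for values[0] that works on any iterable."""
    return next(iter(values), None)


def count(values: Iterable[str]) -> int:
    """Replacement for len(values) that consumes the iterable once."""
    return sum(1 for _ in values)
```

Code written against the `Iterable` contract, as in `first_or_none` and `count`, should keep working whether the runtime object is a list or a lazier iterable.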