Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add note about pushing the lazy XCom proxy to XCom #27250

Merged
merged 1 commit into from
Nov 3, 2022

Conversation

uranusjr
Copy link
Member

@uranusjr uranusjr commented Oct 25, 2022

See discussion in #27209 for context. This simply adds a note about the situation. An alternative approach is explored in #27251 that tries to "hide" the issue magically instead.

def forward_values(values):
return values # This is a lazy proxy and can't be pushed!

You need to explicitly call ``list(values)`` instead, and accept the performance implications.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This invites the question of what implications and how can you tell if it's dramatic or not.
If this is something we discourage under any circumstances I would use a stronger language

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It entirely depends on what the user pushes into the XCom and how the data is stored so there’s no way to really describe fully. All the normal XCom caveats apply, times how many upstream tasks were mapped into.

@potiuk
Copy link
Member

potiuk commented Oct 26, 2022

This looks good - but if the #27251 is solid, I have strong preference for that one. Users don't read all the details in the documentation and they will expect it to "just work" . They don't even realise (and they should not) there is a lazy-proxy used. The whole taskflow API is there precisely to hide the implementation details (Xcom.push, Xcom.pull and the like) from the user, so this is entirely expected they have no idea about this detail. And since not many users read documentation, they will open an issue instead most likely (which we want to at least try to avoid).

Another option (if we cannot make #27251 solid) will be to make a very explict error message when they try to push LazyXCom and point them to this documentation directly - then they will have a chance to read it and even if they copy&paste the error message in the issue, we will just close the issue and tell them to follow the advice they got.

@uranusjr
Copy link
Member Author

One possibility would be to add an additiona check in XCom.set to emit a better error message to guide users to the relevant documentation and/or make the appropriate call.

@potiuk
Copy link
Member

potiuk commented Oct 31, 2022

I commented in #27251 -> I think automated coertion and smart warning in case of performance problem might be way better approach - both good for the users and maintainers. #27251 (comment)

@uranusjr
Copy link
Member Author

uranusjr commented Nov 3, 2022

Either way we take, documentation on this is needed, so I’m merging this first and continuing work in #27251.

@uranusjr uranusjr merged commit eb47c42 into apache:main Nov 3, 2022
@uranusjr uranusjr deleted the lazy-xcom-access-note-on-push branch November 3, 2022 08:15
@ephraimbuddy ephraimbuddy added the type:doc-only Changelog: Doc Only label Nov 16, 2022
@ashb ashb added this to the Airflow 2.5.0 milestone Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants