Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance multiple_outputs inference of dict typing #19608

Merged
merged 6 commits into from
Jan 10, 2022

Conversation

josh-fell
Copy link
Contributor

Closes: #19538

There are a few cases in which the existing multiple_outputs inference in the TaskFlow API doesn't automatically unfurl a returned dictionary into separate XComs:

  • The return type annotation of the TaskFlow function is the basic dict type
  • In Python 3.6, the inference fails to set multiple_outputs to True due to use of an incorrect attr

This PR fixes the inference on a simple dict return type annotation and adds a condition for Python 3.6 use.

Since Python 3.6 will reach EOL on 2021-12-23, happy to drop the sys.version_info check if this fix wouldn't be included in a minor release before support for Python 3.6 is dropped.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Nov 16, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@josh-fell josh-fell marked this pull request as ready for review November 18, 2021 03:30
@josh-fell josh-fell force-pushed the taskflow-api-multiple-output-inference branch from 656e595 to fdaf89a Compare November 18, 2021 03:31
@josh-fell josh-fell force-pushed the taskflow-api-multiple-output-inference branch from fdaf89a to b92f49a Compare December 1, 2021 20:41
@josh-fell josh-fell force-pushed the taskflow-api-multiple-output-inference branch 2 times, most recently from 341740c to 3cc024d Compare December 10, 2021 16:31
@eladkal eladkal added this to the Airflow 2.2.4 milestone Dec 19, 2021
@josh-fell josh-fell force-pushed the taskflow-api-multiple-output-inference branch from 3cc024d to a7a6df1 Compare December 27, 2021 19:35
@josh-fell
Copy link
Contributor Author

@uranusjr I noticed in another PR that there were other references to tuples and lists in multiple_outputs docstrings. I figured I'd update them all here.

@josh-fell
Copy link
Contributor Author

Looks like a new static check was added in #20501 which checks syntax for 3.7+ and is attempting to remove the check for Python 3.6 in the typing inference logic here. Is dropping Python 3.6 support slated for 2.2.4? Or should I go ahead and remove the Python 3.6 check and assume it's no longer supported even though #20467 is not part of a release yet?

@potiuk
Copy link
Member

potiuk commented Dec 28, 2021

Looks like a new static check was added in #20501 which checks syntax for 3.7+ and is attempting to remove the check for Python 3.6 in the typing inference logic here. Is dropping Python 3.6 support slated for 2.2.4? Or should I go ahead and remove the Python 3.6 check and assume it's no longer supported even though #20467 is not part of a release yet?

Interesting one that we have not foreseen. I think we should disable the pyupgrade for now and re-enable it when we release 2.3.0.

@josh-fell josh-fell mentioned this pull request Dec 29, 2021
@josh-fell
Copy link
Contributor Author

@potiuk Would you like me to open a PR to disable the pyupgrade check or were there other conversations on handling this edge case until the 2.3 release?

Comment on lines +122 to +125
@task_decorator
def identity_dict(x: int, y: int) -> eval(test_return_annotation):
return {"x": x, "y": y}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible/easy to also handle the stringified annotation, e.g.

@task_decorator
def identity_dict(x: int, y: int) -> "dict":
    return {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't see why not. I'll dig in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like typing.get_type_hints() does the eval automatically and can be useful here (to replace inspect.signature).

Copy link
Contributor Author

@josh-fell josh-fell Jan 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell we eventually get into the same __origin__ vs __extra__ attribute situation for Python 3.6 unfortunately. We can definitely add some tests for the stringified dict typing though.

Also looks like the recent refactor of airflow/decorators/base.py uses inspect.signature a little more extensively too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python 3.6

Haven’t we dropped it already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it won't be dropped until 2.3 but this PR is slated for 2.2.4.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be viable to do de-stringification if 3.6 is not considered? We can add this in a separate PR that targets only 2.3+ if so.

@potiuk
Copy link
Member

potiuk commented Jan 6, 2022

@potiuk Would you like me to open a PR to disable the pyupgrade check or were there other conversations on handling this edge case until the 2.3 release?

Disabling - no but changing it back to 3.6 syntax (you can do it in this PR) with a note that it should be bumed to 3.7 when we release 2.3.0

@josh-fell josh-fell force-pushed the taskflow-api-multiple-output-inference branch from 0b036ea to d7378a9 Compare January 7, 2022 04:27
@potiuk potiuk merged commit 4198550 into apache:main Jan 10, 2022
@josh-fell josh-fell deleted the taskflow-api-multiple-output-inference branch January 10, 2022 19:37
jedcunningham pushed a commit that referenced this pull request Jan 21, 2022
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Feb 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TaskFlow API multiple_outputs inference not handling all flavors of dict typing
5 participants