Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] [Flytekit] Mismatching Type while serializing Union Types in _make_dataclass_serializable #5910

Open
2 tasks done
mao3267 opened this issue Oct 24, 2024 · 2 comments · May be fixed by flyteorg/flytekit#2859
Open
2 tasks done
Assignees
Labels
bug Something isn't working flytekit FlyteKit Python related issue

Comments

@mao3267
Copy link

mao3267 commented Oct 24, 2024

Describe the bug

Mentioned in the discussion under flyteorg/flytekit#2823. When handling Union types in _make_dataset_serializable, we intend to select the first argument in get_args(python_type) because Optional is equivalent to Union[ExpectedType, None]. However, this approach may fail if the type order differs, for example, in Union[None, ExpectedType].

An error reproduction example:

from dataclasses import dataclass
from flytekit import task
from flytekit.types.file import FlyteFile
from typing import Union


@dataclass
class InnerDC:
    ff: Union[None, FlyteFile]

@dataclass
class DC:
    inner_dc: InnerDC


@task
def t_dc() -> DC:
    return DC(inner_dc=InnerDC(ff="s3://path"))

Expected behavior

This code snippet should work, meaning that the underlying Union types in dataclasses should be serialized correctly.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@mao3267 mao3267 added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Oct 24, 2024
@mao3267
Copy link
Author

mao3267 commented Oct 24, 2024

Here is another example I have come up with:

from dataclasses import dataclass
from flytekit import task
from typing import Union

@dataclass
class InnerDC:
    ff: Union[None, int, str]

@dataclass
class DC:
    inner_dc: InnerDC

@task
def t_dc() -> DC:
    return DC(inner_dc=InnerDC(ff="string"))

Since Union can contain multiple type candidates, I’m wondering if we should consider supporting this as well?
cc @Future-Outlier @wild-endeavor

@Future-Outlier
Copy link
Member

Here is another example I have come up with:

from dataclasses import dataclass
from flytekit import task
from typing import Union

@dataclass
class InnerDC:
    ff: Union[None, int, str]

@dataclass
class DC:
    inner_dc: InnerDC

@task
def t_dc() -> DC:
    return DC(inner_dc=InnerDC(ff="string"))

Since Union can contain multiple type candidates, I’m wondering if we should consider supporting this as well? cc @Future-Outlier @wild-endeavor

YES DO IT

@eapolinario eapolinario removed the untriaged This issues has not yet been looked at by the Maintainers label Oct 24, 2024
@eapolinario eapolinario added the flytekit FlyteKit Python related issue label Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working flytekit FlyteKit Python related issue
Projects
Status: Backlog
Development

Successfully merging a pull request may close this issue.

3 participants