[BUG] remote_source lost on serialization of @dataclass_json with FlyteFile #1938

Closed
2 tasks done
nicklofaso opened this issue Dec 12, 2021 · 1 comment
Labels
bug Something isn't working flytekit FlyteKit Python related issue

Comments

@nicklofaso
Contributor

Describe the bug

A @dataclass_json class containing a FlyteDirectory or FlyteFile field loses the remote_source of those fields when it is passed to a @task that is called from a @dynamic. See the example code below.

Steps to reproduce

  • Use flytekit version 0.25.0.
  • Create a @dataclass_json class with a FlyteFile or FlyteDirectory field.
  • Initialize the dataclass's FlyteFile with "gs://mybucket/blah" or "s3://mybucket/blah".
  • Create a @dynamic workflow that calls a @task and passes in the dataclass.

Expected behavior

The remote_source of FlyteDirectory and FlyteFile fields nested in the dataclass should be preserved when the @task is called from the @dynamic.
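The expectation can be written as a small check. This is a minimal sketch using only the standard library: FakeRemoteFile, FakeNested, and missing_remote_sources are hypothetical names made up for illustration and are not part of flytekit. The helper walks a dataclass and lists every field whose remote_source is unset.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, List, Optional


@dataclass
class FakeRemoteFile:
    """Hypothetical stand-in for FlyteFile/FlyteDirectory (not flytekit code)."""
    path: str
    remote_source: Optional[str] = None


@dataclass
class FakeNested:
    """Hypothetical stand-in for the NestedData class in the repro."""
    fdir: FakeRemoteFile
    ffile: FakeRemoteFile


def missing_remote_sources(obj: Any, prefix: str = "") -> List[str]:
    """Return dotted names of fields whose `remote_source` is falsy."""
    missing: List[str] = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        name = f"{prefix}{f.name}"
        if hasattr(value, "remote_source"):
            # File-like field: report it if the remote location is gone.
            if not value.remote_source:
                missing.append(name)
        elif is_dataclass(value) and not isinstance(value, type):
            # Plain nested dataclass instance: recurse into its fields.
            missing.extend(missing_remote_sources(value, prefix=name + "."))
    return missing


ok = FakeNested(
    fdir=FakeRemoteFile("/tmp/test", "gs://mybucket/test"),
    ffile=FakeRemoteFile("/tmp/test.txt", "gs://mybucket/test.txt"),
)
bad = FakeNested(
    fdir=FakeRemoteFile("/tmp/test", None),
    ffile=FakeRemoteFile("/tmp/test.txt", "gs://mybucket/test.txt"),
)
print(missing_remote_sources(ok))
print(missing_remote_sources(bad))
```

Since the helper only relies on dataclasses.fields and a remote_source attribute, it would presumably also work on the real dataclass inside a task, but that has not been tested against flytekit.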

Additional context to reproduce

from dataclasses import dataclass
from dataclasses_json import dataclass_json

from flytekit import workflow, dynamic, task
from flytekit.types.directory.types import FlyteDirectory
from flytekit.types.file import FlyteFile


@dataclass_json
@dataclass
class NestedData:
    fdir: FlyteDirectory
    ffile: FlyteFile


@task
def tsk(my_dir: FlyteDirectory, my_nested: NestedData):
    print("tsk my_dir.path", my_dir.path)
    print("tsk my_dir.remote_source", my_dir.remote_source)
    print("tsk my_nested.fdir.path", my_nested.fdir.path)
    print("tsk my_nested.fdir.remote_source", my_nested.fdir.remote_source)
    print("tsk my_nested.ffile.path", my_nested.ffile.path)
    print("tsk my_nested.ffile.remote_source", my_nested.ffile.remote_source)
    assert my_dir.remote_source
    assert my_nested.fdir.remote_source
    assert my_nested.ffile.remote_source


@dynamic
def dyn(my_dir: FlyteDirectory, my_nested: NestedData):
    print("dyn my_dir.path", my_dir.path)
    print("dyn my_dir.remote_source", my_dir.remote_source)
    print("dyn my_nested.fdir.path", my_nested.fdir.path)
    print("dyn my_nested.fdir.remote_source", my_nested.fdir.remote_source)
    print("dyn my_nested.ffile.path", my_nested.ffile.path)
    print("dyn my_nested.ffile.remote_source", my_nested.ffile.remote_source)
    assert my_dir.remote_source
    assert my_nested.fdir.remote_source
    assert my_nested.ffile.remote_source

    # Nested FlyteDir and FlyteFile lose remote_source when passed to
    # @task from a @dynamic
    tsk(my_dir=my_dir, my_nested=my_nested)


@workflow
def wf(my_dir: FlyteDirectory, my_nested: NestedData):
    # This fails
    dyn(my_dir=my_dir, my_nested=my_nested)

    # This works
    # tsk(my_dir=my_dir, my_nested=my_nested)


def main():
    my_dir = FlyteDirectory("gs://mybucket/configs/")
    my_nested = NestedData(
        fdir=FlyteDirectory("gs://mybucket/test"), ffile=FlyteFile("gs://mybucket/test.txt")
    )
    wf(my_dir=my_dir, my_nested=my_nested)


if __name__ == "__main__":
    main()
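For readers unfamiliar with this class of bug, here is a minimal stdlib-only sketch of how an attribute can vanish in a serialization round trip. It does not use flytekit and makes no claim about where flytekit actually drops the value: FakeRemoteFile and its to_json/from_json methods are hypothetical, and the assumption that only `path` is written out is made purely for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRemoteFile:
    """Hypothetical stand-in for a file type with a remote location."""
    path: str
    remote_source: Optional[str] = None

    def to_json(self) -> str:
        # Assumption for illustration: the serializer only writes `path`.
        return json.dumps({"path": self.path})

    @classmethod
    def from_json(cls, payload: str) -> "FakeRemoteFile":
        # `remote_source` is absent from the payload, so it falls back to None.
        return cls(**json.loads(payload))


original = FakeRemoteFile(
    path="/tmp/local/test.txt", remote_source="gs://mybucket/test.txt"
)
restored = FakeRemoteFile.from_json(original.to_json())

print("before:", original.remote_source)
print("after: ", restored.remote_source)
```

Any hop that re-serializes the value, such as handing it from one node to another, would show the same symptom if the serialized form omits the field.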

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@nicklofaso nicklofaso added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Dec 12, 2021
@kumare3
Contributor

kumare3 commented Dec 13, 2021

Cc @pingsutw

@wild-endeavor wild-endeavor added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Dec 15, 2021
@wild-endeavor wild-endeavor added this to the 0.19.0 - Eagle milestone Dec 15, 2021