[Core Feature] Flytekit local caching #761

Closed
wild-endeavor opened this issue Feb 22, 2021 · 3 comments
Labels
enhancement New feature or request flytekit FlyteKit Python related issue

Comments

@wild-endeavor
Contributor

Motivation: Why do you think this is important?
Flytekit local caching

Goal: What should the final outcome look like, ideally?
Unclear, but probably a local file or some other state-holding mechanism that flytekit can share between Python processes, so that local runs respect output caching with the same semantics as Data Catalog. It should also offer an easy way for users to reset this state, and maybe to print its contents.
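
To make the idea concrete, here is a minimal sketch of what such a state-holding file could look like. Everything in it is an assumption for illustration: the `LocalTaskCache` class, its method names, the JSON file format, and the choice of cache key (task name, cache version, and input values) are hypothetical and are not flytekit's API or Data Catalog's actual keying scheme.

```python
import hashlib
import json
import os
import tempfile
from typing import Any, Dict, Tuple


class LocalTaskCache:
    """Hypothetical file-backed output cache; not flytekit's actual implementation."""

    def __init__(self, path: str) -> None:
        self._path = os.path.abspath(path)

    def _load(self) -> Dict[str, Any]:
        # A missing file simply means an empty cache.
        if not os.path.exists(self._path):
            return {}
        with open(self._path, "r", encoding="utf-8") as f:
            return json.load(f)

    def _save(self, entries: Dict[str, Any]) -> None:
        # Write to a temporary file, then rename, so readers never see a partial file.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(self._path))
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(entries, f)
        os.replace(tmp_path, self._path)

    @staticmethod
    def _key(task_name: str, cache_version: str, inputs: Dict[str, Any]) -> str:
        # Assumed key: task name + cache version + input values (must be JSON-serializable).
        payload = json.dumps([task_name, cache_version, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, task_name: str, cache_version: str, inputs: Dict[str, Any]) -> Tuple[bool, Any]:
        # Returns (hit, outputs) so that a cached None is distinguishable from a miss.
        entries = self._load()
        key = self._key(task_name, cache_version, inputs)
        if key in entries:
            return True, entries[key]["outputs"]
        return False, None

    def put(self, task_name: str, cache_version: str, inputs: Dict[str, Any], outputs: Any) -> None:
        entries = self._load()
        entries[self._key(task_name, cache_version, inputs)] = {
            "task": task_name,
            "version": cache_version,
            "inputs": inputs,
            "outputs": outputs,
        }
        self._save(entries)

    def reset(self) -> None:
        # "Easy way to reset this state": delete the backing file.
        if os.path.exists(self._path):
            os.remove(self._path)

    def dump(self) -> str:
        # "Maybe print its contents": human-readable view of every entry.
        return json.dumps(self._load(), indent=2, sort_keys=True)
```

Because the state lives in a file, a second Python process that constructs `LocalTaskCache` with the same path sees earlier results. The sketch does no file locking, so two processes writing at the same time could lose an entry; a real implementation would need to handle that.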

Describe alternatives you've considered
None.

@wild-endeavor wild-endeavor added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Feb 22, 2021
@wild-endeavor wild-endeavor added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Feb 22, 2021
@katrogan
Contributor

katrogan commented Mar 8, 2021

Additional context:

There are currently three execution modes: TASK_EXECUTION, LOCAL_WORKFLOW_EXECUTION, and LOCAL_TASK_EXECUTION. TASK_EXECUTION is used for tasks in workflows run on hosted Flyte.

A potential first step toward implementing this issue might be to take on LOCAL_WORKFLOW_EXECUTION, that is, when a workflow is run end to end on a user's machine. Starting here simplifies the implementation, since this mode already handles converting inputs and outputs from Python native types to Flyte literals. The local caching story could make use of the existing caching API, specifically by mirroring the CreateDataset and GetDataset interfaces.

Once LOCAL_WORKFLOW_EXECUTION is complete, it might be easier to then move on to LOCAL_TASK_EXECUTION for running tasks locally.
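
A rough sketch of that staged approach, under stated assumptions: the `Mode` enum below is defined locally just to name the three modes from this comment, and `InMemoryCatalog`, `get_dataset`, `create_dataset`, and `call_with_cache` are hypothetical names that only loosely mirror the CreateDataset and GetDataset interfaces (the real Data Catalog API carries more fields). The cache is consulted only in LOCAL_WORKFLOW_EXECUTION, which is the proposed first step.

```python
import enum
from typing import Any, Callable, Dict, Optional, Tuple


class Mode(enum.Enum):
    # The three modes named in this thread, redefined locally for the sketch.
    TASK_EXECUTION = 1
    LOCAL_WORKFLOW_EXECUTION = 2
    LOCAL_TASK_EXECUTION = 3


DatasetKey = Tuple[str, str]  # assumed identity: (task name, cache version)


class InMemoryCatalog:
    """Hypothetical stand-in; method names loosely mirror CreateDataset/GetDataset."""

    def __init__(self) -> None:
        self._datasets: Dict[DatasetKey, Dict[str, Any]] = {}

    def get_dataset(self, key: DatasetKey) -> Optional[Dict[str, Any]]:
        return self._datasets.get(key)

    def create_dataset(self, key: DatasetKey) -> Dict[str, Any]:
        return self._datasets.setdefault(key, {})


def call_with_cache(
    mode: Mode,
    catalog: InMemoryCatalog,
    task_name: str,
    cache_version: str,
    fn: Callable[..., Any],
    **inputs: Any,
) -> Any:
    # Step one covers only local workflow runs; other modes bypass the cache.
    if mode is not Mode.LOCAL_WORKFLOW_EXECUTION:
        return fn(**inputs)

    key = (task_name, cache_version)
    dataset = catalog.get_dataset(key)
    if dataset is None:
        dataset = catalog.create_dataset(key)

    # Assumed tag: a stable representation of the input values.
    tag = repr(sorted(inputs.items()))
    if tag not in dataset:
        dataset[tag] = fn(**inputs)
    return dataset[tag]
```

Extending the `if mode is not ...` check to also accept `Mode.LOCAL_TASK_EXECUTION` would be the follow-up step described above.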

@katrogan katrogan removed their assignment Apr 27, 2021
@CodeMySky

I support this. I think it lowers the barrier for new adopters by letting them test caching locally.

@kumare3
Contributor

kumare3 commented Aug 17, 2021

@CodeMySky this has been checked in. You can use the pre-release or try it out with the next Flyte release.
