Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyflyte run imperative workflows #1131

Merged
merged 8 commits into from
Aug 29, 2022
Merged

pyflyte run imperative workflows #1131

merged 8 commits into from
Aug 29, 2022

Conversation

pingsutw
Copy link
Member

@pingsutw pingsutw commented Aug 12, 2022

TL;DR

support running imperative workflow by using pyflyte run.
flyteorg/flyte#2706

we try to get the module name from extract_task_module, but it doesn't work for imperative workflow.
You will get the below error when using pyflyte run to run an imperative workflow.

(flytekit-3.9) ➜  flytekit git:(master) ✗ pyflyte --config ~/.flyte/config-remote.yaml run --remote example_imperative.py wf --in1 2 --in2 3
/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
Traceback (most recent call last):
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/bin/pyflyte", line 8, in <module>
    sys.exit(main())
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/kevin/opt/anaconda3/envs/flytekit-3.9/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/kevin/git/flytekit/flytekit/clis/sdk_in_container/run.py", line 539, in _run
    remote_entity = remote.register_script(
  File "/Users/kevin/git/flytekit/flytekit/remote/remote.py", line 596, in register_script
    upload_location, md5_bytes = fast_register_single_script(
  File "/Users/kevin/git/flytekit/flytekit/tools/script_mode.py", line 102, in fast_register_single_script
    _, mod_name, _, script_full_path = extract_task_module(wf_entity)
  File "/Users/kevin/git/flytekit/flytekit/core/tracker.py", line 237, in extract_task_module
    name = f.__name__.split(".")[-1]
AttributeError: 'ImperativeWorkflow' object has no attribute '__name__'

we can't also use inspect.getmodule to get module name because inspect.getmodule(<imperative_wf>).__name__ is always equal to flytekit.core.workflow

basically, it's unnecessary to get a module name from an entity by calling extract_task_module because we already have a module name that users specify in the pyflyte run command.

To address this issue, we can use the module name passed in the pyflyte run instead

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

How did you fix the bug, make the feature etc. Link to any design docs etc

Tracking Issue

flyteorg/flyte#2706

Follow-up issue

NA

Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
@codecov
Copy link

codecov bot commented Aug 15, 2022

Codecov Report

Merging #1131 (7067721) into master (73eaad1) will increase coverage by 0.12%.
The diff coverage is 80.85%.

@@            Coverage Diff             @@
##           master    #1131      +/-   ##
==========================================
+ Coverage   68.26%   68.38%   +0.12%     
==========================================
  Files         287      288       +1     
  Lines       25829    25963     +134     
  Branches     2885     2899      +14     
==========================================
+ Hits        17633    17756     +123     
- Misses       7719     7728       +9     
- Partials      477      479       +2     
Impacted Files Coverage Δ
flytekit/remote/remote.py 41.11% <ø> (ø)
flytekit/tools/script_mode.py 49.25% <0.00%> (+0.72%) ⬆️
flytekit/core/tracker.py 48.67% <60.00%> (+0.04%) ⬆️
tests/flytekit/unit/cli/pyflyte/imperative_wf.py 86.36% <86.36%> (ø)
flytekit/clis/sdk_in_container/constants.py 100.00% <100.00%> (ø)
flytekit/clis/sdk_in_container/run.py 83.60% <100.00%> (ø)
tests/flytekit/unit/cli/pyflyte/test_run.py 98.87% <100.00%> (+0.06%) ⬆️
flytekit/extras/tasks/shell.py 77.71% <0.00%> (-0.45%) ⬇️
...ests/flytekit/unit/core/test_structured_dataset.py 98.88% <0.00%> (-0.31%) ⬇️
flytekit/core/data_persistence.py 33.80% <0.00%> (-0.01%) ⬇️
... and 13 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Kevin Su <[email protected]>
@wild-endeavor
Copy link
Contributor

will take a look tomorrow

) -> (_data_proxy_pb2.CreateUploadLocationResponse, bytes):
_, mod_name, _, script_full_path = extract_task_module(wf_entity)
Copy link
Member Author

@pingsutw pingsutw Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't get the module name of the imperative workflow by using inspect.getmodule because inspect.getmodule(<imperative_wf>).__name__ is always equal to flytekit.core.workflow

To address this issue, we can use the module name passed in the pyflyte run instead.
pyflyte run --remote imperative_wf.py (model_name) wf

Signed-off-by: Kevin Su <[email protected]>
wild-endeavor
wild-endeavor previously approved these changes Aug 27, 2022
Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah sounds good, but can we make it like pyflyte register and pyflyte package in the future? can we make another issue for that? That will get us around this problem and maybe we can revert this change in the future.

you tested this in a subfolder right? like if you have a/b/c/wf.py running pyflyte run in the c/ folder, still gets you a.b.c.wf as the name.

Signed-off-by: Kevin Su <[email protected]>
@pingsutw
Copy link
Member Author

yes, It's working in the subfolder as well. I've tested it.

can we make it like pyflyte register and pyflyte package in the future?

Not sure what you mean. The reason why it only doesn't work in the pyflyte run is that mod_name in the compress_single_script is generated from extract_task_module, but extract_task_module doesn't work for the imperative workflow. Therefore, we have to use the module name passed from the pyflyte run command.

On the other hand, the pyflyte register uses fast_package, which doesn't require module name. That's why the pyflyte register works now.

@wild-endeavor wild-endeavor merged commit 9792893 into master Aug 29, 2022
eapolinario pushed a commit that referenced this pull request Sep 15, 2022
eapolinario added a commit that referenced this pull request Sep 16, 2022
* Add deck to papermill plugin task (#1111)

Signed-off-by: Calvin Leather <[email protected]>

* Run compilation even in local execution for dynamic tasks to early detect errors (#1121)

Signed-off-by: Yee Hing Tong <[email protected]>

* Set to pyflyte run blob object remote when dealing with remote files (#1128)

Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Eduardo Apolinario <[email protected]>

* Override voidPromise resource (#1127)

* override void promise resource

Signed-off-by: Kevin Su <[email protected]>

* override void promise resource

Signed-off-by: Kevin Su <[email protected]>

* Fix how ShellTask retrieves the Pod class name (#1132)

* Fix how ShellTask retrieves the Pod class name

Signed-off-by: Matheus Moreno <[email protected]>

* Set Pod class name as a constant

Signed-off-by: Matheus Moreno <[email protected]>

* Revert last commit

Signed-off-by: Matheus Moreno <[email protected]>

* Execute automatic linting

Signed-off-by: Matheus Moreno <[email protected]>

Signed-off-by: Matheus Moreno <[email protected]>

* Add restriction for pandas to be >=1.2 for fsspec plugin (#1136)

Signed-off-by: Yee Hing Tong <[email protected]>

* Use joblib hashing to generate cache key to ensure repeatability (#1126)

* cherry pick 97b454b

Signed-off-by: Yee Hing Tong <[email protected]>

* requirements

Signed-off-by: Yee Hing Tong <[email protected]>

* Fix usage of save in ProtoJoblibHasher

Signed-off-by: Eduardo Apolinario <[email protected]>

* Regenerate requirements using python 3.7

Signed-off-by: Eduardo Apolinario <[email protected]>

* Add test_stable_cache_key

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Allow None protocol to mean all data persistence supported storage options in Structured Dataset (#1134)

Signed-off-by: Yee Hing Tong <[email protected]>

* handle ImportError and OSError in extras.pytorch (#1141)

* handle ImportError and OSError in extras.pytorch

Signed-off-by: Niels Bantilan <[email protected]>

* isolate exception to torch import

Signed-off-by: Niels Bantilan <[email protected]>

Signed-off-by: Niels Bantilan <[email protected]>

* Register dataframe renderers in structured dataset (#1140)

* Register dataframe renderers in structured dataset

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix test

Signed-off-by: Kevin Su <[email protected]>

* more tests

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* pyflyte run imperative workflows (#1131)

Signed-off-by: Kevin Su <[email protected]>

* Using sidecar handler to run Papermill task (#1143)

* remove nb prefix

Signed-off-by: Kevin Su <[email protected]>

* add tests

Signed-off-by: Kevin Su <[email protected]>

* Update requirements.in

Signed-off-by: Kevin Su <[email protected]>

* remove _

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Properly raise error in NumpyArrayTransformer (#1146)

Signed-off-by: Rahul Mehta <[email protected]>

Signed-off-by: Rahul Mehta <[email protected]>

* Add assert_type in dataclass transformer (#1149)

* Add assert_type in dataclassTransformer

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* more tests

Signed-off-by: Kevin Su <[email protected]>

* fix lint

Signed-off-by: Kevin Su <[email protected]>

* Add one more test

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Pickle in Union Type (#1147)

* Pickel in Union type

Signed-off-by: Kevin Su <[email protected]>

* Pickel in Union type

Signed-off-by: Kevin Su <[email protected]>

* wip

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* update tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* Address comment

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Bump max docker version to 7.0.0 (#1138)

Signed-off-by: Rahul Mehta <[email protected]>

Signed-off-by: Rahul Mehta <[email protected]>

* Set flytekit<2.0 in plugins (#1152)

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Add literal type to union literal (#1144)

* Add literal type to union literal

Signed-off-by: Kevin Su <[email protected]>

* fix test

Signed-off-by: Kevin Su <[email protected]>

* Add tests

Signed-off-by: Kevin Su <[email protected]>

* more tests

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Fix the type of optional[int] in nested dataclass (#1148)

* Fix the type of optional[int] in nested dataclass

Signed-off-by: Kevin Su <[email protected]>

* update tests

Signed-off-by: Kevin Su <[email protected]>

* update comments

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Added symlink dereferencing in fast packaging and tests (#1151)

* Added symlink dereferencing and tests

Signed-off-by: Vanshika Chowdhary <[email protected]>

* Added flag to register as well

Signed-off-by: Vanshika Chowdhary <[email protected]>

* More flag propagation

Signed-off-by: Vanshika Chowdhary <[email protected]>

Signed-off-by: Vanshika Chowdhary <[email protected]>
Co-authored-by: Vanshika Chowdhary <[email protected]>

* Strip newline from client secret (#1163)

* Strip newline from client secret

* Add logging and rework the secret file comparison to work on windows

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Fix the type of optional[int] in dataclass (#1135)

Signed-off-by: Kevin Su <[email protected]>

* fix: plugins/flytekit-papermill/dev-requirements.txt to reduce vulnerabilities (#1154)

The following vulnerabilities are fixed by pinning transitive dependencies:
- https://snyk.io/vuln/SNYK-PYTHON-OAUTHLIB-3021142
- https://snyk.io/vuln/SNYK-PYTHON-PYSPARK-3021131

Signed-off-by: Eduardo Apolinario <[email protected]>

* Using sidecar handler to run Papermill task (#1143)

* remove nb prefix

Signed-off-by: Kevin Su <[email protected]>

* add tests

Signed-off-by: Kevin Su <[email protected]>

* Update requirements.in

Signed-off-by: Kevin Su <[email protected]>

* remove _

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* fix: plugins/flytekit-papermill/dev-requirements.txt to reduce vulnerabilities (#1145)

The following vulnerabilities are fixed by pinning transitive dependencies:
- https://snyk.io/vuln/SNYK-PYTHON-COOKIECUTTER-2414281

* Bump pyspark from 3.2.1 to 3.2.2 in /plugins/flytekit-papermill (#1130)

Bumps [pyspark](https://github.com/apache/spark) from 3.2.1 to 3.2.2.
- [Release notes](https://github.com/apache/spark/releases)
- [Commits](apache/spark@v3.2.1...v3.2.2)

---
updated-dependencies:
- dependency-name: pyspark
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: plugins/flytekit-papermill/dev-requirements.txt to reduce vulnerabilities (#1154)

The following vulnerabilities are fixed by pinning transitive dependencies:
- https://snyk.io/vuln/SNYK-PYTHON-OAUTHLIB-3021142
- https://snyk.io/vuln/SNYK-PYTHON-PYSPARK-3021131

Signed-off-by: Calvin Leather <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Eduardo Apolinario <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Matheus Moreno <[email protected]>
Signed-off-by: Niels Bantilan <[email protected]>
Signed-off-by: Rahul Mehta <[email protected]>
Signed-off-by: Vanshika Chowdhary <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Calvin Leather <[email protected]>
Co-authored-by: Yee Hing Tong <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Co-authored-by: Matheus Moreno <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Niels Bantilan <[email protected]>
Co-authored-by: Rahul Mehta <[email protected]>
Co-authored-by: Vanshika Chowdhary <[email protected]>
Co-authored-by: Vanshika Chowdhary <[email protected]>
Co-authored-by: Snyk bot <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants