-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streamline fast register options with new flag #2690
Conversation
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2690 +/- ##
==========================================
+ Coverage 91.99% 94.67% +2.68%
==========================================
Files 18 87 +69
Lines 874 4113 +3239
==========================================
+ Hits 804 3894 +3090
- Misses 70 219 +149 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a first past review.
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
@@ -58,28 +100,74 @@ def fast_package( | |||
ignores = default_ignores | |||
ignore = IgnoreGroup(source, ignores) | |||
|
|||
# Remove this after original tar command is removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we keeping the original tar command? Is it for backward compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup.
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Background
Flytekit has a notion of a "fast register" (additional docs) which copies a set of files from the local host, compresses and uploads them, and instructs tasks registered to download and extract those files before execution.
Current UX
The behavior today and how the user invokes the behavior is as follows
pyflyte run
pyflyte run
was initially written as a way for users to quickly run one script. The fast registration aspect of this command copied this one script into the zip file and into the final container. Over time, support for additional Python files was added via the--copy-all
flag. Setting this flag copies all unignored files from the detected root folder into the archive.More recently, default behavior for
run
was updated to automatically detect and include all local Python files that were loaded in the course of loading the target script.pyflyte register
pyflyte register
by default runs in 'fast' mode as well. By default, it behaves like the--copy-all
flag inrun
, copying all unignored files from the detected root folder.To turn off this behavior, the
--non-fast
flag was added.pyflyte package
pyflyte package
by default does not run in 'fast' mode. That is by default, there are no files copied/extracted.To enable fast packaging,
--fast
must be added.What changes were proposed in this pull request?
The differences in behavior, the differences in switches, and the fact that a new way to more intelligently detect which files should be copied means we have the opportunity to streamline the ux here.
Aligned UX
This PR introduces a
--copy
flag (listed currently as a beta) that will deprecate the current set of flags. For now, both sets of flags can be used.In short, use
--copy all
for the current fast register behavior (copying all non-ignored files)--copy auto
for the new behavior of copying only loaded Python modules.--copy none
to not copy any files. This is basically turning off fast registration entirely.Examples
Example use of old and new flag. Again, all old flags will continue to work. In the examples, the
--copy
switch is explicitly written to a) just highlight what the default is and b) if specified, the logic is slightly different than if not specifying it (see the discussion on changes below).Run
To smart-copy just loaded python modules (the default)
Old:
pyflyte run a/c/d.py
New:
pyflyte run [--copy auto] a/c/d.py
To copy all files
Old:
pyflyte run --copy-all a/c/d.py
New:
pyflyte run --copy all a/c/d.py
Register
To copy all unignored files (the default)
Old:
pyflyte register a/b,a/c/d.py
New:
pyflyte register [--copy all] a/b,a/c/d.py
To smart copy only relevant Python modules
pyflyte register --copy auto a/b,a/c/d.py
To turn off fast register completely:
Old:
pyflyte register --non-fast a/b,a/c/d.py
New:
pyflyte register --copy none a/b,a/c/d.py
Package
To package normally with no files copied (the default)
Old:
pyflyte --pkgs a.b,c.d package
New: `pyflyte --pkgs a.b,c.d package [--copy none]
To package with all files:
Old:
pyflyte --pkgs a.b,c.d package --fast
New:
pyflyte --pkgs a.b,c.d package --copy all
To package with only loaded Python modules:
pyflyte --pkgs a.b,c.d package --copy auto
Debugging
To help users understand what files are slated to be copied, each command also takes a
--ls-files
flag.Serialize
pyflyte serialize
is missing from the above discussions because it is for the most part a duplicate of thepackage
command. We are planning to deprecateserialize
actually.Changes
To make the new flag possible, especially in the context of the
auto
behavior of detecting which modules are loaded, this pr had toregister
which previously had uploaded files first.Note
If you use the default/use the old flags, the old tar logic will be triggered. Only if you use the new
--copy
flag will the new logic for tar be used.Note
This means that your code can no longer depend on the result of the upload. We think this is an extreme edge case, but let us know if not.
Code updates
serialize
function was split up into two functions to allow pyflyte to first load python modules, then upload files, then actually compile the Flyte entities.load_packages_and_modules
renamed tolist_packages_and_modules
because it no longer loads anything.FastPackageOptions
class to capture the new behavior.How was this patch tested?
Setup process
Screenshots
Check all the applicable boxes
Related PRs
Docs link