Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(torchx/components) Fix entrypoint loading to deal with deferred load… #695

Merged
merged 1 commit into from
Feb 26, 2023

Conversation

kiukchung
Copy link
Collaborator

@kiukchung kiukchung commented Feb 26, 2023

Fixes a bug + enhances custom component registration.

  1. [bugfix] torchx not allowing components to be registered via [torchx.components] entrypoint because of a change in utils.entrypoints.load_entrypoints() that defers the entrypoint load by wrapping the entrypoints with a _defer_load_fn() but assumes that all entrypoints are functions (e.g. foo.bar:baz) and does not handle modules (e.g. foo.bar). Adds graceful handling of module entrypoints.

  2. [enhancement] Allows users to specify an "empty" group (e.g. no prefix) for component names by using the _* "special" prefix in the alias key (similar to "hidden" variables in python). Example

    [torchx.components]
     _0 = torchx.components.dist
     _1 = torchx.components.utils
    

    names dist.ddp as just ddp and utils.python as just python

  3. [change] If custom components are registered via entrypoints, skips loading builtins. Users can still load the builtins by adding an extra line in their entrypoint as:

    [torchx.components]
    foo = bar.baz
    _ = torchx.components
    

    NOTE: this is a BC incompatible change, HOWEVER given that component registration via entrypoints was broken, I'm assuming that no one was registering their own components.

  4. [docs] Reflected the changes above in the docs page

  5. [todo fix] resolves [file-linter] Refactor file-linter and components-finder #261 since we no longer have components_base.py file
    Test plan:


Added unittests + revised unittests for utils.entrypoints and specs.finder

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 26, 2023
@d4l3k
Copy link
Member

d4l3k commented Feb 26, 2023

@kiukchung lots of CI failures (docs, pyre, unit) are the relevant ones I think

Copy link
Member

@d4l3k d4l3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM



When you register your own components, TorchX will not include its own builtins. To add TorchX's
builtin components you must specify another entry as:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may want to mention you can do this in the config as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the one supported by .torchxconfig is super weird (I want to deprecate it in favor of supporting "alias" -> "module_name" style custom components from .torchxconfig). Currently what you can do is to have a section like

# .torchxconfig
[torchx.file]
get_file_contents = my.custom.file_reader:read

And when you run

torchx run my/custom.py:component_fn

We call my.custom.file_reader.read("my/custom.py") which we expect to load the component_fn into the global namespace. Took me a while to wrap my head around it, so I don't expect anyone to actually use it.

Instead I'd like to support something like this from .torchxconfig:

# .torchxconfig
[torchx.components]
foo = my.custom.component
_ = torchx.components

which would be the same as doing

# setup.py

entrypoints = {
  "torchx.components": [
        "foo = my.custom.component",
        "_ = torchx.components",
   ],
}

entry_points={
"torchx.components": [
"foo = my_project.bar",
"torchx = torchx.components",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this now works recursively? got it, interesting OK

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was always loading recursively (by file not by module though)

entry_points={
"torchx.components": [
"_0 = my_project.bar",
"_1 = torchx.components",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to drop the dist. and utils. prefixes for TorchX components now? not a blocker for this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah so we could drop the "module" name from the builtin components (as discussed offline we'd have to keep the alias dist.*,util.* etc while hiding them from torchx builtins). Should be easy to do now since we can make the builtins load:

ModuleComponentsFinder("torchx.components.dist", group="")
ModuleComponentsFinder("torchx.components.utils", group="")
# ... and so on...

@kiukchung kiukchung force-pushed the dev-kiuk branch 2 times, most recently from f524756 to 0e0eab0 Compare February 26, 2023 20:39
@codecov
Copy link

codecov bot commented Feb 26, 2023

Codecov Report

Merging #695 (9e30ed6) into main (c74bb9c) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #695      +/-   ##
==========================================
+ Coverage   92.39%   92.46%   +0.06%     
==========================================
  Files          83       86       +3     
  Lines        5670     5666       -4     
==========================================
  Hits         5239     5239              
+ Misses        431      427       -4     
Impacted Files Coverage Δ
torchx/specs/finder.py 99.33% <100.00%> (+2.34%) ⬆️
torchx/specs/test/components/a/__init__.py 100.00% <100.00%> (ø)
torchx/specs/test/components/a/b/c.py 100.00% <100.00%> (ø)
torchx/specs/test/components/c/d.py 100.00% <100.00%> (ø)
torchx/util/entrypoints.py 100.00% <100.00%> (ø)
torchx/components/dist.py 96.36% <0.00%> (-0.07%) ⬇️
torchx/schedulers/kubernetes_mcad_scheduler.py 94.02% <0.00%> (-0.02%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@kiukchung kiukchung force-pushed the dev-kiuk branch 3 times, most recently from c675569 to 6cf97e9 Compare February 26, 2023 22:14
…ing of modules to enable component registration to work properly
@kiukchung
Copy link
Collaborator Author

kiukchung commented Feb 26, 2023

Also resolves: #694 by passing os.getenv["CONTAINER_REPO"] as the container_repo argument to push_image() in scripts/kube_dist_trainer.py

@kiukchung kiukchung merged commit 084479f into main Feb 26, 2023
@kiukchung kiukchung deleted the dev-kiuk branch February 26, 2023 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[file-linter] Refactor file-linter and components-finder
3 participants