
make C extension lazy-import #971

Merged
merged 2 commits into master May 30, 2019
Conversation

@soumith (Member) commented May 30, 2019

This allows one to use torchvision models via hub, and defers C extension loading until you invoke a particular op that calls into the C extension.

Arguably, this is worse wrt user experience, and I think we should improve it by at least invoking the lazy import when ops are used in constructors (but there's no easy / obvious way to do that for the functional bits, which are what is written in _C right now).

This also unblocks torch.hub so it can load models without failing on _C loading, because the models themselves are self-contained.

Follow-up tasks wrt torch.hub are:

  • change hubconf.py to make sure:
    • torchvision is not already loaded into the python session
    • if it is loaded, check that the loaded version matches the version being hub-sideloaded -- if not, raise a warning or error
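Those two checks might look roughly like the following sketch (assumed names, not the actual hubconf.py code):

```python
import sys
import warnings


def check_sideload(expected_version, module_name="torchvision"):
    """Return True if it is safe to hub-sideload `module_name`.

    Warns and returns False when a different version of the package is
    already loaded in this Python session.
    """
    mod = sys.modules.get(module_name)
    if mod is None:
        return True  # not loaded yet, so sideloading cannot conflict
    loaded = getattr(mod, "__version__", None)
    if loaded != expected_version:
        warnings.warn(
            f"{module_name} {loaded} is already loaded, but version "
            f"{expected_version} is being hub-sideloaded"
        )
        return False
    return True
```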

Follow-up tasks wrt _C loading are:

  • currently the CUDA checks are rarely actually hit, because _C loading fails (e.g. with "libcudart.so.9.0 not found") before execution reaches the if hasattr(_C, "CUDA_VERSION") check
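With loading deferred, the attribute check can run right after a successful import. A minimal sketch, with a fake object standing in for the loaded _C module:

```python
def cuda_version_of(ext):
    # Before the lazy-import change this check was rarely reached, because
    # loading _C itself failed first (e.g. "libcudart.so.9.0 not found").
    # It is only meaningful on an extension module that has actually loaded.
    if hasattr(ext, "CUDA_VERSION"):
        return ext.CUDA_VERSION
    return None  # CPU-only build, or attribute absent


class FakeCudaExt:
    CUDA_VERSION = 9000  # stand-in for a CUDA 9.0 build of _C
```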

cc: @ailzhang @fmassa

@ailzhang (Contributor) left a comment


Thanks!!
Nit: torchvision/ops/roi_pool.py should also have exactly the same change as torchvision/ops/roi_align.py.
I've tested hub with that change added and confirmed it works.

@codecov-io commented May 30, 2019

Codecov Report

Merging #971 into master will increase coverage by 0.02%.
The diff coverage is 48.27%.


@@            Coverage Diff            @@
##           master    #971      +/-   ##
=========================================
+ Coverage   61.07%   61.1%   +0.02%     
=========================================
  Files          64      65       +1     
  Lines        5082    5091       +9     
  Branches      763     764       +1     
=========================================
+ Hits         3104    3111       +7     
- Misses       1767    1769       +2     
  Partials      211     211
Impacted Files Coverage Δ
torchvision/__init__.py 62.5% <ø> (+20.07%) ⬆️
torchvision/ops/boxes.py 94.73% <100%> (+0.14%) ⬆️
torchvision/extension.py 38.09% <38.09%> (ø)
torchvision/ops/roi_pool.py 67.44% <66.66%> (-0.86%) ⬇️
torchvision/ops/roi_align.py 65.95% <66.66%> (-0.71%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 579eebe...6a75060.

@fmassa (Member) commented May 30, 2019

So hub currently doesn't support loading detection models, or models with custom ops?
This makes me wonder whether moving the C++ ops into PyTorch wouldn't be a safer approach.

@soumith (Member, Author) commented May 30, 2019

Hub supports both; we just have to update the logic for them. See my follow-up tasks above.

This makes me wonder whether moving the C++ ops into PyTorch wouldn't be a safer approach.

Fixing a packaging problem is the solution -- not making a mono-repo.

@soumith soumith merged commit 220b69b into master May 30, 2019
@fmassa fmassa deleted the cextension_lazy branch May 30, 2019 22:47
@ailzhang (Contributor) commented May 31, 2019 via email

facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request May 31, 2019
Summary:
This should pass once pytorch/vision#971 is merged.
To remove torchvision as a baseline dependency, we compare against the sum of all param.sum() values in the pretrained resnet18 model; this means the hard-coded number only needs a manual update when the pretrained weights change, which is generally rare.
Pull Request resolved: #21132

Differential Revision: D15563078

Pulled By: ailzhang

fbshipit-source-id: f28c6874149a1e6bd9894402f6847fd18f38b2b7
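The baseline trick in the commit message above (one scalar built from the sum of every parameter's sum) can be illustrated without torch; the real test applies the same idea to param.sum() over the pretrained resnet18's parameters:

```python
def weights_fingerprint(params):
    # Sum of per-parameter sums: a single scalar that changes whenever the
    # pretrained weights change, so the hard-coded expected value only
    # needs a manual update on a (rare) weights refresh.
    return sum(sum(p) for p in params)
```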
@t-vi (Contributor) commented Jul 11, 2019

Just a quick shout that the lazy import mechanism, as currently done, is likely not a good fit for scriptable ops.
Maybe the lazy import could be done "earlier", so it happens when you actually use a function wrapping the autograd.Function. Magically changing the module (similar to torch.backends.*) is evil too, but it might integrate a tad better.
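The "magically changing the module" idea has had a standard form since Python 3.7: PEP 562 lets a module define __getattr__, so the real import runs on first attribute access. A sketch, using the stdlib math module as the stand-in target rather than a compiled extension:

```python
import importlib
import types


def make_lazy_module(name, target):
    """Build a proxy module whose attributes resolve from `target` lazily."""
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when `attr` is missing from mod.__dict__, so the
        # real import is deferred until the first attribute access.
        real = importlib.import_module(target)
        return getattr(real, attr)

    # PEP 562: a __getattr__ in the module namespace handles lookup misses.
    mod.__getattr__ = __getattr__
    return mod
```

Here make_lazy_module("lazy_math", "math").floor(3.7) only imports math at the .floor access; the same shape could proxy a compiled extension.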

@fmassa (Member) commented Jul 11, 2019

@t-vi This is a good point. We are currently looking into ways of making some ops scriptable in torchvision, so this is something that we need to fix at some point.

cc @fbbradheintz
