
RFC: initial stab at TorchScript fallback #7401

Merged: 4 commits into apache:main on Mar 9, 2022

Conversation

@t-vi (Contributor) commented Feb 3, 2021

This patch adds support for calling TorchScript. This can be used as a fallback for when torch operators are not yet implemented, or if one wants to incorporate bespoke PyTorch custom ops into TVM with ease.

It adds

  • a new relay torchop that takes a variable number of inputs and executes a provided TorchScript (aka PyTorch JIT) function,
  • a backend module class that calls into LibTorch (aka the C++ bindings underlying PyTorch), executing a TorchScript function.

The key addition to the relay infrastructure concerns variable-argument operators: while leaving num_inputs == -1 at operator registration is documented to indicate a variable number of inputs, the type inference pass is not prepared to deal with it and instead requires the number of arguments provided to match the number of arguments declared via add_argument at operator registration. The proposed change to type inference is to match the declared arguments as before, but to allow additional arguments when num_inputs is -1.

The other detail is that it uses a string attribute in the call node's attributes to carry the serialized TorchScript. This is a bit fishy, as the serialized representation is binary. I used serialization to get TorchScript into TVM at the C++ level because it is tricky to interoperate between PyBind-wrapped objects (TorchScript in PyTorch) and the TVM FFI; we might pass things around as handles later instead (though I'm not sure that works well with attributes). I would be glad to have your advice on this.
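To make that concrete, here is a minimal sketch (not this PR's exact code) of producing such a serialized blob on the PyTorch side; how the blob ends up in the call node's attributes is only hinted at in the comments:

```python
import io
import torch

class MyOp(torch.nn.Module):
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # some bespoke computation with no relay equivalent
        return torch.nn.functional.gelu(x) * y

# Script the module and serialize it; a binary blob like this is what would be
# stored in a string attribute on the relay call node and later deserialized
# and executed by the libtorch runtime module.
scripted = torch.jit.script(MyOp())
buf = io.BytesIO()
torch.jit.save(scripted, buf)
serialized_torchscript = buf.getvalue()
```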

Currently I only support one output tensor and FP32, but that is straightforward to make more flexible, and I would do this in parallel to the more fundamental discussion, e.g. around the points above.

Even though this is in draft state, I'm opening this PR to make discussions concrete in terms of code.

I also posted a discussion to the forum.

@masahi (Member) commented Feb 3, 2021

This is an interesting use case of byoc, cc @zhiics @comaniac

@masahi (Member) commented Feb 3, 2021

I'm curious how it integrates with PyTorch frontend. Do we convert every op not supported to relay.torchop, run BYOC flow to get TorchScript subgraphs, and send them to libtorch? Sounds interesting!

@t-vi (Contributor, Author) commented Feb 4, 2021

I'm curious how it integrates with PyTorch frontend. Do we convert every op not supported to relay.torchop, run BYOC flow to get TorchScript subgraphs, and send them to libtorch? Sounds interesting!

This is how I'd like it to work out. I've been thinking about what the best "level" is, and while the operator level might seem attractive, I'm not sure there is an easy way to run individual operators.
It won't help with our favourite inplace problems, though.
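As a rough sketch, this is the generic BYOC partitioning flow such a fallback could plug into; the "torchscript" target name below is only a placeholder, not something this PR registers:

```python
from tvm import relay, transform

def partition_for_torch_fallback(mod):
    # Annotate ops claimed by a hypothetical "torchscript" codegen, merge
    # adjacent annotated regions, and split them out as external functions
    # that a libtorch-backed runtime module could then execute.
    seq = transform.Sequential([
        relay.transform.AnnotateTarget("torchscript"),
        relay.transform.MergeCompilerRegions(),
        relay.transform.PartitionGraph(),
    ])
    return seq(mod)
```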

@comaniac (Contributor) commented Feb 4, 2021

I have the same question as masahi. IIUC, after this PR, the PyTorch frontend has the capability to convert all unsupported ops to torchop so that we can guarantee the flow would work. This is an interesting idea, and it would be the first BYOC use case that could potentially incorporate two BYOC backends (Torch and others like TensorRT).

cc @zhiics

@t-vi (Contributor, Author) commented Feb 5, 2021

Yeah, the general idea is to use this as the fallback. I can add the fallback generation here in the PR if that is better.
Also, I added a bit of a pro/con discussion regarding single op vs. program on the forum thread; if you have opinions, I'd be very grateful if you could chime in.

@PhilippvK (Contributor)

Hey @t-vi, the idea of a fallback for unsupported TorchScript Ops is great. I am currently pursuing a similar approach for unsupported (and custom) TFLite Ops.

I also stumbled over the issue that num_inputs == -1 leads to problems in the type_infer step and "solved" it in a rather bad way by just packing all my inputs into a single Tuple and using num_inputs = 1. I would really appreciate getting at least your fix for this issue merged upstream, maybe in a separate PR, as this is not really related to the TorchScript use case.
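Roughly, the workaround looks like this; the packed form mirrors how relay's built-in variadic ops (e.g. concatenate) already take a single Tuple input, and the custom-op name in the comment is just a placeholder:

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 16), dtype="float32")
y = relay.var("y", shape=(1, 16), dtype="float32")

# Pack the variadic inputs into a single relay Tuple so the op only needs to
# declare one argument (num_inputs = 1); relay's own variadic ops such as
# concatenate use the same single-Tuple-input pattern.
packed = relay.Tuple([x, y])
out = relay.concatenate(packed, axis=0)

# A custom op would be called the same way, e.g. (placeholder name, assuming
# the op is registered elsewhere with a single Tuple argument):
# out = relay.Call(relay.op.get("my_packed_op"), [packed])

mod = tvm.IRModule.from_expr(relay.Function([x, y], out))
print(relay.transform.InferType()(mod))  # type inference is happy with the packed form
```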

@t-vi (Contributor, Author) commented May 31, 2021

I would really appreciate getting at least your fix for this issue merged upstream, maybe in a separate PR, as this is not really related to the TorchScript use case.

I'm all for it, but I wouldn't know how to add tests without something using it. If you or @masahi have any opinion on this, I'd be glad to split it out...

@t-vi (Contributor, Author) commented Nov 18, 2021

Just a quick note that when I tried to revive this back in the summer it got a bit stalled around pytorch/pytorch#65047 .

@masahi (Member) commented Jan 9, 2022

@t-vi Is this still WIP? I'm happy to merge whenever you think it is ready.

@t-vi (Contributor, Author) commented Jan 10, 2022

@masahi So I had hoped to get the dlpack header version in PyTorch bumped (see the linked bug), but Facebook has internal uses that make it insist on the old one.
I wonder if we could work around it by providing a "dlpack-compat" header that defines the names for the types / fields? Or I could try to somehow separate the two.

@masahi (Member) commented Jan 10, 2022

I wonder if we could work around it by providing a "dlpack-compat" header

Does this mean the same thing as pytorch/pytorch#65047 (comment)? (That sounds good to me.) Anyway, it seems we cannot count on the PyTorch side to change, so I'd say anything that can unblock you from our side should be a good plan. cc @tqchen @junrushao1994

@masahi (Member) commented Jan 10, 2022

As commented by the author of PyTorchTVM in apache/tvm-rfcs#25 (comment), many people are interested in this feature. People are also actively talking about deeper integration of TVM into the PyTorch workflow. So we should definitely land this.

As for me personally, I want to try out this feature to import models with FFT-related ops (specifically, this cool model https://github.com/raoyongming/GFNet). FFT ops won't come to TVM any time soon.
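For example, GFNet's core block is built on torch.fft, which has no relay lowering; a rough sketch of that kind of block (shapes and layout are illustrative, not the model's actual code):

```python
import torch

class GlobalFilter(torch.nn.Module):
    """GFNet-style filter: elementwise multiply in the 2D Fourier domain."""

    def __init__(self, h=14, w=8, dim=384):
        super().__init__()
        # real/imaginary parts stored in the trailing dimension of size 2
        self.complex_weight = torch.nn.Parameter(torch.randn(h, w, dim, 2) * 0.02)

    def forward(self, x):  # x: (B, 14, 14, 384)
        x = torch.fft.rfft2(x, dim=(1, 2), norm="ortho")
        x = x * torch.view_as_complex(self.complex_weight)
        return torch.fft.irfft2(x, s=(14, 14), dim=(1, 2), norm="ortho")
```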

@t-vi (Contributor, Author) commented Jan 10, 2022

So I had thought I could wait it out, but I'll look into working around the version discrepancy in the next few weeks.

@masahi (Member) commented Jan 11, 2022

Another interesting use case this fallback could enable is mmdetection (https://github.com/open-mmlab/mmdetection). It has a lot of cool detection models, but they rely on many custom ops that cannot be converted to relay.

@t-vi (Contributor, Author) commented Jan 21, 2022

M. Ruberry of the PyTorch team re-landed the update of dlpack.h in PyTorch. If this still holds next week, it'll be exciting to bring this up to date. :)

@t-vi t-vi requested a review from icemelon as a code owner March 8, 2022 06:46
@t-vi (Contributor, Author) commented Mar 8, 2022

Hi, so I finally rebased this, and it all compiles and runs one test against a current PyTorch master, so I think I'm back in business with this PR (unless it has been obsoleted, but from what I understand, the bridge is in the other direction).

@masahi (Member) commented Mar 8, 2022

Great! Can you remove WIP from the title now?

@t-vi (Contributor, Author) commented Mar 8, 2022

Are you on the tvm discord or so to quickly discuss?

@t-vi changed the title from "WIP/RFC: initial stab at TorchScript fallback" to "RFC: initial stab at TorchScript fallback" on Mar 8, 2022
@masahi (Member) commented Mar 8, 2022

yeah I've just logged in.

@masahi masahi merged commit fe33ed6 into apache:main Mar 9, 2022
@t-vi (Contributor, Author) commented Mar 9, 2022

Thank you @masahi !

pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
* initial stab at TorchScript fallback

* make types more flexible

* Easy review bits. Thank you @masahi

@sunggg (Contributor) commented Sep 29, 2022

Hi, @t-vi. Thank you for the great contribution! This seems very interesting.
I'm trying to check out this feature but am hitting an ABI link issue during setup: ld.lld: error: undefined symbol.
This seems like a known issue (link), so I followed the suggestion of adding -D_GLIBCXX_USE_CXX11_ABI=0.
However, this change makes TVM's LLVM codegen fail with an ABI link error.
Have you faced this issue before?
I'm wondering what the general guideline is for fixing such ABI conflicts.
Thank you!

For more detail: I installed the pytorch package with conda and downloaded the prebuilt libtorch binary by following the instructions.
Then I modified my config.cmake to set(USE_LIBTORCH ~/libtorch).

@t-vi (Contributor, Author) commented Sep 29, 2022

The fundamental problem is that (pre-compiled) PyTorch Python modules use the pre-C++11 string ABI, to better blend into the Python ecosystem or so. TVM does not, so it needs to link against a LibTorch built with the "new" C++ string ABI.
But these two versions clash.

One option is to use self-compiled PyTorch (which is what I do, and which is how we got into this mess...).

I've been trying on and off to look into linking LibTorch such that all symbols are hidden (so that you can load PyTorch with whatever C++ ABI later), but I have not managed it yet.
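For anyone debugging this, a quick way to check which ABI a given PyTorch build uses (and where its cmake files live) before pointing USE_LIBTORCH at a libtorch:

```python
import torch

# False for the usual prebuilt pip/conda packages (pre-C++11 string ABI),
# True for most self-compiled builds with a recent toolchain.
print(torch._C._GLIBCXX_USE_CXX11_ABI)

# Directory with the Torch cmake config that ships with this build; handy for
# deciding what to point USE_LIBTORCH / CMAKE_PREFIX_PATH at.
print(torch.utils.cmake_prefix_path)
```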

@sunggg (Contributor) commented Sep 29, 2022

@t-vi, thank you for the quick reply!
Okay, I wanted to avoid compiling PyTorch myself, but it seems like there is no other option now, haha.
Let me try! Appreciate it :)

@shingjan

@sunggg This PR (#12232) used a different way to solve the C++ ABI version mismatch between TVM and PyTorch; it may be worth looking into.
