❓ [Question] Is it possible to use a model optimized through TorchTensorRT in LibTorch under Windows? #856
Comments
Did you verify that your GPU is accessible in WSL as well as in a container inside WSL?
I tried out one of our example notebooks in WSL2 in the 22.01 container. Seems like things work properly. I would make sure that your GPU is accessible from WSL.
Also, are you planning to run this model in deployment inside WSL or in Windows? IIRC, there isn't necessarily compatibility across operating systems (WSL would fall under Linux). @ncomly-nvidia do you know? I think, however, that running in WSL should be fine as long as it fits your use case.
This is on Windows 10 21H2, with CUDA 11.6 installed on the system, following these instructions: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
Hi @narendasan, thanks for your answer. I solved the first problem (now I have the same behaviour in both WSL and Ubuntu, which is great!) by downloading and installing the latest driver from here. But now I have another problem: I really NEED to use the optimized model in a Windows environment (and not WSL) with LibTorch. This is the C++ script I'm using to test if the model is functioning correctly:
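(The original snippet was not preserved; what follows is a minimal sketch of such a loading test, assuming the optimized module was saved as model_trt.ts and takes a 1x3x512x512 CUDA input — both are placeholders:)

#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
    try {
        // Deserialize the TorchScript module; the path is a placeholder.
        torch::jit::Module module = torch::jit::load("model_trt.ts");
        module.to(torch::kCUDA);
        module.eval();

        // Dummy forward pass; the input shape is an assumption.
        std::vector<torch::jit::IValue> inputs;
        inputs.push_back(torch::randn({1, 3, 512, 512}, torch::kCUDA));
        torch::Tensor out = module.forward(inputs).toTensor();
        std::cout << "Output sizes: " << out.sizes() << std::endl;
    } catch (const c10::Error& e) {
        std::cerr << "Failed to load/run the module:\n" << e.what() << std::endl;
        return -1;
    }
    return 0;
}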
If I try to run this C++ script with the optimized model, the program fails on loading. Is there any way to make this work? Thanks
You can try turning on debug logging to see if it is torch-trt's runtime failing. Also, it's worth trying with a non-compiled TorchScript module beforehand as well.
Hi @narendasan, what do you mean by "turning on debug logging"? The error, as shown in the stack trace, happens in an external .dll.
You can enable torchtrt debug logging with:
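(The exact call was elided; in the Torch-TensorRT C++ API this is presumably done through the logging header, along these lines:)

#include "torch_tensorrt/logging.h"

// Raise the reportable log level before loading the compiled module, so the
// Torch-TensorRT runtime prints debug output while the module is
// deserialized and executed.
torch_tensorrt::logging::set_reportable_log_level(
    torch_tensorrt::logging::Level::kDEBUG);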
Also, how did you build Torch-TensorRT for Windows?
Ok thanks, I will include here just the output of the tracing and optimization process through TorchTensorRT.
This is the output I have when I run the script with debug logging enabled. I didn't build TorchTensorRT for Windows; I'm using the latest PyTorch Docker container (22.01) on WSL2, following the official instructions here. But I need to run the model through LibTorch on Windows.
To run the model on Windows with LibTorch, you need, at minimum, to compile the Torch-TensorRT runtime for Windows.
Ok thanks @narendasan. Following your advice, I'm trying to compile the whole project on Windows 10; the idea is to build the Python package and optimize the model locally. Firstly, I was able to successfully run the command
...
In order to solve it, it was enough to change
The solution here is to change the file
to (👍)
I solved that, as suggested by @yuriishutkin in issue #690 (referenced in issue #226), by substituting in
with this (👍):
And then changing
to (👍):
That makes sense, since
to (👍):
Well, another problem, but apparently it was enough to rename
So I try again to build the Python package...
So, again in
with this (👍):
All I know is that the linker is trying to link against a static library
That's really cool that you got Windows compilation working! So really all you need to move forward with your specific use case is just linking/DL_OPENing the runtime library.
I actually have no
Moreover, I'm not sure I understand how linking against
I suspect that the reason a compiled module is throwing an error on load is that you need the LibTorch runtime extension, which adds support for deserializing and running Torch-TensorRT compiled modules. The lightest way to do this is by linking against the torchtrt_runtime library. Probably what you need to do is add the following cc_binary target:

cc_binary(
    name = "torchtrt_runtime.dll",
    srcs = [],
    linkshared = True,
    linkstatic = True,
    deps = [
        "//core/runtime:runtime",
        "//core/plugins:torch_tensorrt_plugins",
    ],
)
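Once a torchtrt_runtime.dll has been built from a target like the one above, one way to bring it into a LibTorch application is an explicit LoadLibrary call (the Windows counterpart of DL_OPENing libtorchtrt_runtime on Linux) before deserializing the module. A sketch, with placeholder paths:

#include <windows.h>
#include <torch/script.h>
#include <iostream>

int main() {
    // Loading the runtime DLL registers the Torch-TensorRT execution engine
    // and custom ops with LibTorch, so compiled modules can be deserialized.
    HMODULE runtime = LoadLibraryA("torchtrt_runtime.dll");
    if (runtime == nullptr) {
        std::cerr << "LoadLibrary failed, error " << GetLastError() << std::endl;
        return -1;
    }

    torch::jit::Module module = torch::jit::load("model_trt.ts");
    // ... run inference as in the earlier snippet ...
    return 0;
}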
@andreabonvini did you end up solving this issue? I am facing a similar problem now.
The torch_tensorrt.dll file is actually the name of the generated library, as narendasan said. But for me it was not the end of the story, because after I built the torch_tensorrt library it appeared to have conflicts with the installed torch library. It just crashed with an exception somewhere inside torch. I suppose it's because torch has C++ in its interface, and my compiler version differs from the compiler that was used to build torch. So the solution can be to match the compiler version that torch was built with, or to build torch from source. Alternatively, you can switch to WSL and install the prebuilt torch_tensorrt package or use a ready-made container from here: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
@jonahclarsen Hi! Unfortunately not, I think I was kinda able to generate
Okay, too bad! Hopefully we can figure it all out soon; I am highly motivated to get this into my LibTorch Windows program.
@yuriishutkin When I tried linking my program against torch_tensorrt.dll.if.lib, I still get 'unresolved external symbol' linker errors, even just using the Input() function that isn't in any namespace. Are you saying that file was enough for you to successfully link your program? Were you able to use namespaces like torchscript?
Right, I've added the runtime and plugin sources into the same library. Also, I had problems with exporting symbols, because MSVC does not have an option to export all symbols like GCC does. If you also use MSVC, you need to specify the exported symbols manually, e.g. in a module-definition (.def) file. For me the following worked:
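(The working exports.def is only in the attached archive; purely as an illustration of the format, an MSVC module-definition file lists one exported symbol per line under EXPORTS. The names below are hypothetical placeholders, not the real Torch-TensorRT symbols:)

LIBRARY torch_tensorrt
EXPORTS
    ; one entry per symbol the linker reports as unresolved
    placeholder_symbol_one
    placeholder_symbol_two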
exports.def is in the attached archive; place it near cpp/lib/BUILD. Yours can be different depending on the version of the lib you are using. Just add all unresolved externals to the list.
@yuriishutkin Okay, I went another route, by adding
Would you be willing to share your entire WORKSPACE and cpp/lib/BUILD files, or ideally even your entire project that was able to successfully compile the .lib file?
@jonahclarsen Sure, please take a look: https://github.com/yuriishutkin/Torch-TensorRT/tree/windows. I run the build in the py directory. It builds torch_tensorrt.dll + torch_tensorrt.dll.if.lib and then links it to the _C lib. The only thing is that I manually copy bazel-out\x64_windows-opt\bin\cpp\lib\torch_tensorrt.dll.if.lib to py\torch_tensorrt\lib\, because bazel does not copy this file automatically. But once again, for me the resulting _C lib does not load successfully in Python because of an exception inside torch.
❓ Question
I need to optimize an already-trained segmentation model through TorchTensorRT. The idea would be to optimize the model by running the newest PyTorch NGC Docker image under WSL2, exporting the model, and then loading it in a C++ application that uses LibTorch, e.g.
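(The inline example was lost in extraction; presumably a plain TorchScript load along these lines, with a placeholder path:)

torch::jit::Module module = torch::jit::load("model_trt.ts");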
Would this be the right approach?
What you have already tried
At the moment I have only tried to optimize the model through TorchTensorRT, and something weird happens. Here I'll show the results I obtained on two different devices for the Python script below:
As you can see, the optimization process under WSL gives me a lot of GPU errors, while on Ubuntu it seems to work fine. Why does this happen?
My script:
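(The script itself was not preserved; a minimal sketch of this kind of optimization script, assuming torch_tensorrt's TorchScript frontend, a tiny stand-in network, and an arbitrary 1x3x512x512 input shape — all placeholders for the author's real setup:)

import torch
import torch.nn as nn
import torch_tensorrt

# Tiny stand-in network; the author's real trained segmentation model is unknown.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 2, 1),  # 2-class segmentation logits
).eval().cuda()

# Trace to TorchScript, then compile with Torch-TensorRT.
traced = torch.jit.trace(model, torch.randn(1, 3, 512, 512, device="cuda"))
trt_module = torch_tensorrt.compile(
    traced,
    inputs=[torch_tensorrt.Input((1, 3, 512, 512))],
    enabled_precisions={torch.float},
)

# Save the optimized module for later loading from LibTorch.
torch.jit.save(trt_module, "model_trt.ts")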
Ubuntu desktop
Windows PC
Environment
Newest PyTorch NGC Docker image (22.01).
My Windows PC has an RTX 3080.
My Ubuntu desktop has a GTX 1080 Ti.
Additional context