Fix build #1479
@narendasan |
Thanks @yinghai for taking the time to dive deep into this issue. It helps us a lot in unblocking the 1.3 release!
-    ":use_pre_cxx11_abi": ["@libtorch_pre_cxx11_abi//:libtorch"],
-    "//conditions:default": ["@libtorch//:libtorch"],
+    ":use_pre_cxx11_abi": ["@libtorch_pre_cxx11_abi//:libtorch", "@libtorch_pre_cxx11_abi//:c10_cuda"],
+    "//conditions:default": ["@libtorch//:libtorch", "@libtorch//:c10_cuda"],
Did you forget to mention in the diagnosis that we also need to link with c10_cuda? :-)
We probably don't need that. But it doesn't hurt.
How to debug
1. Run python3 -c "import torch_tensorrt; torch_tensorrt.dump_build_info()" and see that the issue can be reproduced.
2. Run LD_DEBUG=libs python3 -c "import torch_tensorrt; torch_tensorrt.dump_build_info()" and locate the torch libraries at /opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch/lib/libc10_cuda.so.
3. Run nm -C /opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch/lib/libc10_cuda.so | grep CUDACachingAllocator | grep T and notice that, oddly, the output doesn't contain the symbol c10::cuda::CUDACachingAllocator::allocator. It seems that the libs are not from the torch 1.14 nightly.
4. Run python3 -c "import torch; print(torch.__version__)", which shows 1.13.0+cu117. This is very weird, as we are supposed to install 1.14.0+cu116.
5. In the Install torch-tensorrt stage, we are actually uninstalling the previously installed torch 1.14.
6. Look at py/setup.py and find that we are forcing the torch version to be "torch>=1.13.0.dev0,<1.14.0". Fix that.
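The version mismatch found in the debugging steps above can be illustrated with a small sketch. It checks the two torch versions mentioned in the description against the bounds of the pin "torch>=1.13.0.dev0,<1.14.0". The helper names release_tuple and satisfies_pin are hypothetical, and the comparison is a deliberate simplification that looks only at the numeric release part; it is not a PEP 440 implementation and not how pip itself resolves requirements.

```python
import re


def release_tuple(version: str) -> tuple:
    """Reduce a version string such as '1.14.0+cu116' to (1, 14, 0).

    Simplification (assumption): the local label ('+cu116') and any
    '.devN' suffix are discarded before comparing.
    """
    public = version.split("+", 1)[0]
    match = re.match(r"\d+(?:\.\d+)*", public)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_pin(version: str, lower: str = "1.13.0.dev0", upper: str = "1.14.0") -> bool:
    """Return True if lower <= version < upper, using numeric release parts only."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


# The two versions that appear in the debugging notes.
for candidate in ("1.13.0+cu117", "1.14.0+cu116"):
    print(candidate, satisfies_pin(candidate))
```

Under these assumptions, 1.13.0+cu117 falls inside the pinned range and 1.14.0+cu116 does not, which is consistent with the observation that installing torch-tensorrt replaced the previously installed torch 1.14.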