torchvision.ops.batched_nms() crashes with pytorch 1.9.0 and torchvision 0.10.0 #4071
Comments
Can also report a REGRESSION.
|
Thanks for the reports. We are looking into this |
For ref I'm unable to reproduce on OSX with
|
FYI I've also tried with
on a GPU machine and it worked fine.
I'm now trying on conda with the same environment |
Have the same issue, installed with conda, also a GPU machine. |
On a Linux GPU machine it looks like torchvision 0.2.2 gets installed. I tried both with cuda 10.2 and 11.1 and both fail with
|
@NicolasHug regarding 0.2.2: yesterday I also observed that conda sometimes found only this version when uninstalling torchvision 0.10.0 and reinstalling it, but I am unable to reproduce this at the moment. |
Looking at https://anaconda.org/pytorch/torchvision/files, it seems that the py39_cu102 and py39_cu111 builds are available, so I'm not sure why they are not being found.
@malfet @seemethere there are problems with the torchvision CUDA binaries on Linux for Python 3.9 (details in #4071 (comment)).
I've also just tried with Python 3.8, and even though I'm able to install matching versions, I get the same issue as originally reported in #4071 (comment).
In https://anaconda.org/pytorch/torchvision/files, the torchvision binaries date from 14 days ago. Are we sure we copied the new ones that were regenerated? The torchvision RCs in https://anaconda.org/pytorch-test/torchvision/files were generated yesterday, so maybe we copied the wrong files when promoting the binaries? |
Hmm, sample code fails for me with
This one works as expected:
|
@malfet you are right, I should have mentioned that I just created these tensors to satisfy the inputs without caring whether they were actually valid input. It still fails with the above-mentioned error on my machine. I have, however, updated the sample code above accordingly. |
Since @fmassa pointed to https://anaconda.org/pytorch-test/torchvision/files, I just installed from there and the sample works. |
torchvision in https://anaconda.org/pytorch channel was built against 9d5561b whereas one in https://anaconda.org/pytorch was built against ae9963f |
@malfet yes, I've tested by installing torchvision from the @malfet note that there are no functional differences in the torchvision code between 9d5561b and ae9963f; only the PyTorch version has changed since the RC was cut |
I just checked the packages on PyTorch channel, and they are up-to-date now and the code is working. Am I allowed to close this issue then? |
@egonuel Could you please detail the command that you run and that's now working? When I run |
@NicolasHug mmhh, this seems to be a different issue. When I run the line you posted on my machine, everything is fine and 0.10.0 is being installed |
I think difference can be explained by presence/absence of
|
With the just-released PyTorch 1.9.0 and torchvision 0.10.0, torchvision.ops.batched_nms() crashes on my machine with the following error:
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
How can I solve this, please? |
@ChouCHou-y this issue should have been fixed in #4240 (comment). Can you try uninstalling torchvision and installing it again? |
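The error message above asks you to compare the installed PyTorch and torchvision versions. A minimal sketch of that check follows. It reads package metadata instead of importing either library, so it runs even when the import itself is broken. The `KNOWN_PAIRS` table is a small hand-copied excerpt of the compatibility matrix and the helper names are hypothetical, not part of torchvision. Note that in this thread both version numbers matched (1.9.0 and 0.10.0) and the binaries were still incompatible, so a "match" here is only a first step and does not rule out a bad build.

```python
from importlib import metadata

# Assumption: hand-written excerpt of the PyTorch/torchvision compatibility
# matrix. Extend it from the official table as needed.
KNOWN_PAIRS = {
    "1.9.0": "0.10.0",
    "1.8.1": "0.9.1",
    "1.8.0": "0.9.0",
    "1.7.1": "0.8.2",
}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def base_version(version):
    """Drop a local build tag such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def check_pair(torch_version, torchvision_version):
    """Classify a version pair as 'match', 'mismatch', or 'unknown'."""
    expected = KNOWN_PAIRS.get(base_version(torch_version))
    if expected is None:
        return "unknown"  # torch version is not in our excerpt
    if base_version(torchvision_version) == expected:
        return "match"
    return "mismatch"


if __name__ == "__main__":
    torch_v = installed_version("torch")
    tv_v = installed_version("torchvision")
    if torch_v is None or tv_v is None:
        print("torch and/or torchvision is not installed")
    else:
        print(f"torch {torch_v}, torchvision {tv_v}: {check_pair(torch_v, tv_v)}")
```

A "mismatch" result would correspond to the case reported above where conda resolved torchvision 0.2.2 next to PyTorch 1.9.0.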
🐛 Bug
With the just-released PyTorch 1.9.0 and torchvision 0.10.0, torchvision.ops.batched_nms() crashes on my machine with the following error:
Since both are the current versions, I guess they should be compatible (they are not yet listed in the compatibility matrix).
To Reproduce
Steps to reproduce the behavior:
this example code shows the behavior on my machine:
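The reporter's original snippet is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of what such a check could look like. The box coordinates, scores, category indices, and the helper name `check_batched_nms` are made up for illustration. The helper reports a status string instead of raising, so it distinguishes a missing install from the "Couldn't load custom C++ ops" failure discussed here.

```python
def check_batched_nms():
    """Call torchvision.ops.batched_nms once and describe what happened."""
    try:
        import torch
        from torchvision.ops import batched_nms
    except ImportError as exc:
        return f"import failed: {exc}"

    # Boxes are (x1, y1, x2, y2) floats. Boxes 0 and 1 overlap heavily and
    # share a category; box 2 is far away and in a different category.
    boxes = torch.tensor([
        [0.0, 0.0, 10.0, 10.0],
        [1.0, 1.0, 11.0, 11.0],
        [50.0, 50.0, 60.0, 60.0],
    ])
    scores = torch.tensor([0.9, 0.8, 0.7])
    idxs = torch.tensor([0, 0, 1])  # int64 category index per box

    try:
        keep = batched_nms(boxes, scores, idxs, iou_threshold=0.5)
    except RuntimeError as exc:
        # With a broken binary this is where the custom-ops error surfaces.
        return f"ops failed: {exc}"
    return f"ok: kept indices {keep.tolist()}"


status = check_batched_nms()
print(status)
```

On a working install, boxes 0 and 1 have an IoU of 81/119 (about 0.68), so the lower-scoring box 1 is suppressed and the kept indices are `[0, 2]`.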
Expected behavior
This should not result in an error.
Environment
Collecting environment information...
PyTorch version: 1.9.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.27
Python version: 3.9 (64-bit runtime)
Python platform: Linux-4.15.0-144-generic-x86_64-with-glibc2.27
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
Nvidia driver version: 460.32.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.9.0
[pip3] torchaudio==0.9.0a0+33b2469
[pip3] torchvision==0.10.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 h8f6ccaa_8 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.2.0 h726a3e6_389 conda-forge
[conda] mkl-service 2.4.0 py39h3811e60_0 conda-forge
[conda] mkl_fft 1.3.0 py39h42c9631_2
[conda] mkl_random 1.2.2 py39hde0f152_0 conda-forge
[conda] numpy 1.20.2 py39h2d18471_0
[conda] numpy-base 1.20.2 py39hfae3a4d_0
[conda] pytorch 1.9.0 py3.9_cuda10.2_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.9.0 py39 pytorch
[conda] torchvision 0.10.0 py39_cu102 pytorch
Additional context