
Check 'onnxruntime-gpu' if torch.has_cuda #5087

Merged: glenn-jocher merged 3 commits into master from detect/onnxruntime-gpu on Oct 13, 2021

Conversation

glenn-jocher (Member) commented Oct 7, 2021

Possible fix for #4808 'ONNX Inference Speed extremely slow compare to .pt Model'

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improved ONNX runtime support for GPU in detect.py.

📊 Key Changes

  • Conditional dependency check for onnxruntime-gpu or onnxruntime based on CUDA availability (see the sketch at the end of this summary).

🎯 Purpose & Impact

  • Purpose: Ensures that the appropriate ONNX runtime is used depending on whether the user has a CUDA-enabled GPU or not. This optimizes performance by leveraging GPU acceleration when available.
  • Impact: Users with CUDA GPUs will experience faster inference times due to the GPU-optimized runtime, while users without CUDA GPUs will continue using the standard runtime without additional setup. This change enhances user experience by automating the selection of the most suitable ONNX runtime.
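
For reference, the change amounts to roughly the following conditional requirement check in detect.py (a minimal sketch; check_requirements is the YOLOv5 helper from utils.general, and the exact call site may differ):

```python
import torch

from utils.general import check_requirements  # YOLOv5 helper

# Ask for the GPU build of ONNX Runtime only when PyTorch was built with CUDA;
# otherwise fall back to the plain CPU package.
check_requirements(('onnx', 'onnxruntime-gpu' if torch.has_cuda else 'onnxruntime'))

import onnxruntime  # imported after the requirement check installs the right package
```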

callbarian commented Oct 12, 2021

@glenn-jocher
If you install onnxruntime-gpu while onnxruntime is already installed, the code may end up using onnxruntime, which is the CPU-only build.
Try removing onnxruntime and then installing onnxruntime-gpu.
In other words, the two packages should not be installed at the same time.
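
A quick way to spot the conflicting install from Python (a minimal sketch assuming Python 3.8+; not part of YOLOv5 itself):

```python
from importlib.metadata import PackageNotFoundError, version


def is_installed(distribution: str) -> bool:
    """Return True if the given pip distribution is installed."""
    try:
        version(distribution)
        return True
    except PackageNotFoundError:
        return False


# Both distributions provide the same 'onnxruntime' import package, so having
# both installed can leave the CPU build on the import path.
if is_installed('onnxruntime') and is_installed('onnxruntime-gpu'):
    print('Both onnxruntime and onnxruntime-gpu are installed; '
          'uninstall onnxruntime so the GPU build is actually used.')
```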

glenn-jocher (Member, Author)

@callbarian thanks for the pointer! I believe the current check_requirements() will only uninstall existing packages of the same name, though I'm not sure. I guess ideally onnxruntime-gpu would uninstall onnxruntime on install, though that would have to be a feature/issue for the ONNX repo.

I tested this in Colab, which had neither package installed by default. The -gpu install works, but inference still does not seem to use the GPU, as speeds are no better than with the plain onnxruntime install.

callbarian commented Oct 13, 2021

@glenn-jocher I am not sure about the Colab environment. Is there any possibility that Colab already has onnxruntime installed by default?

Usually, if onnxruntime-gpu cannot find CUDA or cuDNN, it will complain about the missing .so files. I don't think it automatically falls back to CPU mode.

Here's what I did.
I ran on an RTX 3090 with CUDA 11.2 and cuDNN 8.2.1 (installed from conda-forge, because 'conda install' and 'conda install -c conda-forge' pull different cuDNN versions).

Both CUDA and cuDNN were installed through conda.

python detect.py --weights last.onnx --source images/

The inference time was around 0.04s ~ 0.09s per image.
On CPU it was around 0.17s per image.

It might just be that the Colab GPU is not fast enough.

One way to check is to compare memory usage with nvidia-smi when running with onnxruntime and with onnxruntime-gpu respectively.
I have checked that onnxruntime-gpu uses much more memory during runtime.
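
Another way is to ask onnxruntime itself which execution providers are available and which ones a session will actually use (a minimal sketch using the standard onnxruntime API; 'last.onnx' is just the model from the command above):

```python
import onnxruntime as ort

print(ort.get_device())               # 'GPU' when the CUDA build is installed and usable
print(ort.get_available_providers())  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']

# Since onnxruntime 1.9, providers should be passed explicitly when creating a session.
session = ort.InferenceSession(
    'last.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)
print(session.get_providers())        # providers the session will actually run with
```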

Hope it helps!

glenn-jocher (Member, Author)

@callbarian thanks! Perhaps it's just a driver issue like you mentioned. In any case I'll go ahead and merge this for now until we find a better solution. /rebase

glenn-jocher merged commit b754525 into master Oct 13, 2021
glenn-jocher deleted the detect/onnxruntime-gpu branch October 13, 2021 05:25
glenn-jocher (Member, Author)

@callbarian PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

glenn-jocher self-assigned this Oct 13, 2021
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
* Check `'onnxruntime-gpu' if torch.has_cuda`

* fix indent
Successfully merging this pull request may close these issues: ONNX Inference Speed extremely slow compare to .pt Model (#4808)