[Bug]: Inference fails if data is moved to GPU #25484
Comments
Converting the tensor to host memory before inference fixes the issue, but throughput is significantly reduced, to the point where the non-optimized model achieves higher throughput.
I have the same issue. Are there any examples available on how to integrate an Intel GPU with OpenVINO code? Apart from the
@guillermoayllon could you share
Hello @Iffa-Intel, thank you for looking into the issue. The model we are using is TorchVision's VGG16: https://pytorch.org/vision/main/models/generated/torchvision.models.vgg16.html One should be able to reproduce the issue by importing the model and using the default weights as indicated in the first code block above:

```python
import torch
from torchvision import models
import openvino as ov
import intel_extension_for_pytorch

# Load VGG16 with its default pretrained weights and convert it to OpenVINO IR
model = models.vgg16(weights="VGG16_Weights.DEFAULT")
model.eval()
ov_model = ov.convert_model(model, input=[128, 3, 224, 224])
```
@guillermoayllon to get that VGG16 model (as you mentioned), intel_extension_for_pytorch is required.
Hello @Iffa-Intel, thank you for pointing that out. The Ubuntu version above is wrong. The version we are using is Ubuntu 22.04.4 LTS (not 20.04). Nevertheless, the throughput issue appears when converting the tensor to host memory before inference, and that is not ideal: we would like to keep the tensor in GPU memory.
Ref. 154510
Hello @guillermoayllon, you can watch the current progress and issue explanation in #27725. The new feature allows creation of OpenVINO GPU Tensors directly from a pointer to a Torch GPU tensor:

```python
image = torch.rand(128, 3, 224, 224)
image = image.to(torch.device("xpu"), memory_format=torch.channels_last)
# Wrap the Torch XPU allocation in an OpenVINO tensor without copying the data
data_ptr = image.detach().data_ptr()
ov_tensor = Tensor(data_ptr, Shape(image.shape), pt_to_ov_type_map[str(image.dtype)])
```

The solution should be available in the next OpenVINO release. Before the PR gets merged, if you would like to stay on your current OpenVINO version, the only thing that can be done is to create the Torch tensor on CPU (instead of XPU) and then pass it to OpenVINO inference. Inference will still be performed on GPU, but that way you'll avoid an additional data copy. It's only a partial fix, since the data will still have to be copied from CPU back to GPU. Once the PR is merged, you can test the solution using our nightly releases.
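In code, the interim workaround amounts to something like the following sketch (assumed names; `compiled_model` stands for the model compiled for the GPU device):

```python
import torch

# Keep the Torch tensor in host memory; the OpenVINO GPU plugin performs the
# host-to-device copy itself, so no explicit .to("xpu") is needed.
image = torch.rand(128, 3, 224, 224)   # created on CPU
result = compiled_model(image)[0]      # inference still runs on the GPU device
```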
Thank you for looking into this |
OpenVINO Version
2024.2.0
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
GPU
Framework
PyTorch
Model used
VGG16
Issue description
Hiya,
My goal is to perform inference on Intel GPU with an openvino model. However, inference fails if I move the input data to GPU before performing inference.
Step-by-step reproduction
First, I convert the VGG16 model to OpenVINO IR format in this way:
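The code block itself did not survive the page export; based on the snippet quoted earlier in this thread, it was presumably along these lines:

```python
import torch
from torchvision import models
import openvino as ov
import intel_extension_for_pytorch  # registers the "xpu" device with PyTorch

# Load TorchVision's VGG16 with its default pretrained weights
model = models.vgg16(weights="VGG16_Weights.DEFAULT")
model.eval()

# Convert to OpenVINO IR with a static input shape of [128, 3, 224, 224]
ov_model = ov.convert_model(model, input=[128, 3, 224, 224])
```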
Then I compile the model:
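That block was also lost in the export; a plausible minimal version, assuming the Intel GPU plugin is targeted via the device name "GPU", is:

```python
import openvino as ov

core = ov.Core()
# Compile the converted IR for the Intel GPU plugin
compiled_model = core.compile_model(ov_model, device_name="GPU")
```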
Finally, I attempt inference in the following way:
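The original snippet is missing here as well; a sketch of the failing pattern described in this report (the batch is moved to the "xpu" device before being handed to OpenVINO) would be:

```python
import torch

for images, labels in dataloader:
    # Moving the batch to the Intel GPU before inference is the step that fails
    images = images.to(torch.device("xpu"), memory_format=torch.channels_last)
    results = compiled_model(images)[0]
```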
To produce the dataloader, I use this function:
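The function itself was not preserved in the export; a hypothetical stand-in that yields correctly shaped batches (torchvision's FakeData is used purely as a placeholder dataset) could look like this:

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def get_dataloader(batch_size=128):
    # Placeholder dataset of random 224x224 RGB images, matching the model input
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ])
    dataset = datasets.FakeData(size=1024, image_size=(3, 224, 224), transform=transform)
    return DataLoader(dataset, batch_size=batch_size, shuffle=False)
```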
Environment:
Device Info: x2 Intel(R) Data Center GPU Max 1100
Relevant log output
Issue submission checklist