Concat argument element types are inconsistent #4745
@mvafin, please, take a look.
@Iffa-Meah @lazarevevgeny @mvafin I have created a minimal PyTorch/ONNX model that will allow you to reproduce the bug without relying on my more complex custom model. Please see the attached ONNX file (or use the script below to generate it):

```python
import numpy as np
import onnx
import onnxruntime
import torch
import torch.onnx
import torch.nn as nn


def to_numpy(tensor):
    """Helper function recommended in official PyTorch ONNX tutorial
    See details below
    https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
    """
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()


class ConcatModel(nn.Module):

    def __init__(self, vocab_size=10, seq_len=3):
        """Setup random character probabilities
        Do this only once in the init so that calling the model multiple
        times (whether from PyTorch or ONNX) will produce identical results.
        :param int vocab_size: Number of unique character possibilities
        :param int seq_len: Maximum number of characters in the sequence
        """
        super(ConcatModel, self).__init__()
        self.seq_len = seq_len
        self.char_probs = [torch.randn(vocab_size) for _ in range(seq_len)]

    def forward(self, sos_token):
        """Simplified generation for sequence of character IDs
        :param torch.Tensor sos_token: Index representing start-of-sequence
            token. Must be a tensor because JIT trace for the ONNX export
            doesn't support int inputs. Expected shape ``(1, )``
        :return torch.LongTensor seq: Complete sequence of character indices
        """
        seq = sos_token
        for i in range(self.seq_len):
            new_idx = torch.argmax(self.char_probs[i]).unsqueeze(0)
            seq = torch.cat([seq, new_idx], dim=0)
        return seq


if __name__ == '__main__':

    # Set constants
    torch.manual_seed(1234)
    sos_token = torch.tensor([-1])
    model_path = 'concat_model.onnx'

    # Run PyTorch inference and export to ONNX
    model = ConcatModel()
    model.eval()
    torch_out = model(sos_token)
    torch.onnx.export(
        model=model,
        args=sos_token,
        f=model_path,
        opset_version=9,
        do_constant_folding=True,
        input_names=['sos_token'],
        output_names=['seq'])

    # Load/check the ONNX model and run through Python API
    onnx_model = onnx.load(model_path)
    onnx.checker.check_model(onnx_model)
    ort_session = onnxruntime.InferenceSession(model_path)
    ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(sos_token)}
    ort_outs = ort_session.run(None, ort_inputs)
    np.testing.assert_allclose(to_numpy(torch_out), ort_outs[0], rtol=1e-03, atol=1e-05)
```

After downloading/producing the ONNX file, convert it to OpenVINO.
Then reproduce the error when trying to load the IR into the inference engine:

```python
from openvino.inference_engine import IECore, IENetwork

ie = IECore()
net = ie.read_network(model='concat_model.xml', weights='concat_model.bin')
```

The error traceback is very similar to my custom model: one of the two concat inputs is int32 while the other is int64. In the ONNX graph, everything is int64 throughout, but in the XML the input coming from the Unsqueeze ends up as int32.
With this modification, the inference engine is able to load the resulting XML without any error (as desired). However, I still believe this is a bug, because the ONNX model explicitly defines these tensors as int64.
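For reference, the ONNX-side dtype claim above can be double-checked directly with the onnx Python API. This is only a sketch; it assumes the standard shape-inference pass is enough to populate the intermediate value_info entries:

```python
import onnx
from onnx import shape_inference

# Run shape/type inference so intermediate tensors show up in value_info
model = shape_inference.infer_shapes(onnx.load('concat_model.onnx'))

# Print the element type of every graph input, output, and intermediate tensor
for value in list(model.graph.input) + list(model.graph.output) + list(model.graph.value_info):
    elem_type = value.type.tensor_type.elem_type
    print(value.name, onnx.TensorProto.DataType.Name(elem_type))
```

If the tensors around the Concat all come back as INT64 here, that supports the point that the int32 is introduced on the OpenVINO side rather than in the exported ONNX graph.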
I believe this "Argument element types are inconsistent" error happens because you didn't explicitly specify your scale values and mean values during the ONNX to IR conversion. You may take this as an example. A simple import and the creation of an Executable Network object also works for his converted model.
My full model is designed to include the image scaling and mean values inside the ONNX graph, so adding those as CLI flags would duplicate the transformation. I tested your suggestion anyway, and it does not make a difference for my full model; the error is still the same. In my full model, the concat nodes occur many operations into the graph, which is why they are not affected by providing the mean/scale values options. We need to understand why the unsqueeze node (which is explicitly int64 in the ONNX graph) gets converted to int32 by OpenVINO. My guess is that the model optimizer has a hard-coded preference for int32 somewhere which overrides the int64 definition provided by ONNX.
To further troubleshoot my full model, I added an explicit Cast layer in the ONNX graph prior to the troublesome concat. I expected this to resolve any uncertainty about the incoming concat dtype; however, OpenVINO's model optimizer removes my Cast node and leaves the concat inputs' dtypes unchanged. The error from the inference engine is still the same (I believe the node names only changed because of my ONNX modifications to add a Cast node).
Can anyone explain why my cast operation is being ignored? Regardless of what I try, OpenVINO is not respecting the dtypes that are specified (very explicitly) in my ONNX graph.
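For context, splicing such a Cast in front of a Concat can be done with the onnx helper API along these lines. This is a sketch; the tensor and node names are hypothetical placeholders, not the real names from the model above:

```python
import onnx
from onnx import TensorProto, helper

model = onnx.load('model.onnx')
graph = model.graph

# Hypothetical name of the tensor feeding the problematic Concat
src, casted = 'unsqueeze_out', 'unsqueeze_out_i64'
cast_node = helper.make_node('Cast', inputs=[src], outputs=[casted],
                             to=TensorProto.INT64, name='Cast_before_concat')

# Rewire the Concat to consume the cast output and insert the Cast just before it
for i, node in enumerate(graph.node):
    if node.op_type == 'Concat' and src in node.input:
        node.input[list(node.input).index(src)] = casted
        graph.node.insert(i, cast_node)
        break

onnx.checker.check_model(model)
onnx.save(model, 'model_with_cast.onnx')
```

As the next comment explains, MO removes data-path casts to int64 anyway, so a change like this does not survive conversion.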
@addisonklinke , not all OpenVINO plugins natively support int64 so when we have operations producing int64 values on a data path (not ShapeOf sub-graphs which are const-folded before passing the model to the plugin) MO converts them to int32. This is why the explicit cast to int64 is removed from the model. But when you explicitly specified the input type using "--input sos_token[1]{i64}" parameter you override the default behaviour of the MO and the IR is generated with Parameter of type int64. However, this IR will not work for some OpenVINO plugins because they don't support int64 natively. |
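If it helps, the effect of that override can be verified from Python by inspecting the precisions the IR reports after read_network. A sketch against the 2021.x inference engine API (the attribute names are assumed from that API generation):

```python
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model='concat_model.xml', weights='concat_model.bin')

# Input precision should be I64 when the IR was generated with --input "sos_token[1]{i64}",
# and I32 with the default MO behaviour
for name, info in net.input_info.items():
    print('input ', name, info.precision)
for name, data in net.outputs.items():
    print('output', name, data.precision)
```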
Thank you for that clarification. Is there a way I can override the default behavior for non-input nodes and generate an IR with type int64 (even if it involves modifying the MO source code)? The target hardware for this model is the Movidius VPU; if int64 is not supported by that device, would the inference engine be able to fall back to CPU execution for certain nodes? In terms of precision, I have no problem running all the model nodes on int32. The issue is that PyTorch's ...
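For reference, the kind of fallback being asked about is what the HETERO plugin is intended to provide: it splits the network so that layers unsupported on the primary device run on a secondary one. A sketch (whether it actually helps for int64 layers on MYRIAD is exactly the open question here):

```python
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model='concat_model.xml', weights='concat_model.bin')

# Try MYRIAD first and fall back to CPU for layers the VPU plugin cannot handle
exec_net = ie.load_network(network=net, device_name='HETERO:MYRIAD,CPU', num_requests=1)
```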
Looks like you need to comment out the following lines: https://github.com/openvinotoolkit/openvino/blob/master/model-optimizer/extensions/front/ChangePlaceholderTypes.py#L52-L55 Unfortunately, I cannot answer whether the fallback to CPU will work in this case.
Shortly after my comment yesterday, PyTorch clarified that the int32 issue has been resolved on their end in the latest 1.8.0 release. By upgrading PyTorch and tweaking my model, I was able to produce an IR that is loaded by the OpenVINO inference engine without any issue on CPU.
After seeing this issue play out, I believe the previous error when using int64 should be raised by the model optimizer rather than the inference engine. From my perspective, there is no point in producing an IR that is known to fail with the inference engine, so raising the error earlier in the process would be more intuitive for end users. Of course, maybe the OpenVINO developers disagree with me, but this is my suggestion 🙂
As a final step, I am working to load and run the full model on Movidius VPU, but the same ...
If I add ...
@addisonklinke, I agree that the MO should not generate the IR in case of a type mismatch, but unfortunately we cannot do this for now. @taka-no-me, @gladilov-gleb, could you take a look at the VPU issue? The CPU works fine for the model.
@taka-no-me @gladilov-gleb I have created a minimal reproducible example for the VPU optimization error. It seems to occur whenever the graph uses a constant to initialize a sequence of operations. Please use my Python script below to produce two ONNX variants:

```python
import torch
import torch.onnx
import torch.nn as nn


class Simple(nn.Module):

    def __init__(self, img_size):
        super(Simple, self).__init__()
        self.img_size = img_size
        self.conv = nn.Conv2d(3, 16, kernel_size=3)

    def forward(self, a, b=None):
        if b is None:
            b = torch.ones(1, 3, self.img_size, self.img_size)
        feat_b = self.conv(b)
        feat_a = self.conv(a)
        return feat_a + feat_b


if __name__ == '__main__':
    img_size = 24
    model = Simple(img_size)
    model.eval()
    template = torch.randn(1, 3, img_size, img_size)
    implicit_b = template
    explicit_b = (template, template)
    torch.onnx.export(model, implicit_b, 'implicit.onnx')
    torch.onnx.export(model, explicit_b, 'explicit.onnx')
```

Then convert both to the OpenVINO IR with the Model Optimizer and load them with the test script below:

```python
from argparse import ArgumentParser
from openvino.inference_engine import IECore
parser = ArgumentParser(description='Test VPU inference with OpenVino IR files')
parser.add_argument('-m', '--model', type=str, required=True, help='common path prefix of IR files')
args = parser.parse_args()
print(f'Loading {args.model}')
model_xml = args.model + '.xml'
model_bin = args.model + '.bin'
ie = IECore()
net = ie.read_network(model=model_xml, weights=model_bin, init_from_buffer=False)
print('Read network')
exec_net = ie.load_network(network=net, num_requests=1, device_name='MYRIAD')
print('Loaded network to VPU')
```

You should see:

```
$ python3 test.py -m explicit
Loading explicit
Read network
Loaded network to VPU

$ python3 test.py -m implicit
Loading implicit
Read network
Traceback (most recent call last):
  File "minimal.py", line 15, in <module>
    exec_net = ie.load_network(network=net, num_requests=1, device_name='MYRIAD')
  File "ie_api.pyx", line 306, in openvino.inference_engine.ie_api.IECore.load_network
  File "ie_api.pyx", line 315, in openvino.inference_engine.ie_api.IECore.load_network
RuntimeError: duplicateData error: while duplicating Constant_0/Output_0/Data__const
Const data got different desc and content byte sizes (4608 and 3456 respectively)
```
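As a rough sanity check on those byte counts (the FP16 interpretation below is an assumption, since MYRIAD typically runs FP16; it is not taken from the OpenVINO sources):

```python
import numpy as np

const_shape = (1, 3, 24, 24)   # the torch.ones default for b when img_size = 24
n = int(np.prod(const_shape))  # 1728 elements
print(n * 2)                   # 3456 -> matches the reported content size if stored as FP16
print(n * 4)                   # 6912 -> the FP32 size, which matches neither reported number
```

So the "content" size corresponds to the implicit constant in half precision, while the expected descriptor size of 4608 does not match either precision of that tensor.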
I also tried registering a PyTorch buffer as the default value for `b`:

```python
class Simple(nn.Module):

    def __init__(self, img_size):
        super(Simple, self).__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3)
        self.register_buffer('default_b', torch.ones(1, 3, img_size, img_size))

    def forward(self, a, b=None):
        if b is None:
            b = self.default_b
        feat_b = self.conv(b)
        feat_a = self.conv(a)
        return feat_a + feat_b
```
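For completeness, exporting this buffer-based variant would follow the same pattern as the script above (a sketch; the output file name is arbitrary):

```python
img_size = 24
model = Simple(img_size)
model.eval()
template = torch.randn(1, 3, img_size, img_size)
torch.onnx.export(model, template, 'implicit_buffer.onnx')  # b comes from the registered buffer
```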
@Maxim-Doronin please take a look.
@Maxim-Doronin any updates on this?
@addisonklinke, we've added this issue into our sprint. We will look at this as soon as possible. Ref. 51088
@Maxim-Doronin Do we have any progress on this issue?
@Maxim-Doronin I see that OpenVINO has had two new releases since the start of this issue. Does either of them have the bug fix from the sprint you mentioned which would address my VPU issue? If so, I can test and close this issue if everything looks good.
EDIT: tested in an Ubuntu 18.04 Docker container with model optimizer 2021.4.582 and inference engine 2021.4.0-3839-cd81789d294 (from PyPI), and still getting the same error.
Apologies for the delay in our response; the proper fix is not included in these two releases. It is currently planned for a future release, but I cannot comment on the timing. As a workaround, you will need to specify the input as you mentioned above. Regards,
The bug has been fixed by this PR: #7630
System information (version)
Detailed description
The model optimizer successfully converts my ONNX model to the OpenVINO IR representation (.xml and .bin) - there are no errors or warnings. However, the Python binding for the inference engine fails to even load the network. Since the IR files are produced without issue, I expect the network to load without error at a minimum.
Steps to reproduce
UPDATE March 16: I added code in a later comment to generate a very basic (5 node) ONNX graph that will reproduce the error
python3 mo.py --input_model model.onnx --progress
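The loading step is presumably the same snippet shown in the comments above, repeated here with the IR file names that mo.py would produce from model.onnx (assumed to be model.xml / model.bin):

```python
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model='model.xml', weights='model.bin')
```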
The last line fails with an "Argument element types are inconsistent" error.

The traceback indicates that the two input nodes of `Concat_470` have inconsistent dtypes int32 and int64. However, if I open `model.xml` with Netron, this is not true. The outputs of both nodes `469/Unsqueeze` and `413[0]` (shown below) are consistently int64 as expected. The dtype agreement is also correct in the original ONNX graph. It appears that only the OpenVINO inference engine is somehow misinterpreting the dtypes.

The only thing odd in the XML graph is that the concat node causing the error lists its input types as `?`. Note input `43:2` comes from `413/Multiply[0]` and input `574:2` comes from `469/Unsqueeze`. This leads me to believe that the inference engine is inferring the int32 dtype that causes the error rather than knowing it for sure. If that is the root issue, I am unclear why the IR is generated with unknown (`?`) dtypes instead of raising an error on the conversion from ONNX.

Issue submission checklist