
Loading IR: Primitive descriptor was not found for node #157

Closed
el1995 opened this issue May 20, 2019 · 19 comments

el1995 commented May 20, 2019

Hello everyone,

I have converted my own TensorFlow frozen graph to the intermediate representation (.xml and .bin). Unfortunately, my code is unable to load it as an executable network; it fails at this call:

ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});

The error message reads:

terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
what(): Primitive descriptor was not found for node dense_1/MatMul.

I have attached a zip containing my .xml, .bin, and .pb files. I would be very grateful for support, as I have not been able to fix this issue for several days. Thanks!

PrimitiveDescriptorError.zip

@shubha-ramani

Dear @el1995

There is a known issue with MatMul. Sorry about the inconvenience. Could you kindly check out GitHub issue #134? I reproduced that customer's bug and filed a bug ticket.

Thanks!

Shubha

el1995 commented May 21, 2019

Hello Shubha,

I just realized that you are also the Intel expert from the forum; we have already been discussing my issue here:
https://software.intel.com/en-us/forums/computer-vision/topic/809542#comment-1939213
May I suggest we continue in this thread?

I checked GitHub issue #134; however, I get completely different error messages. Still, both problems obviously relate to MatMul. As we strongly depend on OpenVINO, I have two further questions for now:

  • Can you still try to reproduce my bug, to make sure it is not my fault?
  • Assuming that the TensorFlow MatMul causes the issue: would it be useful to convert our Keras .h5 file into ONNX format instead of a frozen graph, and run the model optimization on the ONNX file? Or do you expect the same issue with that procedure? (A sketch of the ONNX route follows below.)
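
For illustration, the ONNX route from the second question would presumably look something like this, using the keras2onnx package; the file names are placeholders and this is untested for our models:

import onnx
import keras2onnx
from keras.models import load_model

# Load the trained Keras model and convert its graph to ONNX.
keras_model = load_model('FPV.h5')
onnx_model = keras2onnx.convert_keras(keras_model, keras_model.name)

# Persist the ONNX model so the Model Optimizer can consume it.
onnx.save(onnx_model, 'FPV.onnx')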

Best regards and thank you,

Elias

Edit: I am able to load the intermediate representation of Intel's alexnet_fp32.xml for classification. It also contains FullyConnected layers, and if I understand correctly, the Model Optimizer converts TensorFlow's MatMul to FullyConnected. So it looks like the conversion is the problem?

I tried the same with two other models and got exactly the same error for all three. Please find attached zip files containing all three models, each with its .pb, .xml, .bin, and .mapping. For model 3 the .pb is missing because the file size was too large.

Error for model 1:
terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
what(): Primitive descriptor was not found for node dense_1/MatMul.

Error for model 2:
terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
what(): Primitive descriptor was not found for node dense_1/MatMul.

Error for model 3:
terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
what(): Primitive descriptor was not found for node dense_1_7/MatMul.

Model1.zip
Model2.zip
Model3_without_pb.zip

Moreover, I should point out that we generate our frozen graphs from the Keras .h5 format. I also attach a typical .h5 file that we use.
FPV.h5.zip

@shubha-ramani

Dear @el1995
Absolutely. I will be happy to reproduce your problem; thank you for your million attached zip files (ha ha). Now I have to ask: can I just run the classification sample for inference? You mentioned classification above, so I am asking.

Thanks for your patience. I will post my findings here.

Sincerely,

Shubha


el1995 commented May 22, 2019

Hello Shubha,

I hope they help; I tried plenty of them, so I thought it'd be useful to provide them ;-)

The task I want to perform later on has nothing to do with classification; we are not even in the field of image recognition. I want to run a residual network with 3 input values (floats) and get 15 output values (also floats); we use it within a fluid dynamics simulation. I only mentioned classification because I noticed that alexnet_fp32.xml has FullyConnected layers, and to my understanding MatMul nodes from a frozen graph should be converted to these. However, generating an executable network worked there (the code I used can be found in ~/intel/openvino/inference_engine/samples/hello_autoresize_classification).

Another example I tried (I may have mentioned it): I found a frozen graph called "googlenet-v3.frozen.pb" here:
~/intel/openvino/deployment_tools/tools/model_downloader/classification/googlenet/v3/tf
It's also a classification example, but that's coincidence; it was just to try the conversion. Here I was able to do the conversion to IR (.xml, .bin) and also to load the IR into the Inference Engine. That means I am able to get from a frozen graph to an executable network via IR, but "googlenet-v3.frozen.pb" does not seem to have MatMul layers. So this once again points to a problem with MatMul, right?

I will further work on the problem and keep you updated.

Greets,

Elias


shubha-ramani commented May 28, 2019

Dear @el1995,
I started with your Keras model itself, found in your FPV.h5. The reason is that if I can reproduce your error starting from the root model, then it could be a real bug. To do this, I first have to convert the Keras model to a frozen .pb. For that I used logic similar to the following:

import tensorflow as tf
from keras import backend as K
from keras.models import load_model
from tensorflow.python.framework import graph_util, graph_io

def export_keras_to_tf(input_model, output_model, num_output):
    print('Loading Keras model: ', input_model)
    keras_model = load_model(input_model)
    keras_model.summary()  # prints the layer table

    # Give every model output a stable, predictable node name.
    predictions = [None] * num_output
    prediction_node_names = [None] * num_output
    for i in range(num_output):
        prediction_node_names[i] = 'output_node' + str(i)
        predictions[i] = tf.identity(keras_model.outputs[i], name=prediction_node_names[i])

    # Freeze the graph: fold variables into constants, strip training-only nodes.
    sess = K.get_session()
    constant_graph = graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), prediction_node_names)
    infer_graph = graph_util.remove_training_nodes(constant_graph)

    graph_io.write_graph(infer_graph, '.', output_model, as_text=False)
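
For your FPV model, which has a single output tensor, I invoked this along the lines of (the output file name is illustrative):

export_keras_to_tf('FPV.h5', 'frozen_pb', 1)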

When I do this, I get the following exception, which makes me suspect that your model has an issue.

File "C:\Users\sdramani\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\utils\generic_utils.py", line 165, in deserialize_keras_object
':' + function_name)
ValueError: Unknown metric function:coeff_r2

I didn't try your other models, because until I can convert a Keras model to a frozen .pb, it's pointless to do so. I am using the latest version of Keras, which is 2.2.4.

I am wondering how you converted this model to a frozen .pb. You must have succeeded, since you were able to generate IR.

Looking forward to hearing your response. Thanks!

Shubha


el1995 commented May 29, 2019

Hi Shubha,

thanks for your reply. I am sorry, I forgot to mention our custom coeff_r2 function. I have attached our Python script for the conversion (zipped, as GitHub does not accept .py), which was written by a colleague. The conversion is done from the command line:
python3 k2tf.py --input_model=inputname.H5 --output_model=outputname.pb
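
For reference, the "Unknown metric function: coeff_r2" error occurs because load_model tries to deserialize the metric by name. A minimal sketch of how such a custom metric can be handled when loading; the coeff_r2 body shown here is only illustrative, the real definition is in k2tf.py:

import keras.backend as K
from keras.models import load_model

def coeff_r2(y_true, y_pred):
    # Illustrative R^2-style metric; the real definition lives in k2tf.py.
    ss_res = K.sum(K.square(y_true - y_pred))
    ss_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - ss_res / (ss_tot + K.epsilon())

# Either register the custom metric by name when loading...
model = load_model('inputname.H5', custom_objects={'coeff_r2': coeff_r2})

# ...or skip compilation entirely; freezing the graph does not need the metric:
model = load_model('inputname.H5', compile=False)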

Please let me know if there is something additional I should provide.
k2tf.zip
currentH5file.zip

Best regards,

Elias

Edit: as I just generated a new .h5 file, I decided to also attach it so that you have a comparison; both files (the old one and this one) should be nearly identical.


shubha-ramani commented May 31, 2019

Dear @el1995
Thank you! I will try your newly zipped and attached packages. *.zip files are actually best, rather than individual files.

Thanks,
Shubha


shubha-ramani commented Jun 5, 2019

Dear @el1995,

Thanks for your patience, and I'm really truly sorry that it's taken me so long to get back to you. I was finally able to run your model and infer (using classification_sample.exe). Even with a version later than 2019 R1.1 I am getting an error, but it has nothing whatsoever to do with MatMul. The issue is the command I'm using to build the IR:

python "c:\Program Files (x86)\IntelSWTools\openvino_2019.2.162\deployment_tools\model_optimizer\mo_tf.py" --input_model frozen_pb --input_shape [1,3]

Note that the input shape has only two dimensions: batch size and number of channels. What are the proper height and width required for your model input? Visualizing your Keras model in Netron doesn't give me any clues either. The IR does get converted successfully, but I get an error when I run classification_sample.exe, an error which in fact makes perfect sense:

C:\Users\sdramani\Documents\Intel\OpenVINO\inference_engine_samples_build\intel64\Release>classification_sample.exe -i c:\users\sdramani\Downloads\pics\horse.bmp -m c:\users\sdramani\Downloads\github\frozen_pb.xml
[ INFO ] InferenceEngine:
API version ............ 1.6
Build .................. 24976
Description ....... API
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ] c:\users\sdramani\Downloads\pics\horse.bmp
[ INFO ] Creating Inference Engine
CPU
MKLDNNPlugin version ......... 1.6
Build ........... 24976

[ INFO ] Loading network files:
c:\users\sdramani\Downloads\github\frozen_pb.xml
c:\users\sdramani\Downloads\github\frozen_pb.bin
[ INFO ] Preparing input blobs
[ ERROR ] Size of dims(2) and format(NCHW) are inconsistent.

What the error is telling you is that the --input_shape in the IR is just [1,3], which is not the layout Inference Engine expects (NCHW). There is no way for Inference Engine to guess H or W; you must provide them somehow. I tried guessing a few HxW values based on the model image in Netron, but nothing seemed to work; Model Optimizer complained that the --input_shape passed in was invalid.

Anyway, I hope this helps.

Thanks for using OpenVINO!

Shubha


el1995 commented Jun 6, 2019

Hello Shubha,

no problem, thanks for your support.

As mentioned above, I do not want to do image recognition; instead we use the network within a fluid dynamics simulation. So, assuming for now that the batch size is 1, we have an input tensor of shape [1,3] and an output tensor of shape [1,15]. The 3 values passed to the network are physical quantities (e.g. the so-called progress variable that describes combustion processes), and the values obtained from the inference are also physical quantities (e.g. density of the fluid, temperature, concentration of CO2, ...). So for now I just want to get the code running with [1,3] as input shape and [1,15] as output shape. Later on, however, we will drastically increase the batch size.

I am not getting your error, because I set the layout to HW instead of NCHW. To my understanding that should be fine, and the Inference Engine also accepts it without complaint. However, it crashes at line 75 of the attached file, during

ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});

This is where I get the error I am talking about:

terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
what(): Primitive descriptor was not found for node dense_1/MatMul.

It would be great if you could try to run the inference with the code (and the CMake file) I attached and see whether you can get rid of the "primitive descriptor not found" bug.
combustionElias.zip

Another note: as you will see, I do not provide an input image or a hardware type (GPU, CPU, ...) when running my code. The reasons are:

  1. In my application I will not use an image, as explained above; instead I will feed an input vector to OpenVINO from my fluid dynamics software.
  2. I always use CPU, so I set CPU as the default in line 32.

To run my code, I use the following command:
./combustionElias /path/to/model/tf_model_FPV.xml

Greets,
Elias


el1995 commented Jun 14, 2019

Hi Shubha, have you been able to reproduce my issue yet?

@shubha-ramani

Dear @el1995, I ported your code over yesterday. Hopefully I will have something for you today. Thanks for your patience.

Shubha


shubha-ramani commented Jun 17, 2019

Dear @el1995
Thanks for your patience. I finally reproduced your issue and narrowed it down to the following code in mkldnn_plugin.cpp:

return std::make_shared<MKLDNNExecNetwork>(network, conf, extensionManager);

I think it's a bug, and I don't know why this should happen. I will file a bug ticket on your behalf. It seems someone else had a similar issue on the IDZ forum.

I do believe it has to do with MatMul, as I told you earlier.

Thanks,

Shubha


el1995 commented Jun 18, 2019

Hello Shubha,

Great, thank you very much. Good to know you have already found the underlying issue.

As mentioned in my first response in this thread, the issue in the Intel forum was also reported by me; it is the identical issue:

I just realized that you are also the Intel expert from the forum; we have already been discussing my issue here:
https://software.intel.com/en-us/forums/computer-vision/topic/809542#comment-1939213
May I suggest we continue in this thread?

I hope for a quick fix, as we would be very excited to finally use OpenVINO within our project.

Best regards and thank you very much,

Elias

@shubha-ramani

Dear @el1995
I filed a bug, and when I do that the OpenVINO developers take it very seriously and act on it.
Thanks for your patience, and I'm sorry that it took so long!

Shubha

@shubha-ramani

Dear @el1995
Can you comment out this line in your main.cpp, recompile, and try again?

network_reader.getNetwork().setBatchSize(batchSize);

It should be line 45 in the file you gave me.

Thanks!

Shubha


el1995 commented Jun 27, 2019

Hello Shubha,

the error message remains unchanged:

terminate called after throwing an instance of 'InferenceEngine::details::InferenceEngineException'
  what():  Primitive descriptor was not found for node dense_1/MatMul.

@shubha-ramani

Dear @el1995 ,
Yes, you are correct. I'm working with the developers to get it fixed.
Thanks for your patience!

Shubha

@shubha-ramani

Dear @el1995

It looks like a coding bug. Please make the following change to your main.cpp:

//input_info->setLayout(Layout::HW); // http://docs.openvinotoolkit.org/2019_R1.01/ie__common_8h.html#a246d143abc5ca07da8d2cadeeb88fdb8
input_info->setLayout(Layout::NC);
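
For context, here is a minimal sketch of the full loading sequence with the corrected layout, using the same 2019 Inference Engine API as your code (file paths and input values are illustrative, not taken from your project):

#include <inference_engine.hpp>
using namespace InferenceEngine;

// Read the IR produced by the Model Optimizer.
CNNNetReader network_reader;
network_reader.ReadNetwork("tf_model_FPV.xml");
network_reader.ReadWeights("tf_model_FPV.bin");
CNNNetwork network = network_reader.getNetwork();

// The [1,3] input is batch x channels, i.e. layout NC (not HW).
InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
input_info->setPrecision(Precision::FP32);
input_info->setLayout(Layout::NC);

// The [1,15] output stays plain FP32.
DataPtr output_info = network.getOutputsInfo().begin()->second;
output_info->setPrecision(Precision::FP32);

// Load onto the CPU plugin; this is the call that previously threw
// "Primitive descriptor was not found for node dense_1/MatMul".
InferencePlugin plugin = PluginDispatcher().getPluginByDevice("CPU");
ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
InferRequest infer_request = executable_network.CreateInferRequest();

// Feed the 3 physical input values and read back the 15 outputs.
Blob::Ptr input = infer_request.GetBlob(network.getInputsInfo().begin()->first);
float* in = input->buffer().as<float*>();
in[0] = 0.1f; in[1] = 0.2f; in[2] = 0.3f;  // illustrative values

infer_request.Infer();

Blob::Ptr output = infer_request.GetBlob(network.getOutputsInfo().begin()->first);
const float* out = output->buffer().as<float*>();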

When I tested your code, this change worked for me!

Shubha


el1995 commented Jul 1, 2019

Worked perfectly, thank you very much!

I will close this issue and post a link to this thread in the second post of the Intel Developer Zone topic:
https://software.intel.com/en-us/forums/computer-vision/topic/809542

el1995 closed this as completed Jul 1, 2019
redradist pushed a commit to redradist/openvino that referenced this issue Oct 6, 2023
rengolin pushed a commit to rengolin/openvino that referenced this issue Jul 31, 2024