
[TODO] Implement a process to reduce accuracy degradation due to transposition errors in the Transformer's MatMul input values. #317

Closed
PINTO0309 opened this issue Apr 16, 2023 · 3 comments
Labels: OP:MatMul, TODO, Transformer

Comments

PINTO0309 (Owner) commented Apr 16, 2023

Issue Type: Others
onnx2tf version number: 1.10.x
onnx version number: 1.13.1
tensorflow version number: 2.12.0
Download URL for ONNX: N/A
Parameter Replacement JSON: N/A

Description

Implement a process to reduce accuracy degradation due to transposition errors in the Transformer's MatMul input values.

1. Issue

  1. A process for automatically correcting transposition errors has already been implemented internally.
  2. When the auto-correction feature is enabled, the tool retains the ONNX inference results for the entire model as numpy.ndarray objects. These arrays are very RAM intensive, so out-of-memory errors frequently occur when converting large models.
  3. In addition, the ONNX and TensorFlow output values are compared for every OP to verify the correctness of the conversion, which significantly slows down model conversion.
  4. Frequent accuracy degradation occurs only when the sizes of all dimensions except the batch size are the same, as in [1,256,256] in the figures below (a minimal demonstration follows the figures).
    (Figures: screenshots illustrating the [1, 256, 256] MatMul input case described above)
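A minimal numpy illustration of why this case is hard to catch (the shapes and values here are made up for demonstration): when all non-batch dimensions are equal, a wrongly transposed input has exactly the same shape as the correct one, so shape-based checks pass while the values diverge.

```python
import numpy as np

# When all non-batch dimensions are equal (e.g. [1, 256, 256]),
# transposing an input does not change its shape, so a shape check
# cannot detect the error -- only the values reveal it.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 256, 256)).astype(np.float32)
w = rng.standard_normal((1, 256, 256)).astype(np.float32)

correct = np.matmul(x, w)                   # intended MatMul
wrong = np.matmul(x.transpose(0, 2, 1), w)  # transposed first input

print(correct.shape == wrong.shape)         # True: shapes are identical
print(np.abs(correct - wrong).max())        # large: values diverge
```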

2. Idea

  1. Instead of keeping the output values of every ONNX OP, keep the inference results only for MatMul, greatly reducing RAM consumption.
  2. When generating a TensorFlow MatMul or BatchMatMul, always check consistency with the output value of the corresponding ONNX MatMul; if a large difference occurs, automatically transpose the input tensor in a brute-force fashion to find the arrangement with the smallest error (see the first sketch after this list).
  3. When validating the ONNX model, keep not only the output tensor of each MatMul but also its input tensors, so that they can be reused as input tensors when validating the TensorFlow OP.
  4. Dummy inference is performed immediately after the tool starts; instead of registering every OP in the model as a graph output, register only the MatMul OPs (see the second sketch after this list).
  5. During dummy inference, keep only the input and output tensors of the MatMul OPs internally.
  6. Consider persisting the retained tensors to an external file, depending on the total number of MatMul OPs and the size of each tensor.
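A minimal sketch of the brute-force search in idea 2 — the function name and signature are hypothetical, not onnx2tf's actual implementation: try each axis permutation of the suspect input, run the MatMul, and keep the permutation whose result is closest to the ONNX reference output.

```python
import itertools
import numpy as np

def find_best_transpose(lhs: np.ndarray,
                        rhs: np.ndarray,
                        onnx_output: np.ndarray):
    """Hypothetical helper: return the axis permutation of `lhs` whose
    MatMul result against `rhs` is closest to the ONNX reference."""
    best_perm, best_err = None, np.inf
    for perm in itertools.permutations(range(lhs.ndim)):
        candidate = lhs.transpose(perm)
        # Skip permutations whose inner dimensions no longer line up.
        if candidate.shape[-1] != rhs.shape[-2]:
            continue
        out = np.matmul(candidate, rhs)
        if out.shape != onnx_output.shape:
            continue
        err = float(np.abs(out - onnx_output).max())
        if err < best_err:
            best_perm, best_err = perm, err
    return best_perm, best_err
```

Because the number of permutations grows factorially with rank, this stays cheap for the 3-D and 4-D tensors typical of Transformer MatMuls.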
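And a sketch of ideas 4–6, assuming onnx/onnxruntime and float32 inputs: promote only the MatMul outputs to graph outputs, run one dummy inference, and spill the captured tensors to a compressed .npz file when they are too large to hold in RAM. The file names and the zero-filled dummy feed are placeholders.

```python
import numpy as np
import onnx
import onnxruntime as ort

model = onnx.load("model.onnx")  # placeholder path

# Collect the output tensor names of every MatMul node and promote
# them to graph outputs so onnxruntime will return them.
matmul_outputs = [out for node in model.graph.node
                  if node.op_type == "MatMul" for out in node.output]
model.graph.output.extend(onnx.ValueInfoProto(name=n)
                          for n in matmul_outputs)

sess = ort.InferenceSession(model.SerializeToString())

# Dummy inference: zero-filled inputs, symbolic dims replaced by 1.
feed = {i.name: np.zeros([d if isinstance(d, int) else 1
                          for d in i.shape], dtype=np.float32)
        for i in sess.get_inputs()}
results = sess.run(matmul_outputs, feed)

# Idea 6: persist the retained tensors to an external file instead of
# keeping them all in RAM. The MatMul input tensors (idea 3) could be
# promoted and saved the same way.
np.savez_compressed("matmul_tensors.npz",
                    **dict(zip(matmul_outputs, results)))
```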

3. Related issue

PINTO0309 added the OP:MatMul, TODO, Transformer, and Need Help labels and removed the Need Help label on Apr 16, 2023
On-JungWoan (Contributor) commented:

This is an off-topic personal question, but is there a reason to store the outputs of all ONNX OPs? In my opinion, it would be more efficient to compare the layer outputs of ONNX and TensorFlow sequentially and store only the current result rather than all previous ones. With this method, I think there would be no memory-shortage issues even with large models.

PINTO0309 (Owner, Author) commented:

Thank you. Actually, I already tried the idea you suggested about two months ago. At that time, all OPs other than MatMul had to be included in the verification process.

The reason is that my tools have always been incomplete and unfinished, with various bugs inherent in them.

In other words, successful verification of a local tensor is predicated on the assumption that all OPs preceding the one being verified are bug-free.

At the moment, I am troubled by the fact that this tool cannot address the virtually infinite number of model transformation patterns, so local verification alone often does not work.

I do most of the bug fixing on my own, but there are too many patterns and not enough time.

On-JungWoan (Contributor) commented:

Ah, thank you for all your hard work. I have been following you since you created openvino2tensorflow, and I have always had great respect for you. If there is anything I can do to help, I am more than happy to lend a hand. Thank you again.
