Slicing model into parts and separately quantizing each of them. #520

Closed
stupidcucumber opened this issue Sep 28, 2023 · 4 comments
Labels
Quantization

Comments

@stupidcucumber

Issue Type

Others

OS

Linux

onnx2tf version number

1.14.5

onnx version number

1.14.0

onnxruntime version number

1.16.0

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.13.0

Download URL for ONNX

https://drive.google.com/drive/folders/1A1S9_dcW-ZQ5ggnZh4noLan-TqflrhXA?usp=share_link

Parameter Replacement JSON

{}

Description

Could you please clarify how exactly I need to build representative dataset for quantization with non-image data?

As an example, I have a tensor containing all of the features extracted from the already int8-quantized template branch (a.k.a. 'feature_extractor_z'), with the following specification: tensor(shape=[1, 8, 8, 96], dtype=int8). I can dequantize it using the metadata attached to the output (zero_point and scale), but is that necessary? Alternatively, I could use those int8 features as-is and build the whole calibration dataset out of them.
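
For reference, this is roughly how I dequantize those features at the moment (a minimal sketch assuming the standard affine scheme; scale and zero_point would come from the TFLite output details, and the values below are placeholders):

import numpy as np

# Hypothetical int8 feature tensor from the quantized 'feature_extractor_z'
features_int8 = np.zeros([1, 8, 8, 96], dtype=np.int8)

# In practice: scale, zero_point = interpreter.get_output_details()[0]['quantization']
scale, zero_point = 0.05, -3  # placeholder values

# Standard affine dequantization: real = scale * (quantized - zero_point)
features_float = scale * (features_int8.astype(np.float32) - zero_point)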

But for that last approach, I have to provide proper MEAN and STD to the -cind option. I successfully quantized features_z with the [[[0.485, 0.456, 0.406]]] and [[[0.229, 0.224, 0.225]]] arguments, but non-image data is a whole new story! I tried passing [0] and [1], because I don't need any scaling of the data; I only need to pass the outputs of the quantized features_extractor_z to the features_z inputs of the RPN as if they were one quantized part. However, it didn't work out :(

Sorry to bother you, but this problem seems to be harder than I thought and I have already killed two days trying to understand this specific part.

@PINTO0309
Owner

PINTO0309 commented Sep 28, 2023

Have you read and tried the following? Frankly, there is little more advice I can give you. The scale and mean values of the original quantized model do not matter. If you know the art of inferring the type of input data from the structure of the model alone, please let me know. What is a template branch?

https://github.com/PINTO0309/onnx2tf#9-int8-quantization-of-models-with-multiple-inputs-requiring-non-image-data

I can't tell at all, from the structure of the model alone, what the input data for the ONNX you shared means. If you need to normalize the input data, do it; if you don't, don't.

@PINTO0309 added the Quantization label on Sep 28, 2023
@stupidcucumber
Author

If you look closely at the "light_track.onnx" architecture, the template branch is the part starting from "input_z" and including the "Add" operation at the bottom. There is also a "search branch", which works the same way but for "input_x".

[screenshot: where the model needs to be cut]

The input data for "features_extractor_x.onnx" and "features_extractor_z.onnx" are images (of size (256, 256) and (128, 128) respectively). After running those two models, I get the features extracted from "image_x" and "image_z" respectively. Since I have already quantized those models, their outputs are int8, and that's fine.

The thing is, I'm just lost about what MEAN and STD really do under the hood, and why it is necessary to include them even for non-image data. Also, what dimensions should I provide for an input of shape [1, 8, 8, 96]? I tried using an ellipsis, "[[[1,...,1]]]", but I get the error "float() cannot take ellipsis".

My problem is very similar to issue #222, but I have a hard time understanding how to pass MEAN and STD. What happens internally if I pass MEAN [1] and STD [0] for an input of shape [1, 8, 8, 96]?
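
For completeness, here is roughly what I have in mind for building that calibration set from the extractor outputs (just a sketch; the file names, the data loader, and the interpreter usage are my own placeholders, not something prescribed by onnx2tf):

import numpy as np
import tensorflow as tf

# Run the already-quantized template-branch extractor on sample crops and collect its outputs
interpreter = tf.lite.Interpreter(model_path="features_extractor_z_int8.tflite")  # hypothetical file name
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

samples = []
for image in load_sample_crops():  # placeholder for my own (128, 128) image loader
    # 'image' must already match the input shape/dtype the model expects
    interpreter.set_tensor(inp['index'], image)
    interpreter.invoke()
    q = interpreter.get_tensor(out['index'])                      # int8, shape [1, 8, 8, 96]
    scale, zero_point = out['quantization']
    samples.append(scale * (q.astype(np.float32) - zero_point))   # dequantize to float

# Stack into one array that could be saved and fed to calibration (e.g. via -cind)
np.save("features_z_calib.npy", np.concatenate(samples, axis=0))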

@PINTO0309
Owner

PINTO0309 commented Sep 30, 2023

I need to cut this here

If you have read the README seriously and still don't know how to do this, please tell me what is difficult to understand.

https://github.com/PINTO0309/onnx2tf#2-run-test


onnx2tf \
-i light_track.onnx \
-onimc /features/blocks/blocks.4/blocks.4.2/Add_output_0 \
-oiqt \
-qt per-tensor


The above sample was converted anyway just to see what you really want to do; it does not use the -cind option. It simply cuts the model at the specified position and quantizes it at the same time.


I still don't understand what you are really having trouble with. For quantization, the input data for calibration should be normalized to the range -1.0 to 1.0 or 0.0 to 1.0.

# input_data: Image or otherwise
# mean: Average of all input data
# std: Standard deviation

# For Image, 0.0-1.0, RGB: 0-255
input_data = input_data / 255.0

# For Non-Image, 0.0-1.0, Features to be re-entered for tracking: ???-???
input_data = input_data / {Number of possible values for input data}

calibration_data = (input_data - mean) / std

If MEAN and STD are not needed, simply specify zero for MEAN and one for STD.

calibration_data = \
    (input_data - np.zeros([1,2,3,4], dtype=np.float32)) / np.ones([1,2,3,4], dtype=np.float32)

It simply broadcasts the values specified by the user and performs the subtraction and division. If you want to specify one million 1.0s, just write [1.0]. The README only abbreviates the list because it would be silly to write out 64 values, but if you actually wanted 64 different values, you would of course need to list all 64 of them.
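
As a quick sanity check of that broadcasting behavior (plain NumPy, placeholder values only):

import numpy as np

input_data = np.random.rand(1, 8, 8, 96).astype(np.float32)  # placeholder calibration tensor

# A single value broadcasts over the whole tensor ...
calib_a = (input_data - np.asarray([0.0], dtype=np.float32)) / np.asarray([1.0], dtype=np.float32)

# ... which gives the same result as spelling out one value per channel
mean = np.zeros([96], dtype=np.float32)
std = np.ones([96], dtype=np.float32)
calib_b = (input_data - mean) / std

assert np.allclose(calib_a, calib_b)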


Since your question appears to be about how to use the quantization process in TensorFlow Lite, I feel it would be better to look at the official tutorial.

If you think you are calibrating correctly but the accuracy is significantly degraded, it is a problem with the structure of the model. Swish is catastrophically degraded in accuracy. If that is the kind of question, there is nothing this tool can do.

See: https://github.com/PINTO0309/onnx2tf#7-if-the-accuracy-of-the-int8-quantized-model-degrades-significantly


@stupidcucumber
Author

Wow, I have been reading the documentation day and night and did not notice the -onimc option... The model you quantized works fine. Thanks a lot!

As for the calibration data, I understand now. I have already solved that problem; it was just me not properly understanding how to pass MEAN and STD to the -cind option, and I got really confused when I ran into the ellipsis problem :(

Thanks for your time! I guess the issue can be closed.
