Slicing model into parts and separately quantizing each of them. #520

Closed
stupidcucumber opened this issue Sep 28, 2023 · 4 comments
Labels
Quantization

Comments

@stupidcucumber

Issue Type

Others

OS

Linux

onnx2tf version number

1.14.5

onnx version number

1.14.0

onnxruntime version number

1.16.0

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.13.0

Download URL for ONNX

https://drive.google.com/drive/folders/1A1S9_dcW-ZQ5ggnZh4noLan-TqflrhXA?usp=share_link

Parameter Replacement JSON

{}

Description

Could you please clarify how exactly I need to build representative dataset for quantization with non-image data?

As an example, I have a tensor containing all of the features extracted from the already int8-quantized template branch (a.k.a. 'feature_extractor_z'), with the following specification: tensor(shape=[1, 8, 8, 96], dtype=int8). I can dequantize it using the metadata attached to the output (zero_point and scale), but is that necessary? Alternatively, I could use those int8 features as-is and build the whole calibration dataset out of them.
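
For reference, this is roughly how I dequantize those features at the moment (a minimal sketch assuming the standard affine scheme; scale and zero_point would come from the TFLite output details, and the values below are placeholders):

import numpy as np

# Hypothetical int8 feature tensor from the quantized 'feature_extractor_z'
features_int8 = np.zeros([1, 8, 8, 96], dtype=np.int8)

# In practice: scale, zero_point = interpreter.get_output_details()[0]['quantization']
scale, zero_point = 0.05, -3  # placeholder values

# Standard affine dequantization: real = scale * (quantized - zero_point)
features_float = scale * (features_int8.astype(np.float32) - zero_point)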

But for that last approach, I have to provide proper MEAN and STD to the -cind option. I successfully quantized features_z with the [[[0.485, 0.456, 0.406]]] and [[[0.229, 0.224, 0.225]]] arguments, but non-image data is a whole new story! I tried passing [0] and [1], because I don't need any scaling of the data; I only need to pass the outputs of the quantized features_extractor_z to the features_z inputs of the RPN as if they were one quantized part. However, it didn't work out :(

Sorry to bother you, but this problem seems to be harder than I thought and I have already killed two days trying to understand this specific part.

@PINTO0309
Owner

PINTO0309 commented Sep 28, 2023

Have you read and tried the following? Frankly, there is little more advice I can give you. The scale and mean values of the original quantized model do not matter. If you know the art of inferring the type of input data from the structure of the model alone, please let me know. What is a template branch?

https://github.com/PINTO0309/onnx2tf#9-int8-quantization-of-models-with-multiple-inputs-requiring-non-image-data

I can't tell at all, from the structure of the model alone, what the input data for the ONNX you shared means. If you need to normalize the input data, do it; if you don't, don't.

@PINTO0309 added the Quantization label on Sep 28, 2023
@stupidcucumber
Author

If you look closely at the "light_track.onnx" architecture, the template branch is the part starting from "input_z" and including the "Add" operation at the bottom. There is also a "search branch", which works the same way but for "input_x".

[screenshot: where the model needs to be cut]

The input data for "features_extractor_x.onnx" and "features_extractor_z.onnx" are images (of size (256, 256) and (128, 128) respectively). After running those two models, I get the features extracted from "image_x" and "image_z" respectively. Since I have already quantized those models, their outputs are int8, and that's fine.

The thing is, I'm just lost about what MEAN and STD really do under the hood, and why it is necessary to include them even for non-image data. Also, what dimensions should I provide for an input of shape [1, 8, 8, 96]? I tried using an ellipsis, "[[[1,...,1]]]", but I get the error "float() cannot take ellipsis".

My problem is very similar to issue #222, but I have a hard time understanding how to pass MEAN and STD. What happens internally if I pass MEAN [1] and STD [0] for an input of shape [1, 8, 8, 96]?
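
For completeness, here is roughly what I have in mind for building that calibration set from the extractor outputs (just a sketch; the file names, the data loader, and the interpreter usage are my own placeholders, not something prescribed by onnx2tf):

import numpy as np
import tensorflow as tf

# Run the already-quantized template-branch extractor on sample crops and collect its outputs
interpreter = tf.lite.Interpreter(model_path="features_extractor_z_int8.tflite")  # hypothetical file name
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

samples = []
for image in load_sample_crops():  # placeholder for my own (128, 128) image loader
    # 'image' must already match the input shape/dtype the model expects
    interpreter.set_tensor(inp['index'], image)
    interpreter.invoke()
    q = interpreter.get_tensor(out['index'])                      # int8, shape [1, 8, 8, 96]
    scale, zero_point = out['quantization']
    samples.append(scale * (q.astype(np.float32) - zero_point))   # dequantize to float

# Stack into one array that could be saved and fed to calibration (e.g. via -cind)
np.save("features_z_calib.npy", np.concatenate(samples, axis=0))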

@PINTO0309
Owner

PINTO0309 commented Sep 30, 2023

I need to cut this here

If you have read the README seriously and still don't know how to do this, please tell me what is difficult to understand.

https://github.com/PINTO0309/onnx2tf#2-run-test


onnx2tf \
-i light_track.onnx \
-onimc /features/blocks/blocks.4/blocks.4.2/Add_output_0 \
-oiqt \
-qt per-tensor


The above sample was converted anyway just to see what you really want to do; it does not use the -cind option. It simply cuts the model at the specified position and quantizes it at the same time.


I still don't understand what you are really having trouble with. For quantization, the input data for calibration should be normalized to the range -1.0 to 1.0 or 0.0 to 1.0.

# input_data: Image or otherwise
# mean: Average of all input data
# std: Standard deviation

# For Image, 0.0-1.0, RGB: 0-255
input_data = input_data / 255.0

# For Non-Image, 0.0-1.0, Features to be re-entered for tracking: ???-???
input_data = input_data / {Number of possible values for input data}

calibration_data = (input_data - mean) / std

If MEAN and STD are not needed, simply specify zero for MEAN and one for STD.

calibration_data = \
    (input_data - np.zeros([1,2,3,4], dtype=np.float32)) / np.ones([1,2,3,4], dtype=np.float32)

It simply broadcasts the values specified by the user and performs the subtraction and division. If you want to specify one million 1.0s, just write [1.0]. The README only abbreviates the list because it would be silly to write out 64 values, but if you actually wanted 64 different values, you would of course need to list all 64 of them.
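
As a quick sanity check of that broadcasting behavior (plain NumPy, placeholder values only):

import numpy as np

input_data = np.random.rand(1, 8, 8, 96).astype(np.float32)  # placeholder calibration tensor

# A single value broadcasts over the whole tensor ...
calib_a = (input_data - np.asarray([0.0], dtype=np.float32)) / np.asarray([1.0], dtype=np.float32)

# ... which gives the same result as spelling out one value per channel
mean = np.zeros([96], dtype=np.float32)
std = np.ones([96], dtype=np.float32)
calib_b = (input_data - mean) / std

assert np.allclose(calib_a, calib_b)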


Since your question appears to be about how to use the quantization process in TensorFlow Lite, I feel it would be better to look at the official tutorial.

If you think you are calibrating correctly but the accuracy is significantly degraded, it is a problem with the structure of the model. Swish is catastrophically degraded in accuracy. If that is the kind of question, there is nothing this tool can do.

See: https://github.com/PINTO0309/onnx2tf#7-if-the-accuracy-of-the-int8-quantized-model-degrades-significantly


@stupidcucumber
Author

Wow, I have been reading the documentation day and night and did not notice the -onimc option... The model you quantized works fine. Thanks a lot!

As for the calibration data, I understand now. I have already solved that problem; it was just me not properly understanding how to pass MEAN and STD to the -cind option, and I got really confused when I ran into the ellipsis problem :(

Thanks for your time! I guess the issue can be closed.
