
Problems evaluating INT8 quantized TFLite model #1638

Closed · mikel-brostrom opened this issue Mar 11, 2023 · 8 comments

@mikel-brostrom commented Mar 11, 2023

I have managed to generate dynamic_range_quant, full_integer_quant and integer_quant versions of the TFLite model using onnx2tf. However, the postprocessing fails for some reason: the confidences are so low that none of the predictions passes the filtering. Any idea what the problem could be? The float16 and float32 TFLite models work as usual, achieving the results in the table below. Has anybody tried onnx2tf and gotten the models working?
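(For anyone hitting the same symptom: below is a minimal sketch, not from this repo, of running a full-integer-quantized TFLite file and dequantizing its output before confidence filtering. The file name and the single-input/single-output layout are assumptions; skipping the output dequantization step is one common way to end up with near-zero confidences.)

```python
import numpy as np
import tensorflow as tf

# File name is illustrative; onnx2tf writes several quantized variants.
interpreter = tf.lite.Interpreter(model_path="yolox_nano_full_integer_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(*inp["shape"]).astype(np.float32)  # stand-in preprocessed image
if inp["dtype"] == np.int8:  # full-integer models take quantized input
    scale, zero_point = inp["quantization"]
    x = (x / scale + zero_point).astype(np.int8)

interpreter.set_tensor(inp["index"], x)
interpreter.invoke()

pred = interpreter.get_tensor(out["index"])
if out["dtype"] == np.int8:  # dequantize before any confidence thresholding
    scale, zero_point = out["quantization"]
    pred = (pred.astype(np.float32) - zero_point) * scale

print(pred.shape, float(pred.min()), float(pred.max()))
```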

@mikel-brostrom (Author) commented Mar 11, 2023

Btw, I see quite a lot of people asking about exported model results, so I thought I would post mine here. I exported the models with the decode-in-inference flag in order to minimize model output post-processing. I also had to build a multi-backend class supporting inference for all of the exported models, so that the comparison is meaningful: every model goes through exactly the same evaluation pipeline available in this repo (a rough sketch of such a wrapper follows the table). My results are as follows:

| Model | Size (px) | mAP val 0.5:0.95 | mAP val 0.5 |
| --- | --- | --- | --- |
| YOLOX-nano PyTorch | 416 | 0.256 | 0.411 |
| YOLOX-nano ONNX | 416 | 0.256 | 0.411 |
| YOLOX-nano TFLite FP32 | 416 | 0.256 | 0.411 |
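A rough sketch of the kind of multi-backend wrapper described above (not the author's actual class; the dispatch-by-file-suffix design and the TorchScript assumption are illustrative):

```python
import numpy as np

class MultiBackend:
    """Run the same preprocessed batch through PyTorch, ONNX Runtime, or TFLite."""

    def __init__(self, weights: str):
        self.kind = weights.rsplit(".", 1)[-1]
        if self.kind == "pt":
            import torch
            # Assumes a TorchScript export; an eager checkpoint would need the model class.
            self.model = torch.jit.load(weights).eval()
        elif self.kind == "onnx":
            import onnxruntime as ort
            self.session = ort.InferenceSession(weights)
            self.input_name = self.session.get_inputs()[0].name
        elif self.kind == "tflite":
            import tensorflow as tf
            self.interp = tf.lite.Interpreter(model_path=weights)
            self.interp.allocate_tensors()
        else:
            raise ValueError(f"unsupported weights file: {weights}")

    def __call__(self, x: np.ndarray) -> np.ndarray:
        if self.kind == "pt":
            import torch
            with torch.no_grad():
                return self.model(torch.from_numpy(x)).numpy()
        if self.kind == "onnx":
            return self.session.run(None, {self.input_name: x})[0]
        inp = self.interp.get_input_details()[0]
        out = self.interp.get_output_details()[0]
        self.interp.set_tensor(inp["index"], x)  # note: TFLite expects NHWC layout
        self.interp.invoke()
        return self.interp.get_tensor(out["index"])
```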

@mikel-brostrom changed the title from "What are some good TFLite INT8 results on COCOval2017?" to "Problems evaluating INT8 quantized TFLite model" on Mar 11, 2023
@mikel-brostrom (Author)

For anybody interested in why this is the case, we are discussing this here: https://github.com/PINTO0309/onnx2tf/issues/244

@mikel-brostrom (Author) commented Mar 13, 2023

This seems to be a known critical TF issue. Basically, all quantized models break when exported to TFLite via: PyTorch -- (torch.onnx.export) --> ONNX -- (onnx2tf or onnx-tf) --> TFLite. Not sure whether this is only the case for models exported through this pipeline or whether it happens in general. Maybe somebody knows?
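For reference, a minimal sketch of the conversion route in question, using a stand-in module and illustrative file names (in practice the model would be YOLOX in eval mode with decode-in-inference enabled):

```python
import torch
import torch.nn as nn

# Stand-in module for illustration only.
model = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.SiLU()).eval()
dummy = torch.randn(1, 3, 416, 416)

# Step 1: PyTorch -> ONNX
torch.onnx.export(
    model, dummy, "model.onnx",
    opset_version=11,
    input_names=["images"], output_names=["output"],
)

# Step 2: ONNX -> TFLite. onnx2tf's -oiqt flag also writes the
# dynamic_range_quant / full_integer_quant / integer_quant variants:
#   onnx2tf -i model.onnx -oiqt
```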

@PINTO0309 (Contributor)

Since Float32 works fine, it would be odd for only the INT8 model to break: the Keras model object used in the tool's backend to generate the INT8 model is the same one. YOLOv8 broke in the same way. Thus, I can even presume that it is not an issue with the conversion flow PyTorch -> ONNX -> TFLite.

@mikel-brostrom (Author)

Thanks for the insights @PINTO0309 😄

@PINTO0309 (Contributor) commented Mar 20, 2023

For the benefit of other engineers, I will also post in this thread the workaround needed to eliminate the accuracy degradation due to quantization. It seems that we need to significantly rethink the activation function, etc., and redefine another, YOLOX-alpha-like model that is no longer YOLOX to make it work. Thus, differences in the conversion route were not related to the accuracy degradation. SiLU (Swish) was found to significantly degrade the accuracy of the model during quantization. As an additional research reference, HardSwish also appears to cause significant accuracy degradation during quantization, as SiLU (Swish) does.
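A hedged sketch of the first change discussed here: recursively swapping SiLU/Hardswish modules for ReLU in a PyTorch model (illustrative helper, not from the thread; the network still needs retraining after the swap, and functional calls like F.silu are not caught by module replacement):

```python
import torch.nn as nn

def replace_quant_unfriendly_activations(module: nn.Module) -> None:
    """Recursively swap SiLU/Hardswish for ReLU before retraining and export."""
    for name, child in module.named_children():
        if isinstance(child, (nn.SiLU, nn.Hardswish)):
            setattr(module, name, nn.ReLU(inplace=True))
        else:
            replace_quant_unfriendly_activations(child)
```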


It is a matter of model structure. The activation function, the kernel size and stride of the pooling layers, and the kernel size and stride of the convolutions should be completely revised. See: https://github.com/PINTO0309/onnx2tf/issues/244#issuecomment-1475128445

  • e.g. YOLOv8 https://docs.openvino.ai/latest/notebooks/230-yolov8-optimization-with-output.html

  • e.g. YOLOX-Nano https://github.com/TexasInstruments/edgeai-yolox

The changes, summarized (the original comment illustrates each one with before/after screenshots):

| Before | After |
| --- | --- |
| Swish/SiLU | ReLU |
| DepthwiseConv2D | Conv2D |
| MaxPool, kernel_size=5x5, 9x9, 13x13 | MaxPool, kernel_size=3x3 |
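The pooling substitution in the table relies on a standard identity: stacking 3x3, stride-1 max pools reproduces larger kernels exactly (two stacked 3x3 pools equal one 5x5, four equal one 9x9, six equal one 13x13). A small numerical check, not from the thread:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 8, 52, 52)

# Two stacked 3x3 stride-1 pools cover the same 5x5 window as one 5x5 pool.
p5 = F.max_pool2d(x, kernel_size=5, stride=1, padding=2)
p33 = F.max_pool2d(F.max_pool2d(x, 3, 1, 1), 3, 1, 1)
assert torch.equal(p5, p33)

# Four stacked 3x3 pools cover a 9x9 window.
p9 = F.max_pool2d(x, kernel_size=9, stride=1, padding=4)
p3x4 = x
for _ in range(4):
    p3x4 = F.max_pool2d(p3x4, 3, 1, 1)
assert torch.equal(p9, p3x4)
```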
Float32 - YOLOX-Nano:

    (1, 52, 52, 85)
    array([[[
        [ 0.971787,  0.811184,  0.550566, ..., -5.962632, -7.403673, -6.735206],
        [ 0.858804,  1.351296,  1.231673, ..., -6.479690, -8.277064, -7.664936],
        [ 0.214827,  1.035119,  1.458006, ..., -6.291425, -8.229385, -7.761562],
            ...,
        [ 0.450116,  1.391900,  1.533354, ..., -5.672194, -7.121591, -6.880231],
        [ 0.593133,  2.112723,  0.968755, ..., -6.150078, -7.370633, -6.874294],
        [ 0.088263,  1.985220,  0.619998, ..., -5.507928, -6.914980, -6.234259]]]])

INT8 - YOLOX-Nano:

    (1, 52, 52, 85)
    array([[[
        [ 0.941908,  0.770652,  0.513768, ..., -5.993958, -7.449634, -6.850238],
        [ 0.856280,  1.284420,  1.198792, ..., -6.507727, -8.391542, -7.792146],
        [ 0.256884,  0.941908,  1.455676, ..., -6.336471, -8.305914, -7.877774],
            ...,
        [ 0.342512,  1.370048,  1.541304, ..., -5.737075, -7.192750, -7.107122],
        [ 0.513768,  2.226327,  1.027536, ..., -6.165215, -7.449634, -7.021494],
        [ 0.085628,  2.055072,  0.685024, ..., -5.480191, -7.021494, -6.422099]]]])
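To quantify the Float32-vs-INT8 gap shown above, a small helper sketch (hypothetical, assuming both dequantized outputs are available as NumPy arrays of the same shape):

```python
import numpy as np

def quant_gap(fp32: np.ndarray, int8: np.ndarray) -> dict:
    """Summarize how far a dequantized INT8 output drifts from Float32."""
    a, b = fp32.ravel(), int8.ravel()
    return {
        "max_abs_diff": float(np.abs(a - b).max()),
        "mean_abs_diff": float(np.abs(a - b).mean()),
        "cosine_sim": float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))),
    }
```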


@mikel-brostrom (Author)

This got solved here: PINTO0309/onnx2tf#269. Closing this down!
