TVM RPC error on Khadas VIM3 pro #586

Open
fumao13579 opened this issue May 15, 2023 · 3 comments

fumao13579 commented May 15, 2023

https://github.com/VeriSilicon/tvm/blob/vsi_npu/tests/python/contrib/test_vsi_npu/test_vsi_pytorch_model_all.py

I built TVM and TIM-VX 1.1.42 on the x86_64 simulator and cross-compiled them for the Khadas VIM3 pro. Running the test_vsi_tflite_model_all.py script with mobilenet_v1_1.0_224_quant.tflite works as expected, but the outputs are empty when switching to other tflite models.
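
For reference, the per-model run on the board over TVM RPC looks roughly like the sketch below; the RPC address, the exported library name and the input tensor name are placeholders rather than the values used by the test script. An empty result would show up as an all-zero tensor at get_output.

# Rough sketch of running one cross-compiled tflite model on the VIM3 over TVM RPC.
# Placeholders: the RPC host/port, the exported .so name, and the input tensor name.
import numpy as np
import tvm
from tvm import rpc
from tvm.contrib import graph_executor

remote = rpc.connect("192.168.137.100", 9090)        # board running the TVM RPC server
remote.upload("mobilenet_v1_1.0_224_quant.so")       # cross-compiled library (placeholder name)
lib = remote.load_module("mobilenet_v1_1.0_224_quant.so")

dev = remote.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))

# Quantized tflite classification models usually take a uint8 NHWC input.
data = np.random.randint(0, 256, size=(1, 224, 224, 3), dtype="uint8")
module.set_input("input", data)                      # input name depends on the model
module.run()

out = module.get_output(0).numpy()
print(out.shape, out.sum())                          # an "empty" output is typically all zeros here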

TIM-VX version: 1.1.42
Using the aarch64_A311D_6.4.10.2 prebuilt SDK on the Khadas VIM3 pro.
x86_64_linux prebuilt-sdk version: 6.4.10.2

TVM branch commit id: b822ec32702e2676dce1e430221e8efc05c98935

Here is the Galcore version on the Khadas VIM3 pro (6.4.6.2):

Details

khadas@Khadas:~$ sudo dmesg |grep -i galcore
[sudo] password for khadas:
[    0.000000] OF: reserved mem: initialized node linux,galcore, compatible id shared-dma-pool
[   15.953889] galcore irq number is 36.
[   15.953891] Galcore version 6.4.0.229426
[   36.656753] galcore: no symbol version for module_layout
[   36.656799] galcore: loading out-of-tree module taints kernel.
[   36.682762] galcore irq number is 36.
[   36.682770] Galcore version 6.4.6.2
[  795.670707] [galcore]: GPU[0] hang, automatic recovery.
[  795.675268] [galcore]: recovery done
[  857.110592] [galcore]: GPU[0] hang, automatic recovery.
[  857.115242] [galcore]: recovery done
[  959.510416] [galcore]: GPU[0] hang, automatic recovery.
[  959.515050] [galcore]: recovery done
[ 1020.950307] [galcore]: GPU[0] hang, automatic recovery.
[ 1020.954999] [galcore]: recovery done
[30594.015764] [galcore]: GPU[0] hang, automatic recovery.
[30594.020508] [galcore]: recovery done
[30655.455503] [galcore]: GPU[0] hang, automatic recovery.
[30655.460118] [galcore]: recovery done

The TIM-VX unit tests run successfully on the Khadas VIM3 pro.

Details

khadas@Khadas:~/TIM-VX-1.1.42/build/install/bin$ ./unit_test 
Running main() from /home/niuniu/TIM-VX-1.1.42/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 175 tests from 59 test suites.
[----------] Global test environment set-up.
[----------] 1 test from compile_option
[ RUN      ] compile_option.relax_mode
[       OK ] compile_option.relax_mode (1 ms)
[----------] 1 test from compile_option (1 ms total)

[----------] 1 test from Context
[ RUN      ] Context.create
[       OK ] Context.create (43 ms)
[----------] 1 test from Context (44 ms total)

[----------] 2 tests from graph
[ RUN      ] graph.gen_binary_graph_with_empty_graph
E [/home/niuniu/TIM-VX-1.1.42/src/tim/vx/internal/src/vsi_nn_graph_optimization.c:_graph_optimization_convert_int8_to_uint8:837]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
E [/home/niuniu/TIM-VX-1.1.42/src/tim/vx/internal/src/vsi_nn_graph_optimization.c:vsi_nn_OptimizeGraph:872]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
[       OK ] graph.gen_binary_graph_with_empty_graph (7 ms)
[ RUN      ] graph.gen_binary_graph_with_simple_add
[       OK ] graph.gen_binary_graph_with_simple_add (69 ms)
[----------] 2 tests from graph (77 ms total)

[----------] 2 tests from Linear
[ RUN      ] Linear.shape_5_1_fp32
[       OK ] Linear.shape_5_1_fp32 (17 ms)
[ RUN      ] Linear.shape_5_1_fp32_omit_b
[       OK ] Linear.shape_5_1_fp32_omit_b (32 ms)
[----------] 2 tests from Linear (51 ms total)

[----------] 2 tests from Gelu
[ RUN      ] Gelu.shape_5_1_fp32_approximate
[       OK ] Gelu.shape_5_1_fp32_approximate (225 ms)
[ RUN      ] Gelu.shape_5_1_uint8_Quantized
[       OK ] Gelu.shape_5_1_uint8_Quantized (11 ms)
[----------] 2 tests from Gelu (237 ms total)

[----------] 1 test from HardSigmoid
[ RUN      ] HardSigmoid.shape_5_1_uint8_Quantized
[       OK ] HardSigmoid.shape_5_1_uint8_Quantized (7 ms)
[----------] 1 test from HardSigmoid (7 ms total)

[----------] 1 test from Elu
[ RUN      ] Elu.shape_5_1_fp32
[       OK ] Elu.shape_5_1_fp32 (74 ms)
[----------] 1 test from Elu (74 ms total)

[----------] 3 tests from AddN
[ RUN      ] AddN.shape_2_2_int32
[       OK ] AddN.shape_2_2_int32 (22 ms)
[ RUN      ] AddN.shape_3_1_float32
[       OK ] AddN.shape_3_1_float32 (12 ms)
[ RUN      ] AddN.shape_2_2_uint8_Quantized
[       OK ] AddN.shape_2_2_uint8_Quantized (70 ms)
[----------] 3 tests from AddN (104 ms total)

[----------] 2 tests from ArgMax
[ RUN      ] ArgMax.shape_2_2_axis_0
[       OK ] ArgMax.shape_2_2_axis_0 (58 ms)
[ RUN      ] ArgMax.shape_2_2_axis_1
[       OK ] ArgMax.shape_2_2_axis_1 (56 ms)
[----------] 2 tests from ArgMax (115 ms total)

[----------] 2 tests from ArgMin
[ RUN      ] ArgMin.shape_2_2_axis_0
[       OK ] ArgMin.shape_2_2_axis_0 (68 ms)
[ RUN      ] ArgMin.shape_2_2_axis_1
[       OK ] ArgMin.shape_2_2_axis_1 (56 ms)
[----------] 2 tests from ArgMin (124 ms total)

[----------] 4 tests from AVG
[ RUN      ] AVG.shape_3_3_1_2_fp32_kernel_2_stride_1
[       OK ] AVG.shape_3_3_1_2_fp32_kernel_2_stride_1 (69 ms)
[ RUN      ] AVG.shape_3_3_1_1_fp32_kernel_2_stride_1
[       OK ] AVG.shape_3_3_1_1_fp32_kernel_2_stride_1 (12 ms)
[ RUN      ] AVG.shape_3_3_1_1_uint8_kernel_2_stride_1
[       OK ] AVG.shape_3_3_1_1_uint8_kernel_2_stride_1 (7 ms)
[ RUN      ] AVG.shape_60_52_3_5_fp32_kernel_35_stride_5
[       OK ] AVG.shape_60_52_3_5_fp32_kernel_35_stride_5 (45 ms)
[----------] 4 tests from AVG (135 ms total)

[----------] 2 tests from AVG_ANDROID
[ RUN      ] AVG_ANDROID.shape_60_52_3_5_fp32_kernel_35_stride_5
[       OK ] AVG_ANDROID.shape_60_52_3_5_fp32_kernel_35_stride_5 (54 ms)
[ RUN      ] AVG_ANDROID.shape_60_52_3_5_uint8_kernel_35_stride_5
[       OK ] AVG_ANDROID.shape_60_52_3_5_uint8_kernel_35_stride_5 (59 ms)
[----------] 2 tests from AVG_ANDROID (113 ms total)

[----------] 1 test from Batch2Space
[ RUN      ] Batch2Space.shape_1_1_3_4_fp32_whcn
[       OK ] Batch2Space.shape_1_1_3_4_fp32_whcn (7 ms)
[----------] 1 test from Batch2Space (7 ms total)

[----------] 2 tests from BatchNorm
[ RUN      ] BatchNorm.shape_3_3_2_1_fp32_cwhn
[       OK ] BatchNorm.shape_3_3_2_1_fp32_cwhn (103 ms)
[ RUN      ] BatchNorm.shape_3_3_2_1_fp32_whcn
[       OK ] BatchNorm.shape_3_3_2_1_fp32_whcn (10 ms)
[----------] 2 tests from BatchNorm (114 ms total)

[----------] 3 tests from Conv1d
[ RUN      ] Conv1d.shape_3_6_1_float_ksize_1_stride_1_weights_3_no_bias_wcn
[       OK ] Conv1d.shape_3_6_1_float_ksize_1_stride_1_weights_3_no_bias_wcn (58 ms)
[ RUN      ] Conv1d.shape_6_2_1_uint8_ksize_6_stride_1_weights_2_wcn
[       OK ] Conv1d.shape_6_2_1_uint8_ksize_6_stride_1_weights_2_wcn (11 ms)
[ RUN      ] Conv1d.shape_6_2_1_uint8_ksize_3_stride_1_pad_1_weights_2_no_bias_wcn
[       OK ] Conv1d.shape_6_2_1_uint8_ksize_3_stride_1_pad_1_weights_2_no_bias_wcn (8 ms)
[----------] 3 tests from Conv1d (77 ms total)

[----------] 20 tests from Conv2d
[ RUN      ] Conv2d.shape_4_2_1_1_float32_PaddingTest
[       OK ] Conv2d.shape_4_2_1_1_float32_PaddingTest (60 ms)
[ RUN      ] Conv2d.shape_4_2_2_2_float32_PointwiseTest
[       OK ] Conv2d.shape_4_2_2_2_float32_PointwiseTest (61 ms)
[ RUN      ] Conv2d.shape_4_2_1_2_float32_SimpleTest
[       OK ] Conv2d.shape_4_2_1_2_float32_SimpleTest (22 ms)
[ RUN      ] Conv2d.shape_4_2_2_2_float32_SimpleChannelsTest
[       OK ] Conv2d.shape_4_2_2_2_float32_SimpleChannelsTest (32 ms)
[ RUN      ] Conv2d.shape_6_3_1_1_float32_SimpleAnisotropicStridesTest
[       OK ] Conv2d.shape_6_3_1_1_float32_SimpleAnisotropicStridesTest (20 ms)
[ RUN      ] Conv2d.shape_4_3_1_1_float32_HandCalculatedTest
[       OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedTest (16 ms)
[ RUN      ] Conv2d.shape_4_3_1_1_float32_HandCalculatedConstFilterTest
[       OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedConstFilterTest (16 ms)
[ RUN      ] Conv2d.shape_4_3_1_1_float32_HandCalculatedBiasTest
[       OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedBiasTest (24 ms)
[ RUN      ] Conv2d.shape_4_3_1_1_float32_HandCalculatedValidTest
[       OK ] Conv2d.shape_4_3_1_1_float32_HandCalculatedValidTest (26 ms)
[ RUN      ] Conv2d.shape_4_2_2_2_float32_DisabledPointwiseMultifilterTest
[       OK ] Conv2d.shape_4_2_2_2_float32_DisabledPointwiseMultifilterTest (13 ms)
[ RUN      ] Conv2d.shape_9_9_1_1_float32_SimpleDilationTest
[       OK ] Conv2d.shape_9_9_1_1_float32_SimpleDilationTest (19 ms)
[ RUN      ] Conv2d.shape_4_2_1_2_float32_StrideTest
[       OK ] Conv2d.shape_4_2_1_2_float32_StrideTest (16 ms)
[ RUN      ] Conv2d.shape_4_2_1_2_float32_InputAndFilterSameWidthHeightTest
[       OK ] Conv2d.shape_4_2_1_2_float32_InputAndFilterSameWidthHeightTest (14 ms)
[ RUN      ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest1
[       OK ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest1 (10 ms)
[ RUN      ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest2
[       OK ] Conv2d.shape_4_2_1_2_uint8_QuantizedTest2 (28 ms)
[ RUN      ] Conv2d.shape_6_3_1_1_uint8_AnisotropicStridesQuantizedTest
[       OK ] Conv2d.shape_6_3_1_1_uint8_AnisotropicStridesQuantizedTest (14 ms)
[ RUN      ] Conv2d.shape_9_9_1_1_uint8_DilationQuantizedTest
[       OK ] Conv2d.shape_9_9_1_1_uint8_DilationQuantizedTest (12 ms)
[ RUN      ] Conv2d.shape_3_2_2_1_int8_QuantizedPerTensorTest
[       OK ] Conv2d.shape_3_2_2_1_int8_QuantizedPerTensorTest (136 ms)
[ RUN      ] Conv2d.shape_3_2_2_1_int8_QuantizedPerChannelTest
[       OK ] Conv2d.shape_3_2_2_1_int8_QuantizedPerChannelTest (488 ms)
[ RUN      ] Conv2d.shape_w_h_128_1_ksize_1_1_stride_2_int8_QuantizedPerChannelTest
[       OK ] Conv2d.shape_w_h_128_1_ksize_1_1_stride_2_int8_QuantizedPerChannelTest (8720 ms)
[----------] 20 tests from Conv2d (9751 ms total)

[----------] 2 tests from Conv3d
[ RUN      ] Conv3d.shape_1_1_2_3_3_float32_simple_whdcn
[       OK ] Conv3d.shape_1_1_2_3_3_float32_simple_whdcn (61 ms)
[ RUN      ] Conv3d.shape_1_1_2_3_3_float32_simple_cwhdn
[       OK ] Conv3d.shape_1_1_2_3_3_float32_simple_cwhdn (29 ms)
[----------] 2 tests from Conv3d (90 ms total)

[----------] 2 tests from DeConv1d
[ RUN      ] DeConv1d.no_bias_layout_whcn_depthwise_shape_3_2_1
[       OK ] DeConv1d.no_bias_layout_whcn_depthwise_shape_3_2_1 (87 ms)
[ RUN      ] DeConv1d.layout_whcn_shape_3_1_1
[       OK ] DeConv1d.layout_whcn_shape_3_1_1 (83 ms)
[----------] 2 tests from DeConv1d (170 ms total)

[----------] 2 tests from DeConv2d
[ RUN      ] DeConv2d.shape_3_3_2_1_float_depthwise
[       OK ] DeConv2d.shape_3_3_2_1_float_depthwise (17 ms)
[ RUN      ] DeConv2d.shape_3_3_1_1_float
[       OK ] DeConv2d.shape_3_3_1_1_float (14 ms)
[----------] 2 tests from DeConv2d (31 ms total)

[----------] 16 tests from DepthwiseConv
[ RUN      ] DepthwiseConv.shape_2_3_2_1_float32_SimpleTest
[       OK ] DepthwiseConv.shape_2_3_2_1_float32_SimpleTest (34 ms)
[ RUN      ] DepthwiseConv.shape_2_3_2_1_float32_StrideValidTest
[       OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideValidTest (28 ms)
[ RUN      ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameTest
[       OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameTest (39 ms)
[ RUN      ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameDilationTest
[       OK ] DepthwiseConv.shape_2_3_2_1_float32_StrideSameDilationTest (26 ms)
[ RUN      ] DepthwiseConv.shape_2_3_2_1_float32_PaddingTest
[       OK ] DepthwiseConv.shape_2_3_2_1_float32_PaddingTest (19 ms)
[ RUN      ] DepthwiseConv.shape_9_9_1_1_float32_DilationValidTest
[       OK ] DepthwiseConv.shape_9_9_1_1_float32_DilationValidTest (17 ms)
[ RUN      ] DepthwiseConv.shape_3_3_1_1_float32_DilationSameTest
[       OK ] DepthwiseConv.shape_3_3_1_1_float32_DilationSameTest (18 ms)
[ RUN      ] DepthwiseConv.shape_3_3_4_2_float32_BatchValidTest
[       OK ] DepthwiseConv.shape_3_3_4_2_float32_BatchValidTest (42 ms)
[ RUN      ] DepthwiseConv.shape_2_2_1_4_float32_BatchSameTest
[       OK ] DepthwiseConv.shape_2_2_1_4_float32_BatchSameTest (28 ms)
[ RUN      ] DepthwiseConv.shape_2_3_2_1_uint8_QuantizedTest
[       OK ] DepthwiseConv.shape_2_3_2_1_uint8_QuantizedTest (24 ms)
[ RUN      ] DepthwiseConv.shape_9_9_1_1_uint8_QuantizedDilationdValidTest
[       OK ] DepthwiseConv.shape_9_9_1_1_uint8_QuantizedDilationdValidTest (17 ms)
[ RUN      ] DepthwiseConv.shape_3_3_1_1_uint8_QuantizedDilationdSameTest
[       OK ] DepthwiseConv.shape_3_3_1_1_uint8_QuantizedDilationdSameTest (15 ms)
[ RUN      ] DepthwiseConv.shape_3_2_2_1_int8_PerTensorTest
[       OK ] DepthwiseConv.shape_3_2_2_1_int8_PerTensorTest (28 ms)
[ RUN      ] DepthwiseConv.shape_3_2_2_1_int8_PerAxisTest
[       OK ] DepthwiseConv.shape_3_2_2_1_int8_PerAxisTest (141 ms)
[ RUN      ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelValidTest
[       OK ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelValidTest (37 ms)
[ RUN      ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelSameTest
[       OK ] DepthwiseConv.shape_3_3_8_1_int8_PerChannelSameTest (42 ms)
[----------] 16 tests from DepthwiseConv (558 ms total)

[----------] 3 tests from FloorDiv
[ RUN      ] FloorDiv.shape_1_fp32
[       OK ] FloorDiv.shape_1_fp32 (245 ms)
[ RUN      ] FloorDiv.shape_5_1_broadcast_float32
[       OK ] FloorDiv.shape_5_1_broadcast_float32 (78 ms)
[ RUN      ] FloorDiv.shape_5_1_broadcast_uint8
[       OK ] FloorDiv.shape_5_1_broadcast_uint8 (343 ms)
[----------] 3 tests from FloorDiv (666 ms total)

[----------] 4 tests from Div
[ RUN      ] Div.shape_1_fp32
[       OK ] Div.shape_1_fp32 (26 ms)
[ RUN      ] Div.shape_5_1_broadcast_uint8
[       OK ] Div.shape_5_1_broadcast_uint8 (206 ms)
[ RUN      ] Div.shape_5_1_broadcast_scale_uint8
[       OK ] Div.shape_5_1_broadcast_scale_uint8 (43 ms)
[ RUN      ] Div.Div_uint8
[       OK ] Div.Div_uint8 (62 ms)
[----------] 4 tests from Div (338 ms total)

[----------] 2 tests from Erf
[ RUN      ] Erf.shape_3_2_fp32
[       OK ] Erf.shape_3_2_fp32 (79 ms)
[ RUN      ] Erf.shape_3_2_uint8_Quantized
[       OK ] Erf.shape_3_2_uint8_Quantized (15 ms)
[----------] 2 tests from Erf (95 ms total)

[----------] 2 tests from GroupedConv1d
[ RUN      ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn
[       OK ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn (15 ms)
[ RUN      ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn_PaddingTest
[       OK ] GroupedConv1d.shape_6_2_1_float_ksize_6_stride_1_group_2_no_bias_wcn_PaddingTest (17 ms)
[----------] 2 tests from GroupedConv1d (32 ms total)

[----------] 3 tests from GroupedConv2d
[ RUN      ] GroupedConv2d.shape_3_3_6_1_float_group_1_no_bias_whcn
[       OK ] GroupedConv2d.shape_3_3_6_1_float_group_1_no_bias_whcn (13 ms)
[ RUN      ] GroupedConv2d.shape_3_3_6_1_float_group_2_whcn
[       OK ] GroupedConv2d.shape_3_3_6_1_float_group_2_whcn (14 ms)
[ RUN      ] GroupedConv2d.shape_3_3_6_1_uint8_group_6_whcn
[       OK ] GroupedConv2d.shape_3_3_6_1_uint8_group_6_whcn (31 ms)
[----------] 3 tests from GroupedConv2d (59 ms total)

[----------] 2 tests from InstanceNorm
[ RUN      ] InstanceNorm.shape_3_6_1_float
[       OK ] InstanceNorm.shape_3_6_1_float (131 ms)
[ RUN      ] InstanceNorm.shape_3_3_6_1_float
[       OK ] InstanceNorm.shape_3_3_6_1_float (124 ms)
[----------] 2 tests from InstanceNorm (256 ms total)

[----------] 2 tests from LayerNorm
[ RUN      ] LayerNorm.axis_0_shape_3_6_1_float
[       OK ] LayerNorm.axis_0_shape_3_6_1_float (83 ms)
[ RUN      ] LayerNorm.axis_0_shape_2_3_6_1_float
[       OK ] LayerNorm.axis_0_shape_2_3_6_1_float (86 ms)
[----------] 2 tests from LayerNorm (169 ms total)

[----------] 3 tests from LogSoftmax
[ RUN      ] LogSoftmax.shape_6_1_float_axis_0
[       OK ] LogSoftmax.shape_6_1_float_axis_0 (86 ms)
[ RUN      ] LogSoftmax.shape_3_6_1_float_axis_1
[       OK ] LogSoftmax.shape_3_6_1_float_axis_1 (72 ms)
[ RUN      ] LogSoftmax.shape_3_6_1_uint8_axis_1
[       OK ] LogSoftmax.shape_3_6_1_uint8_axis_1 (996 ms)
[----------] 3 tests from LogSoftmax (1154 ms total)

[----------] 4 tests from Matmul
[ RUN      ] Matmul.shape_2_6_shape_6_2_float
[       OK ] Matmul.shape_2_6_shape_6_2_float (50 ms)
[ RUN      ] Matmul.shape_3_1_shape_1_3_float
[       OK ] Matmul.shape_3_1_shape_1_3_float (73 ms)
[ RUN      ] Matmul.shape_2_3_2_shape_2_3_2_float_transpose_b
[       OK ] Matmul.shape_2_3_2_shape_2_3_2_float_transpose_b (69 ms)
[ RUN      ] Matmul.shape_2_3_2_shape_2_3_2_uint8_transpose_a
[       OK ] Matmul.shape_2_3_2_shape_2_3_2_uint8_transpose_a (235 ms)
[----------] 4 tests from Matmul (427 ms total)

[----------] 2 tests from MaxpoolWithArgmax
[ RUN      ] MaxpoolWithArgmax.shape_3_3_1_fp32_kernel_2_stride_2
[       OK ] MaxpoolWithArgmax.shape_3_3_1_fp32_kernel_2_stride_2 (93 ms)
[ RUN      ] MaxpoolWithArgmax.shape_4_4_1_uint8_kernel_2_stride_2
[       OK ] MaxpoolWithArgmax.shape_4_4_1_uint8_kernel_2_stride_2 (198 ms)
[----------] 2 tests from MaxpoolWithArgmax (292 ms total)

[----------] 2 tests from MaxUnpool2d
[ RUN      ] MaxUnpool2d.shape_2_2_1_fp32_kernel_2_stride_2
[       OK ] MaxUnpool2d.shape_2_2_1_fp32_kernel_2_stride_2 (77 ms)
[ RUN      ] MaxUnpool2d.shape_2_2_1_uint8_kernel_2_stride_2
[       OK ] MaxUnpool2d.shape_2_2_1_uint8_kernel_2_stride_2 (223 ms)
[----------] 2 tests from MaxUnpool2d (300 ms total)

[----------] 2 tests from Moments
[ RUN      ] Moments.shape_6_3_1_float_axes_0_1
[       OK ] Moments.shape_6_3_1_float_axes_0_1 (112 ms)
[ RUN      ] Moments.shape_3_6_1_float_axes_1_keepdims
[       OK ] Moments.shape_3_6_1_float_axes_1_keepdims (61 ms)
[----------] 2 tests from Moments (173 ms total)

[----------] 9 tests from OneHot
[ RUN      ] OneHot.shape_3_out_flaot_depth_3
[       OK ] OneHot.shape_3_out_flaot_depth_3 (52 ms)
[ RUN      ] OneHot.shape_3_out_int32_depth_3
[       OK ] OneHot.shape_3_out_int32_depth_3 (66 ms)
[ RUN      ] OneHot.shape_3_out_int8_depth_3
[       OK ] OneHot.shape_3_out_int8_depth_3 (51 ms)
[ RUN      ] OneHot.shape_3_out_uint8_depth_3
[       OK ] OneHot.shape_3_out_uint8_depth_3 (58 ms)
[ RUN      ] OneHot.shape_3_out_int32_depth_1
[       OK ] OneHot.shape_3_out_int32_depth_1 (94 ms)
[ RUN      ] OneHot.shape_3_out_int32_depth_4
[       OK ] OneHot.shape_3_out_int32_depth_4 (51 ms)
[ RUN      ] OneHot.shape_3_out_int32_depth_3_on_6_off_N1
[       OK ] OneHot.shape_3_out_int32_depth_3_on_6_off_N1 (54 ms)
[ RUN      ] OneHot.shape_3_out_int32_depth_3_on_5_off_0_axis_1
[       OK ] OneHot.shape_3_out_int32_depth_3_on_5_off_0_axis_1 (98 ms)
[ RUN      ] OneHot.shape_2_2_out_int32_depth_3_on_2_off_0
[       OK ] OneHot.shape_2_2_out_int32_depth_3_on_2_off_0 (51 ms)
[----------] 9 tests from OneHot (578 ms total)

[----------] 1 test from Equal
[ RUN      ] Equal.shape_1_uint8
[       OK ] Equal.shape_1_uint8 (634 ms)
[----------] 1 test from Equal (634 ms total)

[----------] 1 test from NotEqual
[ RUN      ] NotEqual.shape_5_fp32
[       OK ] NotEqual.shape_5_fp32 (80 ms)
[----------] 1 test from NotEqual (80 ms total)

[----------] 1 test from Less
[ RUN      ] Less.shape_5_1_fp32
[       OK ] Less.shape_5_1_fp32 (91 ms)
[----------] 1 test from Less (91 ms total)

[----------] 1 test from GreaterOrEqual
[ RUN      ] GreaterOrEqual.shape_5_2_1_fp32
[       OK ] GreaterOrEqual.shape_5_2_1_fp32 (134 ms)
[----------] 1 test from GreaterOrEqual (134 ms total)

[----------] 1 test from Greater
[ RUN      ] Greater.shape_5_2_1_1_fp32
[       OK ] Greater.shape_5_2_1_1_fp32 (98 ms)
[----------] 1 test from Greater (98 ms total)

[----------] 1 test from LessOrEqual
[ RUN      ] LessOrEqual.shape_1_5_2_1_1_fp32
[       OK ] LessOrEqual.shape_1_5_2_1_1_fp32 (107 ms)
[----------] 1 test from LessOrEqual (108 ms total)

[----------] 2 tests from Reorg
[ RUN      ] Reorg.shape_4_4_4_1_u8
[       OK ] Reorg.shape_4_4_4_1_u8 (12 ms)
[ RUN      ] Reorg.shape_4_4_4_1_fp32
[       OK ] Reorg.shape_4_4_4_1_fp32 (34 ms)
[----------] 2 tests from Reorg (47 ms total)

[----------] 3 tests from Resize1d
[ RUN      ] Resize1d.shape_4_2_1_float_nearest_whcn
[       OK ] Resize1d.shape_4_2_1_float_nearest_whcn (61 ms)
[ RUN      ] Resize1d.shape_4_2_1_uint8_nearest_whcn
[       OK ] Resize1d.shape_4_2_1_uint8_nearest_whcn (155 ms)
[ RUN      ] Resize1d.shape_5_1_1_float_bilinear_align_corners_whcn
[       OK ] Resize1d.shape_5_1_1_float_bilinear_align_corners_whcn (53 ms)
[----------] 3 tests from Resize1d (270 ms total)

[----------] 2 tests from RNNCell
[ RUN      ] RNNCell.shape_3_2_4_float
[       OK ] RNNCell.shape_3_2_4_float (178 ms)
[ RUN      ] RNNCell.seperate
[       OK ] RNNCell.seperate (79 ms)
[----------] 2 tests from RNNCell (257 ms total)

[----------] 2 tests from ScatterND
[ RUN      ] ScatterND.shape_4_4_4
[       OK ] ScatterND.shape_4_4_4 (97 ms)
[ RUN      ] ScatterND.shape_9
[       OK ] ScatterND.shape_9 (154 ms)
[----------] 2 tests from ScatterND (251 ms total)

[----------] 5 tests from ShuffleChannel
[ RUN      ] ShuffleChannel.shape_3_6_groupnum2_dim1_float32
[       OK ] ShuffleChannel.shape_3_6_groupnum2_dim1_float32 (26 ms)
[ RUN      ] ShuffleChannel.shape_4_2_2_groupnum2_dim0_float32
[       OK ] ShuffleChannel.shape_4_2_2_groupnum2_dim0_float32 (15 ms)
[ RUN      ] ShuffleChannel.shape_1_4_2_2_groupnum2_dim1_float32
[       OK ] ShuffleChannel.shape_1_4_2_2_groupnum2_dim1_float32 (23 ms)
[ RUN      ] ShuffleChannel.shape_4_1_2_2_groupnum4_dim0_float32
[       OK ] ShuffleChannel.shape_4_1_2_2_groupnum4_dim0_float32 (32 ms)
[ RUN      ] ShuffleChannel.shape_4_1_2_2_groupnum1_dim3_float32
[       OK ] ShuffleChannel.shape_4_1_2_2_groupnum1_dim3_float32 (26 ms)
[----------] 5 tests from ShuffleChannel (123 ms total)

[----------] 1 test from SignalFrame
[ RUN      ] SignalFrame.shape_10_3_float_step_2_windows_4
[       OK ] SignalFrame.shape_10_3_float_step_2_windows_4 (59 ms)
[----------] 1 test from SignalFrame (60 ms total)

[----------] 1 test from Floor
[ RUN      ] Floor.shape_5_1_fp32
[       OK ] Floor.shape_5_1_fp32 (12 ms)
[----------] 1 test from Floor (12 ms total)

[----------] 1 test from Cast
[ RUN      ] Cast.shape_5_1_fp32_to_int32
[       OK ] Cast.shape_5_1_fp32_to_int32 (58 ms)
[----------] 1 test from Cast (58 ms total)

[----------] 3 tests from DataConvert
[ RUN      ] DataConvert.quantize_shape_2_3_fp32_to_asym_u8
[       OK ] DataConvert.quantize_shape_2_3_fp32_to_asym_u8 (28 ms)
[ RUN      ] DataConvert.dequantize_shape_2_3_asym_u8_to_fp32
[       OK ] DataConvert.dequantize_shape_2_3_asym_u8_to_fp32 (22 ms)
[ RUN      ] DataConvert.requantize_shape_2_3_asym_u8
[       OK ] DataConvert.requantize_shape_2_3_asym_u8 (13 ms)
[----------] 3 tests from DataConvert (63 ms total)

[----------] 6 tests from Softmax
[ RUN      ] Softmax.shape_3_1_float_axis_0
[       OK ] Softmax.shape_3_1_float_axis_0 (30 ms)
[ RUN      ] Softmax.shape_3_4_float_axis_0
[       OK ] Softmax.shape_3_4_float_axis_0 (22 ms)
[ RUN      ] Softmax.shape_3_4_float_axis_1
[       OK ] Softmax.shape_3_4_float_axis_1 (23 ms)
[ RUN      ] Softmax.shape_3_3_2_float_axis_0
[       OK ] Softmax.shape_3_3_2_float_axis_0 (10 ms)
[ RUN      ] Softmax.shape_3_3_2_float_axis_1
[       OK ] Softmax.shape_3_3_2_float_axis_1 (13 ms)
[ RUN      ] Softmax.shape_3_3_2_float_axis_2
[       OK ] Softmax.shape_3_3_2_float_axis_2 (10 ms)
[----------] 6 tests from Softmax (108 ms total)

[----------] 1 test from Space2Batch
[ RUN      ] Space2Batch.shape_2_2_3_1_fp32_whcn
[       OK ] Space2Batch.shape_2_2_3_1_fp32_whcn (9 ms)
[----------] 1 test from Space2Batch (9 ms total)

[----------] 1 test from SpatialTransformer
[ RUN      ] SpatialTransformer.shape_1_3_3_1_u8
[       OK ] SpatialTransformer.shape_1_3_3_1_u8 (336 ms)
[----------] 1 test from SpatialTransformer (336 ms total)

[----------] 6 tests from Stack
[ RUN      ] Stack.shape_3_4_axis_2
[       OK ] Stack.shape_3_4_axis_2 (50 ms)
[ RUN      ] Stack.shape_3_4_axis_1
[       OK ] Stack.shape_3_4_axis_1 (19 ms)
[ RUN      ] Stack.shape_3_4_axis_0
[       OK ] Stack.shape_3_4_axis_0 (18 ms)
[ RUN      ] Stack.LayoutinferernceTest_1
[       OK ] Stack.LayoutinferernceTest_1 (27 ms)
[ RUN      ] Stack.LayoutinferernceTest_2
[       OK ] Stack.LayoutinferernceTest_2 (94 ms)
[ RUN      ] Stack.LayoutinferernceTest_3
[       OK ] Stack.LayoutinferernceTest_3 (138 ms)
[----------] 6 tests from Stack (349 ms total)

[----------] 1 test from StridedSlice
[ RUN      ] StridedSlice.shape_
[       OK ] StridedSlice.shape_ (45 ms)
[----------] 1 test from StridedSlice (45 ms total)

[----------] 3 tests from Svdf
[ RUN      ] Svdf.shape_3_2_10_1_4_float
[       OK ] Svdf.shape_3_2_10_1_4_float (20 ms)
[ RUN      ] Svdf.shape_3_2_10_2_4_float
[       OK ] Svdf.shape_3_2_10_2_4_float (24 ms)
[ RUN      ] Svdf.shape_3_2_10_3_4_float
[       OK ] Svdf.shape_3_2_10_3_4_float (21 ms)
[----------] 3 tests from Svdf (65 ms total)

[----------] 2 tests from Tile
[ RUN      ] Tile.shape_3_2_float_multiples_2_1
[       OK ] Tile.shape_3_2_float_multiples_2_1 (70 ms)
[ RUN      ] Tile.shape_3_2_1_int8_multiples_2_2_1
[       OK ] Tile.shape_3_2_1_int8_multiples_2_2_1 (390 ms)
[----------] 2 tests from Tile (461 ms total)

[----------] 14 tests from TransposeConv2d
[ RUN      ] TransposeConv2d.shape_4_4_1_1_float32_SimpleTest
[       OK ] TransposeConv2d.shape_4_4_1_1_float32_SimpleTest (22 ms)
[ RUN      ] TransposeConv2d.shape_4_4_2_1_float32_SameTest
[       OK ] TransposeConv2d.shape_4_4_2_1_float32_SameTest (17 ms)
[ RUN      ] TransposeConv2d.shape_4_4_2_1_float32_ValidTest
[       OK ] TransposeConv2d.shape_4_4_2_1_float32_ValidTest (20 ms)
[ RUN      ] TransposeConv2d.shape_2_2_1_1_float32_StrideTest
[       OK ] TransposeConv2d.shape_2_2_1_1_float32_StrideTest (28 ms)
[ RUN      ] TransposeConv2d.shape_2_2_1_1_float32_ChannelTest
[       OK ] TransposeConv2d.shape_2_2_1_1_float32_ChannelTest (16 ms)
[ RUN      ] TransposeConv2d.shape_2_1_1_1_float32_AccuracyTest
[       OK ] TransposeConv2d.shape_2_1_1_1_float32_AccuracyTest (26 ms)
[ RUN      ] TransposeConv2d.shape_2_2_1_1_float32_BiasChannelTest
[       OK ] TransposeConv2d.shape_2_2_1_1_float32_BiasChannelTest (36 ms)
[ RUN      ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedTest
[       OK ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedTest (23 ms)
[ RUN      ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedTwoFiltersTest
[       OK ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedTwoFiltersTest (25 ms)
[ RUN      ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedValidTest
[       OK ] TransposeConv2d.shape_4_4_2_1_uint8_QuantizedValidTest (26 ms)
[ RUN      ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedBiasTest
[       OK ] TransposeConv2d.shape_4_4_1_1_uint8_QuantizedBiasTest (17 ms)
[ RUN      ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedPerChannelOneTest
[       OK ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedPerChannelOneTest (102 ms)
[ RUN      ] TransposeConv2d.shape_2_2_1_1_int8_QuantizedPerChannelTwoTest
[       OK ] TransposeConv2d.shape_2_2_1_1_int8_QuantizedPerChannelTwoTest (120 ms)
[ RUN      ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedBiasPerChannelTest
[       OK ] TransposeConv2d.shape_4_4_1_1_int8_QuantizedBiasPerChannelTest (101 ms)
[----------] 14 tests from TransposeConv2d (582 ms total)

[----------] 1 test from LSTM_CELL
[ RUN      ] LSTM_CELL.shape_in_2_cell_4_out_4_float32
W [downcast_act_type:46]Not supported activition type for LSTM = 0
[       OK ] LSTM_CELL.shape_in_2_cell_4_out_4_float32 (152 ms)
[----------] 1 test from LSTM_CELL (152 ms total)

[----------] 2 tests from Unstack
[ RUN      ] Unstack.shape_4_3_axis_0
[       OK ] Unstack.shape_4_3_axis_0 (15 ms)
[ RUN      ] Unstack.shape_4_3_axis_1
[       OK ] Unstack.shape_4_3_axis_1 (10 ms)
[----------] 2 tests from Unstack (26 ms total)

[----------] 1 test from LayoutInference
[ RUN      ] LayoutInference.simple_conv2d
[       OK ] LayoutInference.simple_conv2d (23 ms)
[----------] 1 test from LayoutInference (24 ms total)

[----------] Global test environment tear-down
[==========] 175 tests from 59 test suites ran. (20865 ms total)
[  PASSED  ] 175 tests.

Here is the output when running test_vsi_pytorch_model_all.py with the quantized tflite model InceptionNetV1.
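
For context, the host-side compile step that produces the "vsi_npu.py --> ..." lines in the log below looks roughly like the sketch that follows. The model path and input name are placeholders, and partition_for_vsi_npu is an assumed entry point of the vsi_npu contrib module seen in the log, not a verified API of this branch.

# Rough sketch of importing a quantized tflite model and offloading its QNN ops
# to the vsi_npu BYOC target on the x86_64 simulator. The model path and input
# name are placeholders; partition_for_vsi_npu is an assumed helper name.
import tflite
import tvm
from tvm import relay
from tvm.relay.op.contrib import vsi_npu

with open("inception_v1_224_quant.tflite", "rb") as f:
    tfl_model = tflite.Model.GetRootAsModel(f.read(), 0)

mod, params = relay.frontend.from_tflite(
    tfl_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)

mod = vsi_npu.partition_for_vsi_npu(mod, params)     # assumed partition helper

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)   # x86_64 simulator build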

x86_64 Host

#productname=VSI SIMULATOR, pid=0x88
1. press any key and continue...
vsi_npu.py --> qnn.dequantize
vsi_npu.py --> nn.softmax
vsi_npu.py --> qnn.quantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.avg_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> reshape
This is important----> name_node.value() == tvmgen_default_vsi_npu_0
GraphMakerImpl::Create
graph gpuCount=1 interConnectRingCount=0
NN ring buffer is disabled
TensorMakerImpl::InferCall: vsi_npu.qnn_softmax
TensorMakerImpl::InferCall: reshape
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
graph gpuCount=1 interConnectRingCount=0
NN ring buffer is disabled
W [HandleLayoutInfer:268]Op 162: default layout inference pass.
---------------------------Begin VerifyTiling -------------------------
AXI-SRAM = 1048320 Bytes VIP-SRAM = 522240 Bytes SWTILING_PHASE_FEATURES[0, 0, 0]
  0 TP [(   3  224  224 1,   150528, 0x0x327d810(0x0x327d810, 0x(nil)) ->  224  224    3 1,   150528, 0x0x3390940(0x0x3390940, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(1 1, 1 1)] C[  1]
  1 TP [( 224  224    3 1,   150528, 0x0x3390940(0x0x3390940, 0x(nil)) ->  115  115   12 1,   158700, 0x0x722ad10(0x0x722ad10, 0x(nil))) k(0 0    0, 0) pad(2 2) pool(1 1, 1 1)] P[  0] C[  2]
  2 NN [( 115  115   12 1,   158700, 0x0x722ad10(0x0x722ad10, 0x(nil)) ->  112  112   64 1,   802816, 0x0x33920c0(0x0x33920c0, 0x(nil))) k(4 4   12, 13440) pad(0 0) pool(1 1, 1 1)] P[  1] C[  3]
  3 TP [( 112  112   64 1,   802816, 0x0x33920c0(0x0x33920c0, 0x(nil)) ->   56   56   64 1,   200704, 0x0x3395390(0x0x3395390, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(3 3, 2 2)] P[  2] C[  4]
  4 NN [(  56   56   64 1,   200704, 0x0x3395390(0x0x3395390, 0x(nil)) ->   56   56   64 1,   200704, 0x0x3397ec0(0x0x3397ec0, 0x(nil))) k(1 1   64, 4608) pad(0 0) pool(1 1, 1 1)] P[  3] C[  5]
  5 NN [(  56   56   64 1,   200704, 0x0x3397ec0(0x0x3397ec0, 0x(nil)) ->   56   56  192 1,   602112, 0x0x339b7e0(0x0x339b7e0, 0x(nil))) k(3 3   64, 116992) pad(1 1) pool(1 1, 1 1)] P[  4] C[  6]
  6 TP [(  56   56  192 1,   602112, 0x0x339b7e0(0x0x339b7e0, 0x(nil)) ->   28   28  192 1,   150528, 0x0x339f100(0x0x339f100, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(3 3, 2 2)] P[  5] C[  7,  9, 10, 12]
  7 TP [(  28   28  192 1,   150528, 0x0x339f100(0x0x339f100, 0x(nil)) ->   28   28  192 1,   150528, 0x0x33ac730(0x0x33ac730, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[  6] C[  8]
  8 NN [(  28   28  192 1,   150528, 0x0x33ac730(0x0x33ac730, 0x(nil)) ->   28   28   32 1,    25088, 0x0x33b6550(0x0x33b6550, 0x(nil))) k(1 1  192, 6656) pad(0 0) pool(1 1, 1 1)] P[  7] C[ 17]
  9 NN [(  28   28  192 1,   150528, 0x0x339f100(0x0x339f100, 0x(nil)) ->   28   28   64 1,    50176, 0x0x33a1b70(0x0x33a1b70, 0x(nil))) k(1 1  192, 13184) pad(0 0) pool(1 1, 1 1)] P[  6] C[ 14]
 10 NN [(  28   28  192 1,   150528, 0x0x339f100(0x0x339f100, 0x(nil)) ->   28   28   96 1,    75264, 0x0x33a54b0(0x0x33a54b0, 0x(nil))) k(1 1  192, 19840) pad(0 0) pool(1 1, 1 1)] P[  6] C[ 11]
 11 NN [(  28   28   96 1,    75264, 0x0x33a54b0(0x0x33a54b0, 0x(nil)) ->   28   28  128 1,   100352, 0x0x33af180(0x0x33af180, 0x(nil))) k(3 3   96, 116736) pad(1 1) pool(1 1, 1 1)] P[ 10] C[ 15]
 12 NN [(  28   28  192 1,   150528, 0x0x339f100(0x0x339f100, 0x(nil)) ->   28   28   16 1,    12544, 0x0x33a8df0(0x0x33a8df0, 0x(nil))) k(1 1  192, 3328) pad(0 0) pool(1 1, 1 1)] P[  6] C[ 13]
 13 NN [(  28   28   16 1,    12544, 0x0x33a8df0(0x0x33a8df0, 0x(nil)) ->   28   28   32 1,    25088, 0x0x33b2ae0(0x0x33b2ae0, 0x(nil))) k(3 3   16, 4992) pad(1 1) pool(1 1, 1 1)] P[ 12] C[ 16]
 14 TP [(  28   28   64 1,    50176, 0x0x33a1b70(0x0x33a1b70, 0x(nil)) ->   28   28   64 1,   200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[  9] C[ 18, 20, 21, 23]
 15 TP [(  28   28  128 1,   100352, 0x0x33af180(0x0x33af180, 0x(nil)) ->   28   28  128 1,   200704, 0x0x8359fb0(0x0x33b9fb0, 0x0xc400)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 11] C[ 18, 20, 21, 23]
 16 TP [(  28   28   32 1,    25088, 0x0x33b2ae0(0x0x33b2ae0, 0x(nil)) ->   28   28   32 1,   200704, 0x0x12299fb0(0x0x33b9fb0, 0x0x24c00)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 13] C[ 18, 20, 21, 23]
 17 TP [(  28   28   32 1,    25088, 0x0x33b6550(0x0x33b6550, 0x(nil)) ->   28   28   32 1,   200704, 0x0x14a69fb0(0x0x33b9fb0, 0x0x2ae00)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[  8] C[ 18, 20, 21, 23]
 18 TP [(  28   28  256 1,   200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) ->   28   28  256 1,   200704, 0x0x33d6130(0x0x33d6130, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 14, 15, 16, 17] C[ 19]
 19 NN [(  28   28  256 1,   200704, 0x0x33d6130(0x0x33d6130, 0x(nil)) ->   28   28   64 1,    50176, 0x0x33e01a0(0x0x33e01a0, 0x(nil))) k(1 1  256, 17536) pad(0 0) pool(1 1, 1 1)] P[ 18] C[ 28]
 20 NN [(  28   28  256 1,   200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) ->   28   28  128 1,   100352, 0x0x33cb160(0x0x33cb160, 0x(nil))) k(1 1  256, 34944) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 25]
 21 NN [(  28   28  256 1,   200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) ->   28   28  128 1,   100352, 0x0x33cebf0(0x0x33cebf0, 0x(nil))) k(1 1  256, 34944) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 22]
 22 NN [(  28   28  128 1,   100352, 0x0x33cebf0(0x0x33cebf0, 0x(nil)) ->   28   28  192 1,   150528, 0x0x33d8c60(0x0x33d8c60, 0x(nil))) k(3 3  128, 233088) pad(1 1) pool(1 1, 1 1)] P[ 21] C[ 26]
 23 NN [(  28   28  256 1,   200704, 0x0x33b9fb0(0x0x33b9fb0, 0x(nil)) ->   28   28   32 1,    25088, 0x0x33d2680(0x0x33d2680, 0x(nil))) k(1 1  256, 8832) pad(0 0) pool(1 1, 1 1)] P[ 14, 15, 16, 17] C[ 24]
 24 NN [(  28   28   32 1,    25088, 0x0x33d2680(0x0x33d2680, 0x(nil)) ->   28   28   96 1,    75264, 0x0x33dc6f0(0x0x33dc6f0, 0x(nil))) k(3 3   32, 29440) pad(1 1) pool(1 1, 1 1)] P[ 23] C[ 27]
 25 TP [(  28   28  128 1,   100352, 0x0x33cb160(0x0x33cb160, 0x(nil)) ->   28   28  128 1,   376320, 0x0x33e3c30(0x0x33e3c30, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 20] C[ 29]
 26 TP [(  28   28  192 1,   150528, 0x0x33d8c60(0x0x33d8c60, 0x(nil)) ->   28   28  192 1,   376320, 0x0xd323c30(0x0x33e3c30, 0x0x18800)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 22] C[ 29]
 27 TP [(  28   28   96 1,    75264, 0x0x33dc6f0(0x0x33dc6f0, 0x(nil)) ->   28   28   96 1,   376320, 0x0x1c203c30(0x0x33e3c30, 0x0x3d400)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 24] C[ 29]
 28 TP [(  28   28   64 1,    50176, 0x0x33e01a0(0x0x33e01a0, 0x(nil)) ->   28   28   64 1,   376320, 0x0x23973c30(0x0x33e3c30, 0x0x4fa00)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 19] C[ 29]
 29 TP [(  28   28  480 1,   376320, 0x0x33e3c30(0x0x33e3c30, 0x(nil)) ->   14   14  480 1,    94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(3 3, 2 2)] P[ 25, 26, 27, 28] C[ 30, 32, 33, 35]
 30 TP [(  14   14  480 1,    94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) ->   14   14  480 1,    94080, 0x0x3402b20(0x0x3402b20, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 29] C[ 31]
 31 NN [(  14   14  480 1,    94080, 0x0x3402b20(0x0x3402b20, 0x(nil)) ->   14   14   64 1,    12544, 0x0x340d850(0x0x340d850, 0x(nil))) k(1 1  480, 32640) pad(0 0) pool(1 1, 1 1)] P[ 30] C[ 40]
 32 NN [(  14   14  480 1,    94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) ->   14   14  192 1,    37632, 0x0x33f7b10(0x0x33f7b10, 0x(nil))) k(1 1  480, 97664) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 37]
 33 NN [(  14   14  480 1,    94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) ->   14   14   96 1,    18816, 0x0x33fb5c0(0x0x33fb5c0, 0x(nil))) k(1 1  480, 48896) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 34]
 34 NN [(  14   14   96 1,    18816, 0x0x33fb5c0(0x0x33fb5c0, 0x(nil)) ->   14   14  208 1,    40768, 0x0x3405670(0x0x3405670, 0x(nil))) k(3 3   96, 189696) pad(1 1) pool(1 1, 1 1)] P[ 33] C[ 38]
 35 NN [(  14   14  480 1,    94080, 0x0x33f4de0(0x0x33f4de0, 0x(nil)) ->   14   14   16 1,     3136, 0x0x33ff070(0x0x33ff070, 0x(nil))) k(1 1  480, 8192) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 36]
 36 NN [(  14   14   16 1,     3136, 0x0x33ff070(0x0x33ff070, 0x(nil)) ->   14   14   48 1,     9408, 0x0x34098f0(0x0x34098f0, 0x(nil))) k(3 3   16, 7552) pad(1 1) pool(1 1, 1 1)] P[ 35] C[ 39]
 37 TP [(  14   14  192 1,    37632, 0x0x33f7b10(0x0x33f7b10, 0x(nil)) ->   14   14  192 1,   100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 32] C[ 41, 43, 44, 46]
 38 TP [(  14   14  208 1,    40768, 0x0x3405670(0x0x3405670, 0x(nil)) ->   14   14  208 1,   100352, 0x0x6fc9ec0(0x0x3411ec0, 0x0x9300)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 34] C[ 41, 43, 44, 46]
 39 TP [(  14   14   48 1,     9408, 0x0x34098f0(0x0x34098f0, 0x(nil)) ->   14   14   48 1,   100352, 0x0xb07bec0(0x0x3411ec0, 0x0x13240)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 36] C[ 41, 43, 44, 46]
 40 TP [(  14   14   64 1,    12544, 0x0x340d850(0x0x340d850, 0x(nil)) ->   14   14   64 1,   100352, 0x0xbf69ec0(0x0x3411ec0, 0x0x15700)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 31] C[ 41, 43, 44, 46]
 41 TP [(  14   14  512 1,   100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) ->   14   14  512 1,   100352, 0x0x342f090(0x0x342f090, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 37, 38, 39, 40] C[ 42]
 42 NN [(  14   14  512 1,   100352, 0x0x342f090(0x0x342f090, 0x(nil)) ->   14   14   64 1,    12544, 0x0x343a0e0(0x0x343a0e0, 0x(nil))) k(1 1  512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 41] C[ 51]
 43 NN [(  14   14  512 1,   100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) ->   14   14  160 1,    31360, 0x0x3423090(0x0x3423090, 0x(nil))) k(1 1  512, 86784) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 48]
 44 NN [(  14   14  512 1,   100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) ->   14   14  112 1,    21952, 0x0x3427220(0x0x3427220, 0x(nil))) k(1 1  512, 60800) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 45]
 45 NN [(  14   14  112 1,    21952, 0x0x3427220(0x0x3427220, 0x(nil)) ->   14   14  224 1,    43904, 0x0x3431bd0(0x0x3431bd0, 0x(nil))) k(3 3  112, 238080) pad(1 1) pool(1 1, 1 1)] P[ 44] C[ 49]
 46 NN [(  14   14  512 1,   100352, 0x0x3411ec0(0x0x3411ec0, 0x(nil)) ->   14   14   24 1,     4704, 0x0x342b180(0x0x342b180, 0x(nil))) k(1 1  512, 13056) pad(0 0) pool(1 1, 1 1)] P[ 37, 38, 39, 40] C[ 47]
 47 NN [(  14   14   24 1,     4704, 0x0x342b180(0x0x342b180, 0x(nil)) ->   14   14   64 1,    12544, 0x0x3436150(0x0x3436150, 0x(nil))) k(3 3   24, 14848) pad(1 1) pool(1 1, 1 1)] P[ 46] C[ 50]
 48 TP [(  14   14  160 1,    31360, 0x0x3423090(0x0x3423090, 0x(nil)) ->   14   14  160 1,   100352, 0x0x343e020(0x0x343e020, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 43] C[ 52, 54, 55, 57]
 49 TP [(  14   14  224 1,    43904, 0x0x3431bd0(0x0x3431bd0, 0x(nil)) ->   14   14  224 1,   100352, 0x0x6602020(0x0x343e020, 0x0x7a80)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 45] C[ 52, 54, 55, 57]
 50 TP [(  14   14   64 1,    12544, 0x0x3436150(0x0x3436150, 0x(nil)) ->   14   14   64 1,   100352, 0x0xabae020(0x0x343e020, 0x0x12600)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 47] C[ 52, 54, 55, 57]
 51 TP [(  14   14   64 1,    12544, 0x0x343a0e0(0x0x343a0e0, 0x(nil)) ->   14   14   64 1,   100352, 0x0xbf96020(0x0x343e020, 0x0x15700)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 42] C[ 52, 54, 55, 57]
 52 TP [(  14   14  512 1,   100352, 0x0x343e020(0x0x343e020, 0x(nil)) ->   14   14  512 1,   100352, 0x0x353d330(0x0x353d330, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 48, 49, 50, 51] C[ 53]
 53 NN [(  14   14  512 1,   100352, 0x0x353d330(0x0x353d330, 0x(nil)) ->   14   14   64 1,    12544, 0x0x353fdc0(0x0x353fdc0, 0x(nil))) k(1 1  512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 52] C[ 62]
 54 NN [(  14   14  512 1,   100352, 0x0x343e020(0x0x343e020, 0x(nil)) ->   14   14  128 1,    25088, 0x0x344f1d0(0x0x344f1d0, 0x(nil))) k(1 1  512, 69376) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 59]
 55 NN [(  14   14  512 1,   100352, 0x0x343e020(0x0x343e020, 0x(nil)) ->   14   14  128 1,    25088, 0x0x3453360(0x0x3453360, 0x(nil))) k(1 1  512, 69376) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 56]
 56 NN [(  14   14  128 1,    25088, 0x0x3453360(0x0x3453360, 0x(nil)) ->   14   14  256 1,    50176, 0x0x3539400(0x0x3539400, 0x(nil))) k(3 3  128, 310784) pad(1 1) pool(1 1, 1 1)] P[ 55] C[ 60]
 57 NN [(  14   14  512 1,   100352, 0x0x343e020(0x0x343e020, 0x(nil)) ->   14   14   24 1,     4704, 0x0x35354a0(0x0x35354a0, 0x(nil))) k(1 1  512, 13056) pad(0 0) pool(1 1, 1 1)] P[ 48, 49, 50, 51] C[ 58]
 58 NN [(  14   14   24 1,     4704, 0x0x35354a0(0x0x35354a0, 0x(nil)) ->   14   14   64 1,    12544, 0x0x35443d0(0x0x35443d0, 0x(nil))) k(3 3   24, 14848) pad(1 1) pool(1 1, 1 1)] P[ 57] C[ 61]
 59 TP [(  14   14  128 1,    25088, 0x0x344f1d0(0x0x344f1d0, 0x(nil)) ->   14   14  128 1,   100352, 0x0x3548340(0x0x3548340, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 54] C[ 63, 65, 66, 68]
 60 TP [(  14   14  256 1,    50176, 0x0x3539400(0x0x3539400, 0x(nil)) ->   14   14  256 1,   100352, 0x0x5d18340(0x0x3548340, 0x0x6200)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 56] C[ 63, 65, 66, 68]
 61 TP [(  14   14   64 1,    12544, 0x0x35443d0(0x0x35443d0, 0x(nil)) ->   14   14   64 1,   100352, 0x0xacb8340(0x0x3548340, 0x0x12600)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 58] C[ 63, 65, 66, 68]
 62 TP [(  14   14   64 1,    12544, 0x0x353fdc0(0x0x353fdc0, 0x(nil)) ->   14   14   64 1,   100352, 0x0xc0a0340(0x0x3548340, 0x0x15700)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 53] C[ 63, 65, 66, 68]
 63 TP [(  14   14  512 1,   100352, 0x0x3548340(0x0x3548340, 0x(nil)) ->   14   14  512 1,   100352, 0x0x3559410(0x0x3559410, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 59, 60, 61, 62] C[ 64]
 64 NN [(  14   14  512 1,   100352, 0x0x3559410(0x0x3559410, 0x(nil)) ->   14   14   64 1,    12544, 0x0x3568670(0x0x3568670, 0x(nil))) k(1 1  512, 34688) pad(0 0) pool(1 1, 1 1)] P[ 63] C[ 73]
 65 NN [(  14   14  512 1,   100352, 0x0x3548340(0x0x3548340, 0x(nil)) ->   14   14  112 1,    21952, 0x0x355c190(0x0x355c190, 0x(nil))) k(1 1  512, 60800) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 70]
 66 NN [(  14   14  512 1,   100352, 0x0x3548340(0x0x3548340, 0x(nil)) ->   14   14  144 1,    28224, 0x0x35607a0(0x0x35607a0, 0x(nil))) k(1 1  512, 78080) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 67]
 67 NN [(  14   14  144 1,    28224, 0x0x35607a0(0x0x35607a0, 0x(nil)) ->   14   14  288 1,    56448, 0x0x356c5d0(0x0x356c5d0, 0x(nil))) k(3 3  144, 393216) pad(1 1) pool(1 1, 1 1)] P[ 66] C[ 71]
 68 NN [(  14   14  512 1,   100352, 0x0x3548340(0x0x3548340, 0x(nil)) ->   14   14   32 1,     6272, 0x0x3564710(0x0x3564710, 0x(nil))) k(1 1  512, 17408) pad(0 0) pool(1 1, 1 1)] P[ 59, 60, 61, 62] C[ 69]
 69 NN [(  14   14   32 1,     6272, 0x0x3564710(0x0x3564710, 0x(nil)) ->   14   14   64 1,    12544, 0x0x3570500(0x0x3570500, 0x(nil))) k(3 3   32, 19712) pad(1 1) pool(1 1, 1 1)] P[ 68] C[ 72]
 70 TP [(  14   14  112 1,    21952, 0x0x355c190(0x0x355c190, 0x(nil)) ->   14   14  112 1,   103488, 0x0x3574460(0x0x3574460, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 65] C[ 74, 76, 77, 79]
 71 TP [(  14   14  288 1,    56448, 0x0x356c5d0(0x0x356c5d0, 0x(nil)) ->   14   14  288 1,   103488, 0x0x584a460(0x0x3574460, 0x0x55c0)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 67] C[ 74, 76, 77, 79]
 72 TP [(  14   14   64 1,    12544, 0x0x3570500(0x0x3570500, 0x(nil)) ->   14   14   64 1,   103488, 0x0xb1de460(0x0x3574460, 0x0x13240)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 69] C[ 74, 76, 77, 79]
 73 TP [(  14   14   64 1,    12544, 0x0x3568670(0x0x3568670, 0x(nil)) ->   14   14   64 1,   103488, 0x0xc5c6460(0x0x3574460, 0x0x16340)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 64] C[ 74, 76, 77, 79]
 74 TP [(  14   14  528 1,   103488, 0x0x3574460(0x0x3574460, 0x(nil)) ->   14   14  528 1,   103488, 0x0x3585530(0x0x3585530, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 70, 71, 72, 73] C[ 75]
 75 NN [(  14   14  528 1,   103488, 0x0x3585530(0x0x3585530, 0x(nil)) ->   14   14  128 1,    25088, 0x0x35947b0(0x0x35947b0, 0x(nil))) k(1 1  528, 71552) pad(0 0) pool(1 1, 1 1)] P[ 74] C[ 84]
 76 NN [(  14   14  528 1,   103488, 0x0x3574460(0x0x3574460, 0x(nil)) ->   14   14  256 1,    50176, 0x0x35882d0(0x0x35882d0, 0x(nil))) k(1 1  528, 143104) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 81]
 77 NN [(  14   14  528 1,   103488, 0x0x3574460(0x0x3574460, 0x(nil)) ->   14   14  160 1,    31360, 0x0x358c8e0(0x0x358c8e0, 0x(nil))) k(1 1  528, 89472) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 78]
 78 NN [(  14   14  160 1,    31360, 0x0x358c8e0(0x0x358c8e0, 0x(nil)) ->   14   14  320 1,    62720, 0x0x3598710(0x0x3598710, 0x(nil))) k(3 3  160, 485248) pad(1 1) pool(1 1, 1 1)] P[ 77] C[ 82]
 79 NN [(  14   14  528 1,   103488, 0x0x3574460(0x0x3574460, 0x(nil)) ->   14   14   32 1,     6272, 0x0x3590850(0x0x3590850, 0x(nil))) k(1 1  528, 17920) pad(0 0) pool(1 1, 1 1)] P[ 70, 71, 72, 73] C[ 80]
 80 NN [(  14   14   32 1,     6272, 0x0x3590850(0x0x3590850, 0x(nil)) ->   14   14  128 1,    25088, 0x0x359c640(0x0x359c640, 0x(nil))) k(3 3   32, 39296) pad(1 1) pool(1 1, 1 1)] P[ 79] C[ 83]
 81 TP [(  14   14  256 1,    50176, 0x0x35882d0(0x0x35882d0, 0x(nil)) ->   14   14  256 1,   163072, 0x0x35a05a0(0x0x35a05a0, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 76] C[ 85]
 82 TP [(  14   14  320 1,    62720, 0x0x3598710(0x0x3598710, 0x(nil)) ->   14   14  320 1,   163072, 0x0x85405a0(0x0x35a05a0, 0x0xc400)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 78] C[ 85]
 83 TP [(  14   14  128 1,    25088, 0x0x359c640(0x0x359c640, 0x(nil)) ->   14   14  128 1,   163072, 0x0xe8c85a0(0x0x35a05a0, 0x0x1b900)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 80] C[ 85]
 84 TP [(  14   14  128 1,    25088, 0x0x35947b0(0x0x35947b0, 0x(nil)) ->   14   14  128 1,   163072, 0x0x110985a0(0x0x35a05a0, 0x0x21b00)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 75] C[ 85]
 85 TP [(  14   14  832 1,   163072, 0x0x35a05a0(0x0x35a05a0, 0x(nil)) ->    7    7  832 1,    40768, 0x0x35b1670(0x0x35b1670, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(2 2, 2 2)] P[ 81, 82, 83, 84] C[ 86, 88, 89, 91]
 86 TP [(   7    7  832 1,    40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) ->    7    7  832 1,    40768, 0x0x35b4410(0x0x35b4410, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 85] C[ 87]
 87 NN [(   7    7  832 1,    40768, 0x0x35b4410(0x0x35b4410, 0x(nil)) ->    7    7  128 1,     6272, 0x0x35c3a20(0x0x35c3a20, 0x(nil))) k(1 1  832, 112384) pad(0 0) pool(1 1, 1 1)] P[ 86] C[ 96]
 88 NN [(   7    7  832 1,    40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) ->    7    7  256 1,    12544, 0x0x35b7540(0x0x35b7540, 0x(nil))) k(1 1  832, 224768) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 93]
 89 NN [(   7    7  832 1,    40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) ->    7    7  160 1,     7840, 0x0x35bbb50(0x0x35bbb50, 0x(nil))) k(1 1  832, 140544) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 90]
 90 NN [(   7    7  160 1,     7840, 0x0x35bbb50(0x0x35bbb50, 0x(nil)) ->    7    7  320 1,    15680, 0x0x35c7980(0x0x35c7980, 0x(nil))) k(3 3  160, 485248) pad(1 1) pool(1 1, 1 1)] P[ 89] C[ 94]
 91 NN [(   7    7  832 1,    40768, 0x0x35b1670(0x0x35b1670, 0x(nil)) ->    7    7   32 1,     1568, 0x0x35bfac0(0x0x35bfac0, 0x(nil))) k(1 1  832, 28160) pad(0 0) pool(1 1, 1 1)] P[ 85] C[ 92]
 92 NN [(   7    7   32 1,     1568, 0x0x35bfac0(0x0x35bfac0, 0x(nil)) ->    7    7  128 1,     6272, 0x0x35cb8b0(0x0x35cb8b0, 0x(nil))) k(3 3   32, 39296) pad(1 1) pool(1 1, 1 1)] P[ 91] C[ 95]
 93 TP [(   7    7  256 1,    12544, 0x0x35b7540(0x0x35b7540, 0x(nil)) ->    7    7  256 1,    40768, 0x0x35cf810(0x0x35cf810, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 88] C[ 97, 99,100,102]
 94 TP [(   7    7  320 1,    15680, 0x0x35c7980(0x0x35c7980, 0x(nil)) ->    7    7  320 1,    40768, 0x0x49b7810(0x0x35cf810, 0x0x3100)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 90] C[ 97, 99,100,102]
 95 TP [(   7    7  128 1,     6272, 0x0x35cb8b0(0x0x35cb8b0, 0x(nil)) ->    7    7  128 1,    40768, 0x0x6299810(0x0x35cf810, 0x0x6e40)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 92] C[ 97, 99,100,102]
 96 TP [(   7    7  128 1,     6272, 0x0x35c3a20(0x0x35c3a20, 0x(nil)) ->    7    7  128 1,    40768, 0x0x6c8d810(0x0x35cf810, 0x0x86c0)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 87] C[ 97, 99,100,102]
 97 TP [(   7    7  832 1,    40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) ->    7    7  832 1,    40768, 0x0x35e08e0(0x0x35e08e0, 0x(nil))) k(0 0    0, 0) pad(1 1) pool(3 3, 1 1)] P[ 93, 94, 95, 96] C[ 98]
 98 NN [(   7    7  832 1,    40768, 0x0x35e08e0(0x0x35e08e0, 0x(nil)) ->    7    7  128 1,     6272, 0x0x35efb60(0x0x35efb60, 0x(nil))) k(1 1  832, 112384) pad(0 0) pool(1 1, 1 1)] P[ 97] C[107]
 99 NN [(   7    7  832 1,    40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) ->    7    7  384 1,    18816, 0x0x35e3680(0x0x35e3680, 0x(nil))) k(1 1  832, 337152) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[104]
100 NN [(   7    7  832 1,    40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) ->    7    7  192 1,     9408, 0x0x35e7c90(0x0x35e7c90, 0x(nil))) k(1 1  832, 168576) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[101]
101 NN [(   7    7  192 1,     9408, 0x0x35e7c90(0x0x35e7c90, 0x(nil)) ->    7    7  384 1,    18816, 0x0x35f3ac0(0x0x35f3ac0, 0x(nil))) k(3 3  192, 698368) pad(1 1) pool(1 1, 1 1)] P[100] C[105]
102 NN [(   7    7  832 1,    40768, 0x0x35cf810(0x0x35cf810, 0x(nil)) ->    7    7   48 1,     2352, 0x0x35ebc00(0x0x35ebc00, 0x(nil))) k(1 1  832, 42240) pad(0 0) pool(1 1, 1 1)] P[ 93, 94, 95, 96] C[103]
103 NN [(   7    7   48 1,     2352, 0x0x35ebc00(0x0x35ebc00, 0x(nil)) ->    7    7  128 1,     6272, 0x0x35f79f0(0x0x35f79f0, 0x(nil))) k(3 3   48, 58624) pad(1 1) pool(1 1, 1 1)] P[102] C[106]
104 TP [(   7    7  384 1,    18816, 0x0x35e3680(0x0x35e3680, 0x(nil)) ->    7    7  384 1,    50176, 0x0x35fb950(0x0x35fb950, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 99] C[108]
105 TP [(   7    7  384 1,    18816, 0x0x35f3ac0(0x0x35f3ac0, 0x(nil)) ->    7    7  384 1,    50176, 0x0x53d7950(0x0x35fb950, 0x0x4980)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[101] C[108]
106 TP [(   7    7  128 1,     6272, 0x0x35f79f0(0x0x35f79f0, 0x(nil)) ->    7    7  128 1,    50176, 0x0x71b3950(0x0x35fb950, 0x0x9300)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[103] C[108]
107 TP [(   7    7  128 1,     6272, 0x0x35efb60(0x0x35efb60, 0x(nil)) ->    7    7  128 1,    50176, 0x0x7ba7950(0x0x35fb950, 0x0xab80)) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 98] C[108]
108 SH [(   7    7 1024 1,    50176, 0x0x35fb950(0x0x35fb950, 0x(nil)) ->    1    1 1024 1,     1024, 0x0x360ca20(0x0x360ca20, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[104,105,106,107] C[109]
109 TP [(1024    1    1 1,     1024, 0x0x360ca20(0x0x360ca20, 0x(nil)) -> 1001    1    1 1,     1001, 0x0x338fa70(0x0x338fa70, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(1 1, 1 1)] P[108] C[110]
110 SH [(1001    1    1 1,     1001, 0x0x338fa70(0x0x338fa70, 0x(nil)) -> 1001    1    1 1,     1001, 0x0x338ee30(0x0x338ee30, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[109]

 id IN [ x  y  w   h ]   OUT  [ x  y  w  h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out)

 id | opid IN [ x  y  w   h ]   OUT  [ x  y  w  h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out)
  0 |   0 TP DD 0x0 [   0    0        3      224] -> DD 0x0 [   0    0      224      224] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  1 |   1 TP DD 0x0 [   0    0      224      224] -> DD 0x0 [   0    0      115      115] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  2 |   2 NN DD 0x0 [   0    0      115      115] -> DD 0x0 [   0    0      112      112] ( 56,   8,   8) (    7936,    11776, 100.00%, 87.62%, DD) (       0,        0)
  3 |   3 TP DD 0x0 [   0    0      112      112] -> DD 0x0 [   0    0       56       56] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  4 |   4 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   8) (   28672,     4608, 100.00%, 100.00%, DD) (       0,        0)
  5 |   5 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   8) (   37888,   112640, 100.00%, 96.28%, DD) (       0,        0)
  6 |   6 TP DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  7 |   7 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  8 |   8 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (   76800,     6656, 100.00%, 100.00%, DD) (       0,        0)
  9 |   9 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   76800,    13312, 100.00%, 100.97%, DD) (       0,        0)
 10 |  10 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  10,  12) (   55296,    19456, 100.00%, 98.06%, DD) (       0,        0)
 11 |  11 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  16,   8) (   52224,   112128, 100.00%, 96.05%, DD) (       0,        0)
 12 |  12 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   2) (   76800,     3584, 100.00%, 107.69%, DD) (       0,        0)
 13 |  13 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (    7680,     5120, 100.00%, 102.56%, DD) (       0,        0)
 14 |  14 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 15 |  15 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 16 |  16 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 17 |  17 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 18 |  18 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 19 |  19 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (  102400,    17408, 100.00%, 99.27%, DD) (       0,        0)
 20 |  20 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (  102400,    34304, 100.00%, 98.17%, DD) (       0,        0)
 21 |  21 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (  102400,    34304, 100.00%, 98.17%, DD) (       0,        0)
 22 |  22 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  20,   6) (   86016,   223232, 100.00%, 95.77%, DD) (       0,        0)
 23 |  23 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (  102400,     8704, 100.00%, 98.55%, DD) (       0,        0)
 24 |  24 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  10,  12) (   11776,    28672, 100.00%, 97.39%, DD) (       0,        0)
 25 |  25 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 26 |  26 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 27 |  27 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 28 |  28 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 29 |  29 TP DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 30 |  30 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 31 |  31 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (   99840,    31744, 100.00%, 97.25%, DD) (       0,        0)
 32 |  32 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  12) (   99840,    94208, 100.00%, 96.46%, DD) (       0,        0)
 33 |  33 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  12) (   99840,    47104, 100.00%, 96.34%, DD) (       0,        0)
 34 |  34 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  13) (   24576,   181760, 100.00%, 95.82%, DD) (       0,        0)
 35 |  35 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   2) (   99840,     8192, 100.00%, 100.00%, DD) (       0,        0)
 36 |  36 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   6) (    4096,     7680, 100.00%, 101.69%, DD) (       0,        0)
 37 |  37 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 38 |  38 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 39 |  39 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 40 |  40 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 41 |  41 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 42 |  42 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (  106496,    33792, 100.00%, 97.42%, DD) (       0,        0)
 43 |  43 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  10) (  106496,    83456, 100.00%, 96.17%, DD) (       0,        0)
 44 |  44 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  14) (  106496,    58368, 100.00%, 96.00%, DD) (       0,        0)
 45 |  45 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  14) (   28672,   227840, 100.00%, 95.70%, DD) (       0,        0)
 46 |  46 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   3) (  106496,    12800, 100.00%, 98.04%, DD) (       0,        0)
 47 |  47 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (    6144,    14848, 100.00%, 100.00%, DD) (       0,        0)
 48 |  48 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 49 |  49 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 50 |  50 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 51 |  51 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 52 |  52 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 53 |  53 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (  106496,    33792, 100.00%, 97.42%, DD) (       0,        0)
 54 |  54 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (  106496,    67072, 100.00%, 96.68%, DD) (       0,        0)
 55 |  55 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (  106496,    67072, 100.00%, 96.68%, DD) (       0,        0)
 56 |  56 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   32768,   297472, 100.00%, 95.72%, DD) (       0,        0)
 57 |  57 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   3) (  106496,    12800, 100.00%, 98.04%, DD) (       0,        0)
 58 |  58 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (    6144,    14848, 100.00%, 100.00%, DD) (       0,        0)
 59 |  59 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 60 |  60 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 61 |  61 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 62 |  62 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 63 |  63 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 64 |  64 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (  106496,    33792, 100.00%, 97.42%, DD) (       0,        0)
 65 |  65 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  14) (  106496,    58368, 100.00%, 96.00%, DD) (       0,        0)
 66 |  66 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (  106496,    75264, 100.00%, 96.39%, DD) (       0,        0)
 67 |  67 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (   36864,   375808, 100.00%, 95.57%, DD) (       0,        0)
 68 |  68 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   4) (  106496,    16896, 100.00%, 97.06%, DD) (       0,        0)
 69 |  69 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (    8192,    19456, 100.00%, 98.70%, DD) (       0,        0)
 70 |  70 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 71 |  71 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 72 |  72 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 73 |  73 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 74 |  74 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 75 |  75 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (  109824,    69120, 100.00%, 96.60%, DD) (       0,        0)
 76 |  76 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (  109824,   137728, 100.00%, 96.24%, DD) (       0,        0)
 77 |  77 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  10) (  109824,    86016, 100.00%, 96.14%, DD) (       0,        0)
 78 |  78 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  14) (   40960,   463872, 100.00%, 95.59%, DD) (       0,        0)
 79 |  79 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   4) (  109824,    17408, 100.00%, 97.14%, DD) (       0,        0)
 80 |  80 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (    8192,    38400, 100.00%, 97.72%, DD) (       0,        0)
 81 |  81 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 82 |  82 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 83 |  83 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 84 |  84 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 85 |  85 TP DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 86 |  86 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 87 |  87 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   4,  16) (   26624,   108032, 100.00%, 96.13%, DD) (       0,        0)
 88 |  88 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  32) (   53248,   215552, 100.00%, 95.90%, DD) (       0,        0)
 89 |  89 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   53248,   134656, 100.00%, 95.81%, DD) (       0,        0)
 90 |  90 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   15360,   463872, 100.00%, 95.59%, DD) (       0,        0)
 91 |  91 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   4,   4) (   26624,    27136, 100.00%, 96.36%, DD) (       0,        0)
 92 |  92 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  16) (    3072,    38400, 100.00%, 97.72%, DD) (       0,        0)
 93 |  93 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 94 |  94 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 95 |  95 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 96 |  96 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 97 |  97 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 98 |  98 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   4,  16) (   26624,   108032, 100.00%, 96.13%, DD) (       0,        0)
 99 |  99 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  24) (   53248,   323072, 100.00%, 95.82%, DD) (       0,        0)
100 | 100 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  24) (   53248,   161792, 100.00%, 95.98%, DD) (       0,        0)
101 | 101 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  24) (   18432,   503808, 75.52%, 95.53%, DD) (       0,        0)
102 | 102 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   4,   6) (   26624,    40448, 100.00%, 95.76%, DD) (       0,        0)
103 | 103 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  16) (    4608,    56832, 100.00%, 96.94%, DD) (       0,        0)
104 | 104 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
105 | 105 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
106 | 106 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
107 | 107 TP DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
108 | 108 SH DD 0x0 [   0    0        0        0] -> DD 0x0 [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
109 | 109 TP DD 0x0 [   0    0     1024        1] -> DD 0x0 [   0    0     1001        1] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
110 | 110 SH DD 0x0 [   0    0        0        0] -> DD 0x0 [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)

PreLoadWeightBiases = 1048320  100.000000%
---------------------------End VerifyTiling -------------------------
KernelStreamSize: 0x66b540, statesSize: 0x4bc0, shShareMemSize: 0x0, shIntrSize: 0x700, shParaSize: 0x440, swParaSize: 0x0, lcdTensorSize: 0x0, shaderStatesSize: 0x9c0, tensorStatic: 0x0
NBG: operationSize: 0x1d50, nnSize: 0x1d00, tpSize: 0x9a80, shSize: 0x10, swSize: 0x0, layerParamSize: 0x0, lcdtSize: 0xa78, patchSize: 0xb478, icdtSize: 0xe8 hwInitOpSize: 0x24, lcdSize 0x670d40
NBG: entranceSize: 0x208, nbIOSize: 0xe8, layeSize: 0x18a4, sectionsSize: 0x194dc, inputoutput size: 0x24fe9, InitCommands size: 0x1104
NBG: lcdSize: 0x670d40, headerSize : 0x1b070
Calculate NBG size : 6868916 bytes
generate NBG into memory start.
vxoBinaryGraph_SaveBinaryEntrance[20461]: collect input count=0, output count=0
vxoBinaryGraph_SaveBinaryEntrance[20531]: total operation count=111
generate NBG, device count=1, core count per-device: 1,
vxoBinaryGraph_RefineInputOutput:11143 input table address: 0x183c500
vxoBinaryGraph_RefineInputOutput:11149 output table address: 0x1ec0100
vxoBinaryGraph_SaveBinaryEntranceExt[19524]: graph->inputCount=1, graph->outputCount=1, refine inputCount=1, outputCount=1
NBG network name field : dummy_network_name
vxoBinaryGraph_SaveBinaryEntranceExt[20127]: header input count=1, output count=1
generate NBG, save initialize commands
vxoBinaryGraph_ReSaveInputAndPatchTable[17202]: re-save operation count=265
Generate NBG in memory Actual NBG size : 6862080 bytes
generate NBG into memory successfully.
Releasing object array 0x33ba640
Releasing object array 0x33e42c0
Releasing object array 0x3412550
Releasing object array 0x343e6b0
Releasing object array 0x35489d0
Releasing object array 0x3574af0
Releasing object array 0x35a0c30
Releasing object array 0x35cfea0
Releasing object array 0x35fbfe0
VsiNpuModule::GetFunction: get_symbol
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early

source_filename = "empty_module"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-linux-gnu"

VsiNpuModule::SaveToBinary
SaveToBinary: nbg size = 6862080
SaveToBinary: input size = 1
SaveToBinary: output size = 1
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule::SaveToBinary2
['aarch64-linux-gnu-g++', '-shared', '-fPIC', '-o', 'lib.so', '/tmp/tmp0u18x4nw/lib0.o', '/tmp/tmp0u18x4nw/devc.o', '-L/home/niuniu/gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu/bin'] ============
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

Khadas VIM3 pro

python3 -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090
INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork```
INFO:RPCServer:bind to 0.0.0.0:9090
INFO:RPCServer:connection from ('192.168.137.177', 41982)
VsiNpuModule::LoadFromBinary
LoadFromBinary: nbg size = 6862080
LoadFromBinary: input size = 1
LoadFromBinary: output size = 1
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
INFO:RPCServer:load_module /tmp/tmp2vjrwe64/lib.so
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0
[     1] PLS isn't existed
Process Graph: 61439 ms or 61439858 us
VsiNpuModule::GetFunction: size: 2
INFO:RPCServer:Finish serving ('192.168.137.177', 41982)
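
For reference, the host side drives the board roughly like the sketch below (the board IP, library name and input key are placeholders for illustration, not the exact values from my script):

```python
import numpy as np
import tvm
from tvm import rpc
from tvm.contrib import graph_executor

# Connect to the board running `python3 -m tvm.exec.rpc_server --port=9090`.
# The IP address, library name and input key below are illustrative.
remote = rpc.connect("192.168.137.33", 9090)
remote.upload("lib.so")                  # cross-compiled library from the x86 host
rlib = remote.load_module("lib.so")

dev = remote.cpu(0)
module = graph_executor.GraphModule(rlib["default"](dev))

# Dummy uint8 NHWC input matching the model signature above.
data = np.random.randint(0, 256, size=(1, 224, 224, 3), dtype="uint8")
module.set_input("input", tvm.nd.array(data, dev))
module.run()
print(module.get_output(0).numpy())      # comes back as all zeros in the failing case
```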

@sunshinemyson
Contributor

Thanks for the report.

You cannot get results for the other tflite models because the graph hangs on the device side. Could you share an example model that fails to produce a result?

@fumao13579
Author

fumao13579 commented May 23, 2023

@sunshinemyson
https://drive.google.com/drive/folders/1IYTqiebzVf_M0oUAjzG7BJ__iiC1P-Qm?usp=share_link

When these three TFLite models are executed through TVM RPC, they produce empty output. Similarly, when I copy the model_lib.so generated by TVM's "lib.export_library(lib_path)" to the Khadas VIM3 Pro and run it directly with the TVM runtime, I still get empty output. However, when I compile these same three TFLite models locally with TVM on the Khadas VIM3 Pro, I obtain normal output. A sketch of the cross-compile path I am describing is shown below.
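
This is a minimal sketch of that cross-compile-and-export flow (file names, input name/shape and the target string are illustrative placeholders, and the vsi_npu annotation/partition step from the test script is omitted here):

```python
import tflite
import tvm
from tvm import relay
from tvm.contrib import cc

# Load the quantized tflite model and import it into Relay.
# File name and input shape/dtype below are illustrative.
model = tflite.Model.GetRootAsModel(open("inception_v1_224_quant.tflite", "rb").read(), 0)
mod, params = relay.frontend.from_tflite(
    model, shape_dict={"input": (1, 224, 224, 3)}, dtype_dict={"input": "uint8"}
)

# Build for the aarch64 board from the x86_64 host.
target = "llvm -mtriple=aarch64-linux-gnu"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# Link with the aarch64 cross compiler, then copy model_lib.so to the board
# (or upload it over RPC).
lib.export_library("model_lib.so", cc.cross_compiler("aarch64-linux-gnu-g++"))
```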

This is the output of local compilation on the Khadas VIM3 Pro, using the quantized InceptionNetV1 tflite model.

Details

khadas@Khadas:~/$ python3 test_vsi_tflite_model_all.py 
#[version = "0.0.5"]
def @main(%input: Tensor[(1, 224, 224, 3), uint8], %v_param_1: Tensor[(7, 7, 3, 64), uint8], %v_param_2: Tensor[(64), int32], %v_param_3: Tensor[(1, 1, 64, 64), uint8], %v_param_4: Tensor[(64), int32], %v_param_5: Tensor[(3, 3, 64, 192), uint8], %v_param_6: Tensor[(192), int32], %v_param_7: Tensor[(1, 1, 192, 64), uint8], %v_param_8: Tensor[(64), int32], %v_param_9: Tensor[(1, 1, 192, 96), uint8], %v_param_10: Tensor[(96), int32], %v_param_11: Tensor[(3, 3, 96, 128), uint8], %v_param_12: Tensor[(128), int32], %v_param_13: Tensor[(1, 1, 192, 16), uint8], %v_param_14: Tensor[(16), int32], %v_param_15: Tensor[(3, 3, 16, 32), uint8], %v_param_16: Tensor[(32), int32], %v_param_17: Tensor[(1, 1, 192, 32), uint8], %v_param_18: Tensor[(32), int32], %v_param_19: Tensor[(1, 1, 256, 128), uint8], %v_param_20: Tensor[(128), int32], %v_param_21: Tensor[(1, 1, 256, 128), uint8], %v_param_22: Tensor[(128), int32], %v_param_23: Tensor[(3, 3, 128, 192), uint8], %v_param_24: Tensor[(192), int32], %v_param_25: Tensor[(1, 1, 256, 32), uint8], %v_param_26: Tensor[(32), int32], %v_param_27: Tensor[(3, 3, 32, 96), uint8], %v_param_28: Tensor[(96), int32], %v_param_29: Tensor[(1, 1, 256, 64), uint8], %v_param_30: Tensor[(64), int32], %v_param_31: Tensor[(1, 1, 480, 192), uint8], %v_param_32: Tensor[(192), int32], %v_param_33: Tensor[(1, 1, 480, 96), uint8], %v_param_34: Tensor[(96), int32], %v_param_35: Tensor[(3, 3, 96, 208), uint8], %v_param_36: Tensor[(208), int32], %v_param_37: Tensor[(1, 1, 480, 16), uint8], %v_param_38: Tensor[(16), int32], %v_param_39: Tensor[(3, 3, 16, 48), uint8], %v_param_40: Tensor[(48), int32], %v_param_41: Tensor[(1, 1, 480, 64), uint8], %v_param_42: Tensor[(64), int32], %v_param_43: Tensor[(1, 1, 512, 160), uint8], %v_param_44: Tensor[(160), int32], %v_param_45: Tensor[(1, 1, 512, 112), uint8], %v_param_46: Tensor[(112), int32], %v_param_47: Tensor[(3, 3, 112, 224), uint8], %v_param_48: Tensor[(224), int32], %v_param_49: Tensor[(1, 1, 512, 24), uint8], %v_param_50: Tensor[(24), int32], %v_param_51: Tensor[(3, 3, 24, 64), uint8], %v_param_52: Tensor[(64), int32], %v_param_53: Tensor[(1, 1, 512, 64), uint8], %v_param_54: Tensor[(64), int32], %v_param_55: Tensor[(1, 1, 512, 128), uint8], %v_param_56: Tensor[(128), int32], %v_param_57: Tensor[(1, 1, 512, 128), uint8], %v_param_58: Tensor[(128), int32], %v_param_59: Tensor[(3, 3, 128, 256), uint8], %v_param_60: Tensor[(256), int32], %v_param_61: Tensor[(1, 1, 512, 24), uint8], %v_param_62: Tensor[(24), int32], %v_param_63: Tensor[(3, 3, 24, 64), uint8], %v_param_64: Tensor[(64), int32], %v_param_65: Tensor[(1, 1, 512, 64), uint8], %v_param_66: Tensor[(64), int32], %v_param_67: Tensor[(1, 1, 512, 112), uint8], %v_param_68: Tensor[(112), int32], %v_param_69: Tensor[(1, 1, 512, 144), uint8], %v_param_70: Tensor[(144), int32], %v_param_71: Tensor[(3, 3, 144, 288), uint8], %v_param_72: Tensor[(288), int32], %v_param_73: Tensor[(1, 1, 512, 32), uint8], %v_param_74: Tensor[(32), int32], %v_param_75: Tensor[(3, 3, 32, 64), uint8], %v_param_76: Tensor[(64), int32], %v_param_77: Tensor[(1, 1, 512, 64), uint8], %v_param_78: Tensor[(64), int32], %v_param_79: Tensor[(1, 1, 528, 256), uint8], %v_param_80: Tensor[(256), int32], %v_param_81: Tensor[(1, 1, 528, 160), uint8], %v_param_82: Tensor[(160), int32], %v_param_83: Tensor[(3, 3, 160, 320), uint8], %v_param_84: Tensor[(320), int32], %v_param_85: Tensor[(1, 1, 528, 32), uint8], %v_param_86: Tensor[(32), int32], %v_param_87: Tensor[(3, 3, 32, 128), uint8], %v_param_88: Tensor[(128), int32], 
%v_param_89: Tensor[(1, 1, 528, 128), uint8], %v_param_90: Tensor[(128), int32], %v_param_91: Tensor[(1, 1, 832, 256), uint8], %v_param_92: Tensor[(256), int32], %v_param_93: Tensor[(1, 1, 832, 160), uint8], %v_param_94: Tensor[(160), int32], %v_param_95: Tensor[(3, 3, 160, 320), uint8], %v_param_96: Tensor[(320), int32], %v_param_97: Tensor[(1, 1, 832, 32), uint8], %v_param_98: Tensor[(32), int32], %v_param_99: Tensor[(3, 3, 32, 128), uint8], %v_param_100: Tensor[(128), int32], %v_param_101: Tensor[(1, 1, 832, 128), uint8], %v_param_102: Tensor[(128), int32], %v_param_103: Tensor[(1, 1, 832, 384), uint8], %v_param_104: Tensor[(384), int32], %v_param_105: Tensor[(1, 1, 832, 192), uint8], %v_param_106: Tensor[(192), int32], %v_param_107: Tensor[(3, 3, 192, 384), uint8], %v_param_108: Tensor[(384), int32], %v_param_109: Tensor[(1, 1, 832, 48), uint8], %v_param_110: Tensor[(48), int32], %v_param_111: Tensor[(3, 3, 48, 128), uint8], %v_param_112: Tensor[(128), int32], %v_param_113: Tensor[(1, 1, 832, 128), uint8], %v_param_114: Tensor[(128), int32], %v_param_115: Tensor[(1, 1, 1024, 1001), uint8], %v_param_116: Tensor[(1001), int32]) {
  %0 = qnn.conv2d(%input, %v_param_1, 128, 141, 0.0078125f, 0.0243229f, strides=[2, 2], padding=[2, 2, 3, 3], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %1 = nn.bias_add(%0, %v_param_2, axis=3);
  %2 = qnn.requantize(%1, 0.000190023f, 0, 0.107703f, 0, axis=3, out_dtype="uint8");
  %3 = nn.max_pool2d(%2, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC");
  %4 = qnn.conv2d(%3, %v_param_3, 0, 134, 0.107703f, 0.0171319f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %5 = nn.bias_add(%4, %v_param_4, axis=3);
  %6 = qnn.requantize(%5, 0.00184516f, 0, 0.053206f, 0, axis=3, out_dtype="uint8");
  %7 = qnn.conv2d(%6, %v_param_5, 0, 137, 0.053206f, 0.00701139f, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %8 = nn.bias_add(%7, %v_param_6, axis=3);
  %9 = qnn.requantize(%8, 0.000373048f, 0, 0.044983f, 0, axis=3, out_dtype="uint8");
  %10 = nn.max_pool2d(%9, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC");
  %11 = qnn.conv2d(%10, %v_param_7, 0, 106, 0.044983f, 0.00639617f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %12 = nn.bias_add(%11, %v_param_8, axis=3);
  %13 = qnn.conv2d(%10, %v_param_9, 0, 174, 0.044983f, 0.0074075f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %14 = nn.bias_add(%13, %v_param_10, axis=3);
  %15 = qnn.requantize(%14, 0.000333212f, 0, 0.0381216f, 0, axis=3, out_dtype="uint8");
  %16 = qnn.conv2d(%15, %v_param_11, 0, 97, 0.0381216f, 0.00448481f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %17 = nn.bias_add(%16, %v_param_12, axis=3);
  %18 = qnn.conv2d(%10, %v_param_13, 0, 90, 0.044983f, 0.00434916f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %19 = nn.bias_add(%18, %v_param_14, axis=3);
  %20 = qnn.requantize(%19, 0.000195639f, 0, 0.0304856f, 0, axis=3, out_dtype="uint8");
  %21 = qnn.conv2d(%20, %v_param_15, 0, 77, 0.0304856f, 0.0113698f, padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %22 = nn.bias_add(%21, %v_param_16, axis=3);
  %23 = nn.max_pool2d(%10, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %24 = qnn.conv2d(%23, %v_param_17, 0, 149, 0.044983f, 0.00737061f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %25 = nn.bias_add(%24, %v_param_18, axis=3);
  %26 = qnn.requantize(%12, 0.000287719f, 0, 0.0475482f, 0, axis=3, out_dtype="uint8");
  %27 = qnn.requantize(%17, 0.000170968f, 0, 0.034202f, 0, axis=3, out_dtype="uint8");
  %28 = qnn.requantize(%22, 0.000346614f, 0, 0.0420845f, 0, axis=3, out_dtype="uint8");
  %29 = qnn.requantize(%25, 0.000331553f, 0, 0.02516f, 0, axis=3, out_dtype="uint8");
  %30 = (%26, %27, %28, %29);
  %31 = (0.0475482f, 0.034202f, 0.0420845f, 0.02516f);
  %32 = (0, 0, 0, 0);
  %33 = qnn.concatenate(%30, %31, %32, 0.0475482f, 0, axis=3);
  %34 = qnn.conv2d(%33, %v_param_19, 0, 135, 0.0475482f, 0.0064377f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %35 = nn.bias_add(%34, %v_param_20, axis=3);
  %36 = qnn.conv2d(%33, %v_param_21, 0, 133, 0.0475482f, 0.00539997f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %37 = nn.bias_add(%36, %v_param_22, axis=3);
  %38 = qnn.requantize(%37, 0.000256759f, 0, 0.0317389f, 0, axis=3, out_dtype="uint8");
  %39 = qnn.conv2d(%38, %v_param_23, 0, 94, 0.0317389f, 0.00359896f, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %40 = nn.bias_add(%39, %v_param_24, axis=3);
  %41 = qnn.conv2d(%33, %v_param_25, 0, 129, 0.0475482f, 0.00531897f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %42 = nn.bias_add(%41, %v_param_26, axis=3);
  %43 = qnn.requantize(%42, 0.000252907f, 0, 0.034475f, 0, axis=3, out_dtype="uint8");
  %44 = qnn.conv2d(%43, %v_param_27, 0, 121, 0.034475f, 0.00415084f, padding=[1, 1, 1, 1], channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %45 = nn.bias_add(%44, %v_param_28, axis=3);
  %46 = nn.max_pool2d(%33, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %47 = qnn.conv2d(%46, %v_param_29, 0, 129, 0.0475482f, 0.00529972f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %48 = nn.bias_add(%47, %v_param_30, axis=3);
  %49 = qnn.requantize(%35, 0.000306101f, 0, 0.034585f, 0, axis=3, out_dtype="uint8");
  %50 = qnn.requantize(%40, 0.000114227f, 0, 0.0316799f, 0, axis=3, out_dtype="uint8");
  %51 = qnn.requantize(%45, 0.0001431f, 0, 0.0277635f, 0, axis=3, out_dtype="uint8");
  %52 = qnn.requantize(%48, 0.000251992f, 0, 0.0281896f, 0, axis=3, out_dtype="uint8");
  %53 = (%49, %50, %51, %52);
  %54 = (0.034585f, 0.0316799f, 0.0277635f, 0.0281896f);
  %55 = (0, 0, 0, 0);
  %56 = qnn.concatenate(%53, %54, %55, 0.034585f, 0, axis=3);
  %57 = nn.max_pool2d(%56, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC");
  %58 = qnn.conv2d(%57, %v_param_31, 0, 104, 0.034585f, 0.00488506f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %59 = nn.bias_add(%58, %v_param_32, axis=3);
  %60 = qnn.conv2d(%57, %v_param_33, 0, 69, 0.034585f, 0.00521668f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %61 = nn.bias_add(%60, %v_param_34, axis=3);
  %62 = qnn.requantize(%61, 0.000180419f, 0, 0.0407384f, 0, axis=3, out_dtype="uint8");
  %63 = qnn.conv2d(%62, %v_param_35, 0, 80, 0.0407384f, 0.00412294f, padding=[1, 1, 1, 1], channels=208, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %64 = nn.bias_add(%63, %v_param_36, axis=3);
  %65 = qnn.conv2d(%57, %v_param_37, 0, 159, 0.034585f, 0.00324746f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %66 = nn.bias_add(%65, %v_param_38, axis=3);
  %67 = qnn.requantize(%66, 0.000112313f, 0, 0.029503f, 0, axis=3, out_dtype="uint8");
  %68 = qnn.conv2d(%67, %v_param_39, 0, 88, 0.029503f, 0.00959363f, padding=[1, 1, 1, 1], channels=48, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %69 = nn.bias_add(%68, %v_param_40, axis=3);
  %70 = nn.max_pool2d(%57, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %71 = qnn.conv2d(%70, %v_param_41, 0, 123, 0.034585f, 0.0063726f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %72 = nn.bias_add(%71, %v_param_42, axis=3);
  %73 = qnn.requantize(%59, 0.00016895f, 0, 0.0350619f, 0, axis=3, out_dtype="uint8");
  %74 = qnn.requantize(%64, 0.000167962f, 0, 0.038577f, 0, axis=3, out_dtype="uint8");
  %75 = qnn.requantize(%69, 0.000283041f, 0, 0.0261499f, 0, axis=3, out_dtype="uint8");
  %76 = qnn.requantize(%72, 0.000220396f, 0, 0.0227659f, 0, axis=3, out_dtype="uint8");
  %77 = (%73, %74, %75, %76);
  %78 = (0.0350619f, 0.038577f, 0.0261499f, 0.0227659f);
  %79 = (0, 0, 0, 0);
  %80 = qnn.concatenate(%77, %78, %79, 0.038577f, 0, axis=3);
  %81 = qnn.conv2d(%80, %v_param_43, 0, 131, 0.038577f, 0.00565282f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %82 = nn.bias_add(%81, %v_param_44, axis=3);
  %83 = qnn.conv2d(%80, %v_param_45, 0, 111, 0.038577f, 0.00606403f, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %84 = nn.bias_add(%83, %v_param_46, axis=3);
  %85 = qnn.requantize(%84, 0.000233932f, 0, 0.0390984f, 0, axis=3, out_dtype="uint8");
  %86 = qnn.conv2d(%85, %v_param_47, 0, 77, 0.0390984f, 0.00476621f, padding=[1, 1, 1, 1], channels=224, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %87 = nn.bias_add(%86, %v_param_48, axis=3);
  %88 = qnn.conv2d(%80, %v_param_49, 0, 127, 0.038577f, 0.00466451f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %89 = nn.bias_add(%88, %v_param_50, axis=3);
  %90 = qnn.requantize(%89, 0.000179943f, 0, 0.0326719f, 0, axis=3, out_dtype="uint8");
  %91 = qnn.conv2d(%90, %v_param_51, 0, 105, 0.0326719f, 0.00475245f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %92 = nn.bias_add(%91, %v_param_52, axis=3);
  %93 = nn.max_pool2d(%80, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %94 = qnn.conv2d(%93, %v_param_53, 0, 128, 0.038577f, 0.00292699f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %95 = nn.bias_add(%94, %v_param_54, axis=3);
  %96 = qnn.requantize(%82, 0.000218069f, 0, 0.0384053f, 0, axis=3, out_dtype="uint8");
  %97 = qnn.requantize(%87, 0.000186351f, 0, 0.0415277f, 0, axis=3, out_dtype="uint8");
  %98 = qnn.requantize(%92, 0.000155272f, 0, 0.0353133f, 0, axis=3, out_dtype="uint8");
  %99 = qnn.requantize(%95, 0.000112914f, 0, 0.0217496f, 0, axis=3, out_dtype="uint8");
  %100 = (%96, %97, %98, %99);
  %101 = (0.0384053f, 0.0415277f, 0.0353133f, 0.0217496f);
  %102 = (0, 0, 0, 0);
  %103 = qnn.concatenate(%100, %101, %102, 0.0415277f, 0, axis=3);
  %104 = qnn.conv2d(%103, %v_param_55, 0, 143, 0.0415277f, 0.00513341f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %105 = nn.bias_add(%104, %v_param_56, axis=3);
  %106 = qnn.conv2d(%103, %v_param_57, 0, 125, 0.0415277f, 0.0056437f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %107 = nn.bias_add(%106, %v_param_58, axis=3);
  %108 = qnn.requantize(%107, 0.00023437f, 0, 0.0444829f, 0, axis=3, out_dtype="uint8");
  %109 = qnn.conv2d(%108, %v_param_59, 0, 104, 0.0444829f, 0.00298305f, padding=[1, 1, 1, 1], channels=256, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %110 = nn.bias_add(%109, %v_param_60, axis=3);
  %111 = qnn.conv2d(%103, %v_param_61, 0, 96, 0.0415277f, 0.00617409f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %112 = nn.bias_add(%111, %v_param_62, axis=3);
  %113 = qnn.requantize(%112, 0.000256396f, 0, 0.0382293f, 0, axis=3, out_dtype="uint8");
  %114 = qnn.conv2d(%113, %v_param_63, 0, 90, 0.0382293f, 0.00926049f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %115 = nn.bias_add(%114, %v_param_64, axis=3);
  %116 = nn.max_pool2d(%103, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %117 = qnn.conv2d(%116, %v_param_65, 0, 133, 0.0415277f, 0.00348826f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %118 = nn.bias_add(%117, %v_param_66, axis=3);
  %119 = qnn.requantize(%105, 0.000213179f, 0, 0.0363159f, 0, axis=3, out_dtype="uint8");
  %120 = qnn.requantize(%110, 0.000132695f, 0, 0.040194f, 0, axis=3, out_dtype="uint8");
  %121 = qnn.requantize(%115, 0.000354022f, 0, 0.0679776f, 0, axis=3, out_dtype="uint8");
  %122 = qnn.requantize(%118, 0.00014486f, 0, 0.0225817f, 0, axis=3, out_dtype="uint8");
  %123 = (%119, %120, %121, %122);
  %124 = (0.0363159f, 0.040194f, 0.0679776f, 0.0225817f);
  %125 = (0, 0, 0, 0);
  %126 = qnn.concatenate(%123, %124, %125, 0.0679776f, 0, axis=3);
  %127 = qnn.conv2d(%126, %v_param_67, 0, 131, 0.0679776f, 0.00541721f, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %128 = nn.bias_add(%127, %v_param_68, axis=3);
  %129 = qnn.conv2d(%126, %v_param_69, 0, 102, 0.0679776f, 0.00529131f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %130 = nn.bias_add(%129, %v_param_70, axis=3);
  %131 = qnn.requantize(%130, 0.000359691f, 0, 0.0464631f, 0, axis=3, out_dtype="uint8");
  %132 = qnn.conv2d(%131, %v_param_71, 0, 121, 0.0464631f, 0.00281512f, padding=[1, 1, 1, 1], channels=288, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %133 = nn.bias_add(%132, %v_param_72, axis=3);
  %134 = qnn.conv2d(%126, %v_param_73, 0, 129, 0.0679776f, 0.00454161f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %135 = nn.bias_add(%134, %v_param_74, axis=3);
  %136 = qnn.requantize(%135, 0.000308728f, 0, 0.0439514f, 0, axis=3, out_dtype="uint8");
  %137 = qnn.conv2d(%136, %v_param_75, 0, 92, 0.0439514f, 0.00496321f, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %138 = nn.bias_add(%137, %v_param_76, axis=3);
  %139 = nn.max_pool2d(%126, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %140 = qnn.conv2d(%139, %v_param_77, 0, 124, 0.0679776f, 0.00317437f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %141 = nn.bias_add(%140, %v_param_78, axis=3);
  %142 = qnn.requantize(%128, 0.000368249f, 0, 0.0520244f, 0, axis=3, out_dtype="uint8");
  %143 = qnn.requantize(%133, 0.000130799f, 0, 0.0511231f, 0, axis=3, out_dtype="uint8");
  %144 = qnn.requantize(%138, 0.00021814f, 0, 0.0310861f, 0, axis=3, out_dtype="uint8");
  %145 = qnn.requantize(%141, 0.000215786f, 0, 0.024479f, 0, axis=3, out_dtype="uint8");
  %146 = (%142, %143, %144, %145);
  %147 = (0.0520244f, 0.0511231f, 0.0310861f, 0.024479f);
  %148 = (0, 0, 0, 0);
  %149 = qnn.concatenate(%146, %147, %148, 0.0520244f, 0, axis=3);
  %150 = qnn.conv2d(%149, %v_param_79, 0, 118, 0.0520244f, 0.00557758f, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %151 = nn.bias_add(%150, %v_param_80, axis=3);
  %152 = qnn.conv2d(%149, %v_param_81, 0, 105, 0.0520244f, 0.00543337f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %153 = nn.bias_add(%152, %v_param_82, axis=3);
  %154 = qnn.requantize(%153, 0.000282668f, 0, 0.0368424f, 0, axis=3, out_dtype="uint8");
  %155 = qnn.conv2d(%154, %v_param_83, 0, 85, 0.0368424f, 0.00295774f, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %156 = nn.bias_add(%155, %v_param_84, axis=3);
  %157 = qnn.conv2d(%149, %v_param_85, 0, 126, 0.0520244f, 0.00506661f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %158 = nn.bias_add(%157, %v_param_86, axis=3);
  %159 = qnn.requantize(%158, 0.000263587f, 0, 0.0576595f, 0, axis=3, out_dtype="uint8");
  %160 = qnn.conv2d(%159, %v_param_87, 0, 81, 0.0576595f, 0.00359061f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %161 = nn.bias_add(%160, %v_param_88, axis=3);
  %162 = nn.max_pool2d(%149, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %163 = qnn.conv2d(%162, %v_param_89, 0, 94, 0.0520244f, 0.00317797f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %164 = nn.bias_add(%163, %v_param_90, axis=3);
  %165 = qnn.requantize(%151, 0.00029017f, 0, 0.0461338f, 0, axis=3, out_dtype="uint8");
  %166 = qnn.requantize(%156, 0.00010897f, 0, 0.0384801f, 0, axis=3, out_dtype="uint8");
  %167 = qnn.requantize(%161, 0.000207033f, 0, 0.0713473f, 0, axis=3, out_dtype="uint8");
  %168 = qnn.requantize(%164, 0.000165332f, 0, 0.0265916f, 0, axis=3, out_dtype="uint8");
  %169 = (%165, %166, %167, %168);
  %170 = (0.0461338f, 0.0384801f, 0.0713473f, 0.0265916f);
  %171 = (0, 0, 0, 0);
  %172 = qnn.concatenate(%169, %170, %171, 0.0713473f, 0, axis=3);
  %173 = nn.max_pool2d(%172, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0], layout="NHWC");
  %174 = qnn.conv2d(%173, %v_param_91, 0, 182, 0.0713473f, 0.0104061f, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %175 = nn.bias_add(%174, %v_param_92, axis=3);
  %176 = qnn.conv2d(%173, %v_param_93, 0, 115, 0.0713473f, 0.00596868f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %177 = nn.bias_add(%176, %v_param_94, axis=3);
  %178 = qnn.requantize(%177, 0.000425849f, 0, 0.0490709f, 0, axis=3, out_dtype="uint8");
  %179 = qnn.conv2d(%178, %v_param_95, 0, 129, 0.0490709f, 0.00293286f, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %180 = nn.bias_add(%179, %v_param_96, axis=3);
  %181 = qnn.conv2d(%173, %v_param_97, 0, 122, 0.0713473f, 0.00383815f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %182 = nn.bias_add(%181, %v_param_98, axis=3);
  %183 = qnn.requantize(%182, 0.000273842f, 0, 0.0411565f, 0, axis=3, out_dtype="uint8");
  %184 = qnn.conv2d(%183, %v_param_99, 0, 102, 0.0411565f, 0.002763f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %185 = nn.bias_add(%184, %v_param_100, axis=3);
  %186 = nn.max_pool2d(%173, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %187 = qnn.conv2d(%186, %v_param_101, 0, 123, 0.0713473f, 0.00247852f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %188 = nn.bias_add(%187, %v_param_102, axis=3);
  %189 = qnn.requantize(%175, 0.000742446f, 0, 0.036752f, 0, axis=3, out_dtype="uint8");
  %190 = qnn.requantize(%180, 0.000143918f, 0, 0.0450282f, 0, axis=3, out_dtype="uint8");
  %191 = qnn.requantize(%185, 0.000113715f, 0, 0.0371453f, 0, axis=3, out_dtype="uint8");
  %192 = qnn.requantize(%188, 0.000176836f, 0, 0.0213327f, 0, axis=3, out_dtype="uint8");
  %193 = (%189, %190, %191, %192);
  %194 = (0.036752f, 0.0450282f, 0.0371453f, 0.0213327f);
  %195 = (0, 0, 0, 0);
  %196 = qnn.concatenate(%193, %194, %195, 0.0450282f, 0, axis=3);
  %197 = qnn.conv2d(%196, %v_param_103, 0, 104, 0.0450282f, 0.0143784f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %198 = nn.bias_add(%197, %v_param_104, axis=3);
  %199 = qnn.conv2d(%196, %v_param_105, 0, 81, 0.0450282f, 0.00580293f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %200 = nn.bias_add(%199, %v_param_106, axis=3);
  %201 = qnn.requantize(%200, 0.000261295f, 0, 0.0443907f, 0, axis=3, out_dtype="uint8");
  %202 = qnn.conv2d(%201, %v_param_107, 0, 83, 0.0443907f, 0.00505402f, padding=[1, 1, 1, 1], channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %203 = nn.bias_add(%202, %v_param_108, axis=3);
  %204 = qnn.conv2d(%196, %v_param_109, 0, 87, 0.0450282f, 0.00578726f, padding=[0, 0, 0, 0], channels=48, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %205 = nn.bias_add(%204, %v_param_110, axis=3);
  %206 = qnn.requantize(%205, 0.00026059f, 0, 0.0431175f, 0, axis=3, out_dtype="uint8");
  %207 = qnn.conv2d(%206, %v_param_111, 0, 74, 0.0431175f, 0.00680263f, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %208 = nn.bias_add(%207, %v_param_112, axis=3);
  %209 = nn.max_pool2d(%196, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC");
  %210 = qnn.conv2d(%209, %v_param_113, 0, 62, 0.0450282f, 0.0055094f, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %211 = nn.bias_add(%210, %v_param_114, axis=3);
  %212 = qnn.requantize(%198, 0.000647432f, 0, 0.0470831f, 0, axis=3, out_dtype="uint8");
  %213 = qnn.requantize(%203, 0.000224351f, 0, 0.0483342f, 0, axis=3, out_dtype="uint8");
  %214 = qnn.requantize(%208, 0.000293312f, 0, 0.0535589f, 0, axis=3, out_dtype="uint8");
  %215 = qnn.requantize(%211, 0.000248078f, 0, 0.0320987f, 0, axis=3, out_dtype="uint8");
  %216 = (%212, %213, %214, %215);
  %217 = (0.0470831f, 0.0483342f, 0.0535589f, 0.0320987f);
  %218 = (0, 0, 0, 0);
  %219 = qnn.concatenate(%216, %217, %218, 0.0535589f, 0, axis=3);
  %220 = cast(%219, dtype="int32");
  %221 = nn.avg_pool2d(%220, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC");
  %222 = cast(%221, dtype="uint8");
  %223 = qnn.conv2d(%222, %v_param_115, 0, 106, 0.0535589f, 0.00235748f, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %224 = nn.bias_add(%223, %v_param_116, axis=3);
  %225 = qnn.requantize(%224, 0.000126264f, 0, 0.0962827f, 60, axis=3, out_dtype="uint8");
  %226 = reshape(%225, newshape=[-1, 1001]);
  %227 = qnn.dequantize(%226, 0.0962827f, 60);
  %228 = nn.softmax(%227, axis=1);
  qnn.quantize(%228, 0.00390625f, 0, out_dtype="uint8")
}

vsi_npu.py --> qnn.dequantize
vsi_npu.py --> nn.softmax
vsi_npu.py --> qnn.quantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.avg_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.max_pool2d
vsi_npu.py --> qnn.concatenate
vsi_npu.py --> reshape
def @main(%input: Tensor[(1, 224, 224, 3), uint8]) -> Tensor[(1, 1001), uint8] {
  @tvmgen_default_vsi_npu_0(%input) /* ty=Tensor[(1, 1001), uint8] */
}

def @tvmgen_default_vsi_npu_0(%vsi_npu_0_i0: Tensor[(1, 224, 224, 3), uint8], Inline=1, Compiler="vsi_npu", global_symbol="tvmgen_default_vsi_npu_0", Primitive=1) -> Tensor[(1, 1001), uint8] {
  %30 = fn (%FunctionVar_57_0: Tensor[(1, 224, 224, 3), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 64), uint8] {
    %28 = qnn.conv2d(%FunctionVar_57_0, meta[relay.Constant][24] /* ty=Tensor[(7, 7, 3, 64), uint8] */, 128 /* ty=int32 */, 141 /* ty=int32 */, 0.0078125f /* ty=float32 */, 0.0243229f /* ty=float32 */, strides=[2, 2], padding=[2, 2, 3, 3], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 64), int32] */;
    %29 = nn.bias_add(%28, meta[relay.Constant][25] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 64), int32] */;
    qnn.requantize(%29, 0.000190023f /* ty=float32 */, 0 /* ty=int32 */, 0.107703f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 64), uint8] */
  };
  %31 = %30(%vsi_npu_0_i0) /* ty=Tensor[(1, 112, 112, 64), uint8] */;
  %32 = nn.max_pool2d(%31, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 56, 56, 64), uint8] */;
  %33 = fn (%FunctionVar_56_0: Tensor[(1, 56, 56, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 64), uint8] {
    %26 = qnn.conv2d(%FunctionVar_56_0, meta[relay.Constant][22] /* ty=Tensor[(1, 1, 64, 64), uint8] */, 0 /* ty=int32 */, 134 /* ty=int32 */, 0.107703f /* ty=float32 */, 0.0171319f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 64), int32] */;
    %27 = nn.bias_add(%26, meta[relay.Constant][23] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 64), int32] */;
    qnn.requantize(%27, 0.00184516f /* ty=float32 */, 0 /* ty=int32 */, 0.053206f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 64), uint8] */
  };
  %34 = %33(%32) /* ty=Tensor[(1, 56, 56, 64), uint8] */;
  %35 = fn (%FunctionVar_55_0: Tensor[(1, 56, 56, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 192), uint8] {
    %24 = qnn.conv2d(%FunctionVar_55_0, meta[relay.Constant][20] /* ty=Tensor[(3, 3, 64, 192), uint8] */, 0 /* ty=int32 */, 137 /* ty=int32 */, 0.053206f /* ty=float32 */, 0.00701139f /* ty=float32 */, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 192), int32] */;
    %25 = nn.bias_add(%24, meta[relay.Constant][21] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 192), int32] */;
    qnn.requantize(%25, 0.000373048f /* ty=float32 */, 0 /* ty=int32 */, 0.044983f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 192), uint8] */
  };
  %36 = %35(%34) /* ty=Tensor[(1, 56, 56, 192), uint8] */;
  %37 = nn.max_pool2d(%36, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %38 = fn (%FunctionVar_54_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 64), uint8] {
    %22 = qnn.conv2d(%FunctionVar_54_0, meta[relay.Constant][18] /* ty=Tensor[(1, 1, 192, 64), uint8] */, 0 /* ty=int32 */, 106 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00639617f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 64), int32] */;
    %23 = nn.bias_add(%22, meta[relay.Constant][19] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 64), int32] */;
    qnn.requantize(%23, 0.000287719f /* ty=float32 */, 0 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 64), uint8] */
  };
  %43 = fn (%FunctionVar_53_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 96), uint8] {
    %41 = qnn.conv2d(%FunctionVar_53_0, meta[relay.Constant][28] /* ty=Tensor[(1, 1, 192, 96), uint8] */, 0 /* ty=int32 */, 174 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.0074075f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 96), int32] */;
    %42 = nn.bias_add(%41, meta[relay.Constant][29] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 96), int32] */;
    qnn.requantize(%42, 0.000333212f /* ty=float32 */, 0 /* ty=int32 */, 0.0381216f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 96), uint8] */
  };
  %44 = %43(%37) /* ty=Tensor[(1, 28, 28, 96), uint8] */;
  %45 = fn (%FunctionVar_52_0: Tensor[(1, 28, 28, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] {
    %39 = qnn.conv2d(%FunctionVar_52_0, meta[relay.Constant][26] /* ty=Tensor[(3, 3, 96, 128), uint8] */, 0 /* ty=int32 */, 97 /* ty=int32 */, 0.0381216f /* ty=float32 */, 0.00448481f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */;
    %40 = nn.bias_add(%39, meta[relay.Constant][27] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */;
    qnn.requantize(%40, 0.000170968f /* ty=float32 */, 0 /* ty=int32 */, 0.034202f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 128), uint8] */
  };
  %50 = fn (%FunctionVar_51_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 16), uint8] {
    %48 = qnn.conv2d(%FunctionVar_51_0, meta[relay.Constant][32] /* ty=Tensor[(1, 1, 192, 16), uint8] */, 0 /* ty=int32 */, 90 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00434916f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 16), int32] */;
    %49 = nn.bias_add(%48, meta[relay.Constant][33] /* ty=Tensor[(16), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 16), int32] */;
    qnn.requantize(%49, 0.000195639f /* ty=float32 */, 0 /* ty=int32 */, 0.0304856f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 16), uint8] */
  };
  %51 = %50(%37) /* ty=Tensor[(1, 28, 28, 16), uint8] */;
  %52 = fn (%FunctionVar_50_0: Tensor[(1, 28, 28, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %46 = qnn.conv2d(%FunctionVar_50_0, meta[relay.Constant][30] /* ty=Tensor[(3, 3, 16, 32), uint8] */, 0 /* ty=int32 */, 77 /* ty=int32 */, 0.0304856f /* ty=float32 */, 0.0113698f /* ty=float32 */, padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %47 = nn.bias_add(%46, meta[relay.Constant][31] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%47, 0.000346614f /* ty=float32 */, 0 /* ty=int32 */, 0.0420845f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %55 = nn.max_pool2d(%37, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %56 = fn (%FunctionVar_49_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %53 = qnn.conv2d(%FunctionVar_49_0, meta[relay.Constant][34] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 149 /* ty=int32 */, 0.044983f /* ty=float32 */, 0.00737061f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %54 = nn.bias_add(%53, meta[relay.Constant][35] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%54, 0.000331553f /* ty=float32 */, 0 /* ty=int32 */, 0.02516f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %57 = %38(%37) /* ty=Tensor[(1, 28, 28, 64), uint8] */;
  %58 = %45(%44) /* ty=Tensor[(1, 28, 28, 128), uint8] */;
  %59 = %52(%51) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %60 = %56(%55) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %61 = (%57, %58, %59, %60);
  %62 = (0.0475482f /* ty=float32 */, 0.034202f /* ty=float32 */, 0.0420845f /* ty=float32 */, 0.02516f /* ty=float32 */);
  %63 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %64 = qnn.concatenate(%61, %62, %63, 0.0475482f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 28, 28, 256), uint8] */;
  %65 = fn (%FunctionVar_48_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] {
    %20 = qnn.conv2d(%FunctionVar_48_0, meta[relay.Constant][16] /* ty=Tensor[(1, 1, 256, 128), uint8] */, 0 /* ty=int32 */, 135 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.0064377f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */;
    %21 = nn.bias_add(%20, meta[relay.Constant][17] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */;
    qnn.requantize(%21, 0.000306101f /* ty=float32 */, 0 /* ty=int32 */, 0.034585f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 128), uint8] */
  };
  %70 = fn (%FunctionVar_47_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 128), uint8] {
    %68 = qnn.conv2d(%FunctionVar_47_0, meta[relay.Constant][38] /* ty=Tensor[(1, 1, 256, 128), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00539997f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 128), int32] */;
    %69 = nn.bias_add(%68, meta[relay.Constant][39] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 128), int32] */;
    qnn.requantize(%69, 0.000256759f /* ty=float32 */, 0 /* ty=int32 */, 0.0317389f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 128), uint8] */
  };
  %71 = %70(%64) /* ty=Tensor[(1, 28, 28, 128), uint8] */;
  %72 = fn (%FunctionVar_46_0: Tensor[(1, 28, 28, 128), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %66 = qnn.conv2d(%FunctionVar_46_0, meta[relay.Constant][36] /* ty=Tensor[(3, 3, 128, 192), uint8] */, 0 /* ty=int32 */, 94 /* ty=int32 */, 0.0317389f /* ty=float32 */, 0.00359896f /* ty=float32 */, padding=[1, 1, 1, 1], channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %67 = nn.bias_add(%66, meta[relay.Constant][37] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%67, 0.000114227f /* ty=float32 */, 0 /* ty=int32 */, 0.0316799f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %77 = fn (%FunctionVar_45_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %75 = qnn.conv2d(%FunctionVar_45_0, meta[relay.Constant][42] /* ty=Tensor[(1, 1, 256, 32), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00531897f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %76 = nn.bias_add(%75, meta[relay.Constant][43] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%76, 0.000252907f /* ty=float32 */, 0 /* ty=int32 */, 0.034475f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %78 = %77(%64) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %79 = fn (%FunctionVar_44_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 96), uint8] {
    %73 = qnn.conv2d(%FunctionVar_44_0, meta[relay.Constant][40] /* ty=Tensor[(3, 3, 32, 96), uint8] */, 0 /* ty=int32 */, 121 /* ty=int32 */, 0.034475f /* ty=float32 */, 0.00415084f /* ty=float32 */, padding=[1, 1, 1, 1], channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 96), int32] */;
    %74 = nn.bias_add(%73, meta[relay.Constant][41] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 96), int32] */;
    qnn.requantize(%74, 0.0001431f /* ty=float32 */, 0 /* ty=int32 */, 0.0277635f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 96), uint8] */
  };
  %82 = nn.max_pool2d(%64, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 28, 28, 256), uint8] */;
  %83 = fn (%FunctionVar_43_0: Tensor[(1, 28, 28, 256), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 64), uint8] {
    %80 = qnn.conv2d(%FunctionVar_43_0, meta[relay.Constant][44] /* ty=Tensor[(1, 1, 256, 64), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0475482f /* ty=float32 */, 0.00529972f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 64), int32] */;
    %81 = nn.bias_add(%80, meta[relay.Constant][45] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 64), int32] */;
    qnn.requantize(%81, 0.000251992f /* ty=float32 */, 0 /* ty=int32 */, 0.0281896f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 64), uint8] */
  };
  %84 = %65(%64) /* ty=Tensor[(1, 28, 28, 128), uint8] */;
  %85 = %72(%71) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %86 = %79(%78) /* ty=Tensor[(1, 28, 28, 96), uint8] */;
  %87 = %83(%82) /* ty=Tensor[(1, 28, 28, 64), uint8] */;
  %88 = (%84, %85, %86, %87);
  %89 = (0.034585f /* ty=float32 */, 0.0316799f /* ty=float32 */, 0.0277635f /* ty=float32 */, 0.0281896f /* ty=float32 */);
  %90 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %91 = qnn.concatenate(%88, %89, %90, 0.034585f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 28, 28, 480), uint8] */;
  %92 = nn.max_pool2d(%91, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 480), uint8] */;
  %93 = fn (%FunctionVar_42_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 192), uint8] {
    %18 = qnn.conv2d(%FunctionVar_42_0, meta[relay.Constant][14] /* ty=Tensor[(1, 1, 480, 192), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00488506f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 192), int32] */;
    %19 = nn.bias_add(%18, meta[relay.Constant][15] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 192), int32] */;
    qnn.requantize(%19, 0.00016895f /* ty=float32 */, 0 /* ty=int32 */, 0.0350619f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 192), uint8] */
  };
  %98 = fn (%FunctionVar_41_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] {
    %96 = qnn.conv2d(%FunctionVar_41_0, meta[relay.Constant][48] /* ty=Tensor[(1, 1, 480, 96), uint8] */, 0 /* ty=int32 */, 69 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00521668f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */;
    %97 = nn.bias_add(%96, meta[relay.Constant][49] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */;
    qnn.requantize(%97, 0.000180419f /* ty=float32 */, 0 /* ty=int32 */, 0.0407384f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */
  };
  %99 = %98(%92) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %100 = fn (%FunctionVar_40_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 208), uint8] {
    %94 = qnn.conv2d(%FunctionVar_40_0, meta[relay.Constant][46] /* ty=Tensor[(3, 3, 96, 208), uint8] */, 0 /* ty=int32 */, 80 /* ty=int32 */, 0.0407384f /* ty=float32 */, 0.00412294f /* ty=float32 */, padding=[1, 1, 1, 1], channels=208, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 208), int32] */;
    %95 = nn.bias_add(%94, meta[relay.Constant][47] /* ty=Tensor[(208), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 208), int32] */;
    qnn.requantize(%95, 0.000167962f /* ty=float32 */, 0 /* ty=int32 */, 0.038577f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 208), uint8] */
  };
  %105 = fn (%FunctionVar_39_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 16), uint8] {
    %103 = qnn.conv2d(%FunctionVar_39_0, meta[relay.Constant][52] /* ty=Tensor[(1, 1, 480, 16), uint8] */, 0 /* ty=int32 */, 159 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.00324746f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 16), int32] */;
    %104 = nn.bias_add(%103, meta[relay.Constant][53] /* ty=Tensor[(16), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 16), int32] */;
    qnn.requantize(%104, 0.000112313f /* ty=float32 */, 0 /* ty=int32 */, 0.029503f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 16), uint8] */
  };
  %106 = %105(%92) /* ty=Tensor[(1, 14, 14, 16), uint8] */;
  %107 = fn (%FunctionVar_38_0: Tensor[(1, 14, 14, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 48), uint8] {
    %101 = qnn.conv2d(%FunctionVar_38_0, meta[relay.Constant][50] /* ty=Tensor[(3, 3, 16, 48), uint8] */, 0 /* ty=int32 */, 88 /* ty=int32 */, 0.029503f /* ty=float32 */, 0.00959363f /* ty=float32 */, padding=[1, 1, 1, 1], channels=48, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 48), int32] */;
    %102 = nn.bias_add(%101, meta[relay.Constant][51] /* ty=Tensor[(48), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 48), int32] */;
    qnn.requantize(%102, 0.000283041f /* ty=float32 */, 0 /* ty=int32 */, 0.0261499f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 48), uint8] */
  };
  %110 = nn.max_pool2d(%92, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 480), uint8] */;
  %111 = fn (%FunctionVar_37_0: Tensor[(1, 14, 14, 480), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %108 = qnn.conv2d(%FunctionVar_37_0, meta[relay.Constant][54] /* ty=Tensor[(1, 1, 480, 64), uint8] */, 0 /* ty=int32 */, 123 /* ty=int32 */, 0.034585f /* ty=float32 */, 0.0063726f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %109 = nn.bias_add(%108, meta[relay.Constant][55] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%109, 0.000220396f /* ty=float32 */, 0 /* ty=int32 */, 0.0227659f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %112 = %93(%92) /* ty=Tensor[(1, 14, 14, 192), uint8] */;
  %113 = %100(%99) /* ty=Tensor[(1, 14, 14, 208), uint8] */;
  %114 = %107(%106) /* ty=Tensor[(1, 14, 14, 48), uint8] */;
  %115 = %111(%110) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %116 = (%112, %113, %114, %115);
  %117 = (0.0350619f /* ty=float32 */, 0.038577f /* ty=float32 */, 0.0261499f /* ty=float32 */, 0.0227659f /* ty=float32 */);
  %118 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %119 = qnn.concatenate(%116, %117, %118, 0.038577f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %120 = fn (%FunctionVar_36_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 160), uint8] {
    %16 = qnn.conv2d(%FunctionVar_36_0, meta[relay.Constant][12] /* ty=Tensor[(1, 1, 512, 160), uint8] */, 0 /* ty=int32 */, 131 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00565282f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 160), int32] */;
    %17 = nn.bias_add(%16, meta[relay.Constant][13] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 160), int32] */;
    qnn.requantize(%17, 0.000218069f /* ty=float32 */, 0 /* ty=int32 */, 0.0384053f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 160), uint8] */
  };
  %125 = fn (%FunctionVar_35_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 112), uint8] {
    %123 = qnn.conv2d(%FunctionVar_35_0, meta[relay.Constant][58] /* ty=Tensor[(1, 1, 512, 112), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00606403f /* ty=float32 */, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 112), int32] */;
    %124 = nn.bias_add(%123, meta[relay.Constant][59] /* ty=Tensor[(112), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 112), int32] */;
    qnn.requantize(%124, 0.000233932f /* ty=float32 */, 0 /* ty=int32 */, 0.0390984f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 112), uint8] */
  };
  %126 = %125(%119) /* ty=Tensor[(1, 14, 14, 112), uint8] */;
  %127 = fn (%FunctionVar_34_0: Tensor[(1, 14, 14, 112), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 224), uint8] {
    %121 = qnn.conv2d(%FunctionVar_34_0, meta[relay.Constant][56] /* ty=Tensor[(3, 3, 112, 224), uint8] */, 0 /* ty=int32 */, 77 /* ty=int32 */, 0.0390984f /* ty=float32 */, 0.00476621f /* ty=float32 */, padding=[1, 1, 1, 1], channels=224, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 224), int32] */;
    %122 = nn.bias_add(%121, meta[relay.Constant][57] /* ty=Tensor[(224), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 224), int32] */;
    qnn.requantize(%122, 0.000186351f /* ty=float32 */, 0 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 224), uint8] */
  };
  %132 = fn (%FunctionVar_33_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 24), uint8] {
    %130 = qnn.conv2d(%FunctionVar_33_0, meta[relay.Constant][62] /* ty=Tensor[(1, 1, 512, 24), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00466451f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 24), int32] */;
    %131 = nn.bias_add(%130, meta[relay.Constant][63] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 24), int32] */;
    qnn.requantize(%131, 0.000179943f /* ty=float32 */, 0 /* ty=int32 */, 0.0326719f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 24), uint8] */
  };
  %133 = %132(%119) /* ty=Tensor[(1, 14, 14, 24), uint8] */;
  %134 = fn (%FunctionVar_32_0: Tensor[(1, 14, 14, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %128 = qnn.conv2d(%FunctionVar_32_0, meta[relay.Constant][60] /* ty=Tensor[(3, 3, 24, 64), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0326719f /* ty=float32 */, 0.00475245f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %129 = nn.bias_add(%128, meta[relay.Constant][61] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%129, 0.000155272f /* ty=float32 */, 0 /* ty=int32 */, 0.0353133f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %137 = nn.max_pool2d(%119, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %138 = fn (%FunctionVar_31_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %135 = qnn.conv2d(%FunctionVar_31_0, meta[relay.Constant][64] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 128 /* ty=int32 */, 0.038577f /* ty=float32 */, 0.00292699f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %136 = nn.bias_add(%135, meta[relay.Constant][65] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%136, 0.000112914f /* ty=float32 */, 0 /* ty=int32 */, 0.0217496f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %139 = %120(%119) /* ty=Tensor[(1, 14, 14, 160), uint8] */;
  %140 = %127(%126) /* ty=Tensor[(1, 14, 14, 224), uint8] */;
  %141 = %134(%133) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %142 = %138(%137) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %143 = (%139, %140, %141, %142);
  %144 = (0.0384053f /* ty=float32 */, 0.0415277f /* ty=float32 */, 0.0353133f /* ty=float32 */, 0.0217496f /* ty=float32 */);
  %145 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %146 = qnn.concatenate(%143, %144, %145, 0.0415277f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %147 = fn (%FunctionVar_30_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] {
    %14 = qnn.conv2d(%FunctionVar_30_0, meta[relay.Constant][10] /* ty=Tensor[(1, 1, 512, 128), uint8] */, 0 /* ty=int32 */, 143 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00513341f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */;
    %15 = nn.bias_add(%14, meta[relay.Constant][11] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */;
    qnn.requantize(%15, 0.000213179f /* ty=float32 */, 0 /* ty=int32 */, 0.0363159f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */
  };
  %152 = fn (%FunctionVar_29_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] {
    %150 = qnn.conv2d(%FunctionVar_29_0, meta[relay.Constant][68] /* ty=Tensor[(1, 1, 512, 128), uint8] */, 0 /* ty=int32 */, 125 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.0056437f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */;
    %151 = nn.bias_add(%150, meta[relay.Constant][69] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */;
    qnn.requantize(%151, 0.00023437f /* ty=float32 */, 0 /* ty=int32 */, 0.0444829f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */
  };
  %153 = %152(%146) /* ty=Tensor[(1, 14, 14, 128), uint8] */;
  %154 = fn (%FunctionVar_28_0: Tensor[(1, 14, 14, 128), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 256), uint8] {
    %148 = qnn.conv2d(%FunctionVar_28_0, meta[relay.Constant][66] /* ty=Tensor[(3, 3, 128, 256), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.0444829f /* ty=float32 */, 0.00298305f /* ty=float32 */, padding=[1, 1, 1, 1], channels=256, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 256), int32] */;
    %149 = nn.bias_add(%148, meta[relay.Constant][67] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 256), int32] */;
    qnn.requantize(%149, 0.000132695f /* ty=float32 */, 0 /* ty=int32 */, 0.040194f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 256), uint8] */
  };
  %159 = fn (%FunctionVar_27_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 24), uint8] {
    %157 = qnn.conv2d(%FunctionVar_27_0, meta[relay.Constant][72] /* ty=Tensor[(1, 1, 512, 24), uint8] */, 0 /* ty=int32 */, 96 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00617409f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 24), int32] */;
    %158 = nn.bias_add(%157, meta[relay.Constant][73] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 24), int32] */;
    qnn.requantize(%158, 0.000256396f /* ty=float32 */, 0 /* ty=int32 */, 0.0382293f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 24), uint8] */
  };
  %160 = %159(%146) /* ty=Tensor[(1, 14, 14, 24), uint8] */;
  %161 = fn (%FunctionVar_26_0: Tensor[(1, 14, 14, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %155 = qnn.conv2d(%FunctionVar_26_0, meta[relay.Constant][70] /* ty=Tensor[(3, 3, 24, 64), uint8] */, 0 /* ty=int32 */, 90 /* ty=int32 */, 0.0382293f /* ty=float32 */, 0.00926049f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %156 = nn.bias_add(%155, meta[relay.Constant][71] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%156, 0.000354022f /* ty=float32 */, 0 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %164 = nn.max_pool2d(%146, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %165 = fn (%FunctionVar_25_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %162 = qnn.conv2d(%FunctionVar_25_0, meta[relay.Constant][74] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0415277f /* ty=float32 */, 0.00348826f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %163 = nn.bias_add(%162, meta[relay.Constant][75] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%163, 0.00014486f /* ty=float32 */, 0 /* ty=int32 */, 0.0225817f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %166 = %147(%146) /* ty=Tensor[(1, 14, 14, 128), uint8] */;
  %167 = %154(%153) /* ty=Tensor[(1, 14, 14, 256), uint8] */;
  %168 = %161(%160) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %169 = %165(%164) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %170 = (%166, %167, %168, %169);
  %171 = (0.0363159f /* ty=float32 */, 0.040194f /* ty=float32 */, 0.0679776f /* ty=float32 */, 0.0225817f /* ty=float32 */);
  %172 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %173 = qnn.concatenate(%170, %171, %172, 0.0679776f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %174 = fn (%FunctionVar_24_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 112), uint8] {
    %12 = qnn.conv2d(%FunctionVar_24_0, meta[relay.Constant][8] /* ty=Tensor[(1, 1, 512, 112), uint8] */, 0 /* ty=int32 */, 131 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00541721f /* ty=float32 */, padding=[0, 0, 0, 0], channels=112, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 112), int32] */;
    %13 = nn.bias_add(%12, meta[relay.Constant][9] /* ty=Tensor[(112), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 112), int32] */;
    qnn.requantize(%13, 0.000368249f /* ty=float32 */, 0 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 112), uint8] */
  };
  %179 = fn (%FunctionVar_23_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 144), uint8] {
    %177 = qnn.conv2d(%FunctionVar_23_0, meta[relay.Constant][78] /* ty=Tensor[(1, 1, 512, 144), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00529131f /* ty=float32 */, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 144), int32] */;
    %178 = nn.bias_add(%177, meta[relay.Constant][79] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 144), int32] */;
    qnn.requantize(%178, 0.000359691f /* ty=float32 */, 0 /* ty=int32 */, 0.0464631f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 144), uint8] */
  };
  %180 = %179(%173) /* ty=Tensor[(1, 14, 14, 144), uint8] */;
  %181 = fn (%FunctionVar_22_0: Tensor[(1, 14, 14, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 288), uint8] {
    %175 = qnn.conv2d(%FunctionVar_22_0, meta[relay.Constant][76] /* ty=Tensor[(3, 3, 144, 288), uint8] */, 0 /* ty=int32 */, 121 /* ty=int32 */, 0.0464631f /* ty=float32 */, 0.00281512f /* ty=float32 */, padding=[1, 1, 1, 1], channels=288, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 288), int32] */;
    %176 = nn.bias_add(%175, meta[relay.Constant][77] /* ty=Tensor[(288), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 288), int32] */;
    qnn.requantize(%176, 0.000130799f /* ty=float32 */, 0 /* ty=int32 */, 0.0511231f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 288), uint8] */
  };
  %186 = fn (%FunctionVar_21_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 32), uint8] {
    %184 = qnn.conv2d(%FunctionVar_21_0, meta[relay.Constant][82] /* ty=Tensor[(1, 1, 512, 32), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00454161f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 32), int32] */;
    %185 = nn.bias_add(%184, meta[relay.Constant][83] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 32), int32] */;
    qnn.requantize(%185, 0.000308728f /* ty=float32 */, 0 /* ty=int32 */, 0.0439514f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 32), uint8] */
  };
  %187 = %186(%173) /* ty=Tensor[(1, 14, 14, 32), uint8] */;
  %188 = fn (%FunctionVar_20_0: Tensor[(1, 14, 14, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %182 = qnn.conv2d(%FunctionVar_20_0, meta[relay.Constant][80] /* ty=Tensor[(3, 3, 32, 64), uint8] */, 0 /* ty=int32 */, 92 /* ty=int32 */, 0.0439514f /* ty=float32 */, 0.00496321f /* ty=float32 */, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %183 = nn.bias_add(%182, meta[relay.Constant][81] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%183, 0.00021814f /* ty=float32 */, 0 /* ty=int32 */, 0.0310861f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %191 = nn.max_pool2d(%173, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 512), uint8] */;
  %192 = fn (%FunctionVar_19_0: Tensor[(1, 14, 14, 512), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %189 = qnn.conv2d(%FunctionVar_19_0, meta[relay.Constant][84] /* ty=Tensor[(1, 1, 512, 64), uint8] */, 0 /* ty=int32 */, 124 /* ty=int32 */, 0.0679776f /* ty=float32 */, 0.00317437f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %190 = nn.bias_add(%189, meta[relay.Constant][85] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%190, 0.000215786f /* ty=float32 */, 0 /* ty=int32 */, 0.024479f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %193 = %174(%173) /* ty=Tensor[(1, 14, 14, 112), uint8] */;
  %194 = %181(%180) /* ty=Tensor[(1, 14, 14, 288), uint8] */;
  %195 = %188(%187) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %196 = %192(%191) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %197 = (%193, %194, %195, %196);
  %198 = (0.0520244f /* ty=float32 */, 0.0511231f /* ty=float32 */, 0.0310861f /* ty=float32 */, 0.024479f /* ty=float32 */);
  %199 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %200 = qnn.concatenate(%197, %198, %199, 0.0520244f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 528), uint8] */;
  %201 = fn (%FunctionVar_18_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 256), uint8] {
    %10 = qnn.conv2d(%FunctionVar_18_0, meta[relay.Constant][6] /* ty=Tensor[(1, 1, 528, 256), uint8] */, 0 /* ty=int32 */, 118 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00557758f /* ty=float32 */, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 256), int32] */;
    %11 = nn.bias_add(%10, meta[relay.Constant][7] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 256), int32] */;
    qnn.requantize(%11, 0.00029017f /* ty=float32 */, 0 /* ty=int32 */, 0.0461338f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 256), uint8] */
  };
  %206 = fn (%FunctionVar_17_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 160), uint8] {
    %204 = qnn.conv2d(%FunctionVar_17_0, meta[relay.Constant][88] /* ty=Tensor[(1, 1, 528, 160), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00543337f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 160), int32] */;
    %205 = nn.bias_add(%204, meta[relay.Constant][89] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 160), int32] */;
    qnn.requantize(%205, 0.000282668f /* ty=float32 */, 0 /* ty=int32 */, 0.0368424f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 160), uint8] */
  };
  %207 = %206(%200) /* ty=Tensor[(1, 14, 14, 160), uint8] */;
  %208 = fn (%FunctionVar_16_0: Tensor[(1, 14, 14, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 320), uint8] {
    %202 = qnn.conv2d(%FunctionVar_16_0, meta[relay.Constant][86] /* ty=Tensor[(3, 3, 160, 320), uint8] */, 0 /* ty=int32 */, 85 /* ty=int32 */, 0.0368424f /* ty=float32 */, 0.00295774f /* ty=float32 */, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 320), int32] */;
    %203 = nn.bias_add(%202, meta[relay.Constant][87] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 320), int32] */;
    qnn.requantize(%203, 0.00010897f /* ty=float32 */, 0 /* ty=int32 */, 0.0384801f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 320), uint8] */
  };
  %213 = fn (%FunctionVar_15_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 32), uint8] {
    %211 = qnn.conv2d(%FunctionVar_15_0, meta[relay.Constant][92] /* ty=Tensor[(1, 1, 528, 32), uint8] */, 0 /* ty=int32 */, 126 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00506661f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 32), int32] */;
    %212 = nn.bias_add(%211, meta[relay.Constant][93] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 32), int32] */;
    qnn.requantize(%212, 0.000263587f /* ty=float32 */, 0 /* ty=int32 */, 0.0576595f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 32), uint8] */
  };
  %214 = %213(%200) /* ty=Tensor[(1, 14, 14, 32), uint8] */;
  %215 = fn (%FunctionVar_14_0: Tensor[(1, 14, 14, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] {
    %209 = qnn.conv2d(%FunctionVar_14_0, meta[relay.Constant][90] /* ty=Tensor[(3, 3, 32, 128), uint8] */, 0 /* ty=int32 */, 81 /* ty=int32 */, 0.0576595f /* ty=float32 */, 0.00359061f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */;
    %210 = nn.bias_add(%209, meta[relay.Constant][91] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */;
    qnn.requantize(%210, 0.000207033f /* ty=float32 */, 0 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */
  };
  %218 = nn.max_pool2d(%200, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 14, 14, 528), uint8] */;
  %219 = fn (%FunctionVar_13_0: Tensor[(1, 14, 14, 528), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 128), uint8] {
    %216 = qnn.conv2d(%FunctionVar_13_0, meta[relay.Constant][94] /* ty=Tensor[(1, 1, 528, 128), uint8] */, 0 /* ty=int32 */, 94 /* ty=int32 */, 0.0520244f /* ty=float32 */, 0.00317797f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 128), int32] */;
    %217 = nn.bias_add(%216, meta[relay.Constant][95] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 128), int32] */;
    qnn.requantize(%217, 0.000165332f /* ty=float32 */, 0 /* ty=int32 */, 0.0265916f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 128), uint8] */
  };
  %220 = %201(%200) /* ty=Tensor[(1, 14, 14, 256), uint8] */;
  %221 = %208(%207) /* ty=Tensor[(1, 14, 14, 320), uint8] */;
  %222 = %215(%214) /* ty=Tensor[(1, 14, 14, 128), uint8] */;
  %223 = %219(%218) /* ty=Tensor[(1, 14, 14, 128), uint8] */;
  %224 = (%220, %221, %222, %223);
  %225 = (0.0461338f /* ty=float32 */, 0.0384801f /* ty=float32 */, 0.0713473f /* ty=float32 */, 0.0265916f /* ty=float32 */);
  %226 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %227 = qnn.concatenate(%224, %225, %226, 0.0713473f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 14, 14, 832), uint8] */;
  %228 = nn.max_pool2d(%227, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */;
  %229 = fn (%FunctionVar_12_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 256), uint8] {
    %8 = qnn.conv2d(%FunctionVar_12_0, meta[relay.Constant][4] /* ty=Tensor[(1, 1, 832, 256), uint8] */, 0 /* ty=int32 */, 182 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.0104061f /* ty=float32 */, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 256), int32] */;
    %9 = nn.bias_add(%8, meta[relay.Constant][5] /* ty=Tensor[(256), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 256), int32] */;
    qnn.requantize(%9, 0.000742446f /* ty=float32 */, 0 /* ty=int32 */, 0.036752f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 256), uint8] */
  };
  %234 = fn (%FunctionVar_11_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] {
    %232 = qnn.conv2d(%FunctionVar_11_0, meta[relay.Constant][98] /* ty=Tensor[(1, 1, 832, 160), uint8] */, 0 /* ty=int32 */, 115 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00596868f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */;
    %233 = nn.bias_add(%232, meta[relay.Constant][99] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */;
    qnn.requantize(%233, 0.000425849f /* ty=float32 */, 0 /* ty=int32 */, 0.0490709f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */
  };
  %235 = %234(%228) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %236 = fn (%FunctionVar_10_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 320), uint8] {
    %230 = qnn.conv2d(%FunctionVar_10_0, meta[relay.Constant][96] /* ty=Tensor[(3, 3, 160, 320), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0490709f /* ty=float32 */, 0.00293286f /* ty=float32 */, padding=[1, 1, 1, 1], channels=320, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 320), int32] */;
    %231 = nn.bias_add(%230, meta[relay.Constant][97] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 320), int32] */;
    qnn.requantize(%231, 0.000143918f /* ty=float32 */, 0 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 320), uint8] */
  };
  %241 = fn (%FunctionVar_9_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 32), uint8] {
    %239 = qnn.conv2d(%FunctionVar_9_0, meta[relay.Constant][102] /* ty=Tensor[(1, 1, 832, 32), uint8] */, 0 /* ty=int32 */, 122 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00383815f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 32), int32] */;
    %240 = nn.bias_add(%239, meta[relay.Constant][103] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 32), int32] */;
    qnn.requantize(%240, 0.000273842f /* ty=float32 */, 0 /* ty=int32 */, 0.0411565f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 32), uint8] */
  };
  %242 = %241(%228) /* ty=Tensor[(1, 7, 7, 32), uint8] */;
  %243 = fn (%FunctionVar_8_0: Tensor[(1, 7, 7, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] {
    %237 = qnn.conv2d(%FunctionVar_8_0, meta[relay.Constant][100] /* ty=Tensor[(3, 3, 32, 128), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0411565f /* ty=float32 */, 0.002763f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */;
    %238 = nn.bias_add(%237, meta[relay.Constant][101] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */;
    qnn.requantize(%238, 0.000113715f /* ty=float32 */, 0 /* ty=int32 */, 0.0371453f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */
  };
  %246 = nn.max_pool2d(%228, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */;
  %247 = fn (%FunctionVar_7_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] {
    %244 = qnn.conv2d(%FunctionVar_7_0, meta[relay.Constant][104] /* ty=Tensor[(1, 1, 832, 128), uint8] */, 0 /* ty=int32 */, 123 /* ty=int32 */, 0.0713473f /* ty=float32 */, 0.00247852f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */;
    %245 = nn.bias_add(%244, meta[relay.Constant][105] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */;
    qnn.requantize(%245, 0.000176836f /* ty=float32 */, 0 /* ty=int32 */, 0.0213327f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */
  };
  %248 = %229(%228) /* ty=Tensor[(1, 7, 7, 256), uint8] */;
  %249 = %236(%235) /* ty=Tensor[(1, 7, 7, 320), uint8] */;
  %250 = %243(%242) /* ty=Tensor[(1, 7, 7, 128), uint8] */;
  %251 = %247(%246) /* ty=Tensor[(1, 7, 7, 128), uint8] */;
  %252 = (%248, %249, %250, %251);
  %253 = (0.036752f /* ty=float32 */, 0.0450282f /* ty=float32 */, 0.0371453f /* ty=float32 */, 0.0213327f /* ty=float32 */);
  %254 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %255 = qnn.concatenate(%252, %253, %254, 0.0450282f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 7, 7, 832), uint8] */;
  %256 = fn (%FunctionVar_6_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 384), uint8] {
    %6 = qnn.conv2d(%FunctionVar_6_0, meta[relay.Constant][2] /* ty=Tensor[(1, 1, 832, 384), uint8] */, 0 /* ty=int32 */, 104 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.0143784f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 384), int32] */;
    %7 = nn.bias_add(%6, meta[relay.Constant][3] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 384), int32] */;
    qnn.requantize(%7, 0.000647432f /* ty=float32 */, 0 /* ty=int32 */, 0.0470831f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 384), uint8] */
  };
  %261 = fn (%FunctionVar_5_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 192), uint8] {
    %259 = qnn.conv2d(%FunctionVar_5_0, meta[relay.Constant][108] /* ty=Tensor[(1, 1, 832, 192), uint8] */, 0 /* ty=int32 */, 81 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.00580293f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 192), int32] */;
    %260 = nn.bias_add(%259, meta[relay.Constant][109] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 192), int32] */;
    qnn.requantize(%260, 0.000261295f /* ty=float32 */, 0 /* ty=int32 */, 0.0443907f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 192), uint8] */
  };
  %262 = %261(%255) /* ty=Tensor[(1, 7, 7, 192), uint8] */;
  %263 = fn (%FunctionVar_4_0: Tensor[(1, 7, 7, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 384), uint8] {
    %257 = qnn.conv2d(%FunctionVar_4_0, meta[relay.Constant][106] /* ty=Tensor[(3, 3, 192, 384), uint8] */, 0 /* ty=int32 */, 83 /* ty=int32 */, 0.0443907f /* ty=float32 */, 0.00505402f /* ty=float32 */, padding=[1, 1, 1, 1], channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 384), int32] */;
    %258 = nn.bias_add(%257, meta[relay.Constant][107] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 384), int32] */;
    qnn.requantize(%258, 0.000224351f /* ty=float32 */, 0 /* ty=int32 */, 0.0483342f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 384), uint8] */
  };
  %268 = fn (%FunctionVar_3_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 48), uint8] {
    %266 = qnn.conv2d(%FunctionVar_3_0, meta[relay.Constant][112] /* ty=Tensor[(1, 1, 832, 48), uint8] */, 0 /* ty=int32 */, 87 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.00578726f /* ty=float32 */, padding=[0, 0, 0, 0], channels=48, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 48), int32] */;
    %267 = nn.bias_add(%266, meta[relay.Constant][113] /* ty=Tensor[(48), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 48), int32] */;
    qnn.requantize(%267, 0.00026059f /* ty=float32 */, 0 /* ty=int32 */, 0.0431175f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 48), uint8] */
  };
  %269 = %268(%255) /* ty=Tensor[(1, 7, 7, 48), uint8] */;
  %270 = fn (%FunctionVar_2_0: Tensor[(1, 7, 7, 48), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] {
    %264 = qnn.conv2d(%FunctionVar_2_0, meta[relay.Constant][110] /* ty=Tensor[(3, 3, 48, 128), uint8] */, 0 /* ty=int32 */, 74 /* ty=int32 */, 0.0431175f /* ty=float32 */, 0.00680263f /* ty=float32 */, padding=[1, 1, 1, 1], channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */;
    %265 = nn.bias_add(%264, meta[relay.Constant][111] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */;
    qnn.requantize(%265, 0.000293312f /* ty=float32 */, 0 /* ty=int32 */, 0.0535589f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */
  };
  %273 = nn.max_pool2d(%255, pool_size=[3, 3], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 7, 7, 832), uint8] */;
  %274 = fn (%FunctionVar_1_0: Tensor[(1, 7, 7, 832), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 128), uint8] {
    %271 = qnn.conv2d(%FunctionVar_1_0, meta[relay.Constant][114] /* ty=Tensor[(1, 1, 832, 128), uint8] */, 0 /* ty=int32 */, 62 /* ty=int32 */, 0.0450282f /* ty=float32 */, 0.0055094f /* ty=float32 */, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 128), int32] */;
    %272 = nn.bias_add(%271, meta[relay.Constant][115] /* ty=Tensor[(128), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 128), int32] */;
    qnn.requantize(%272, 0.000248078f /* ty=float32 */, 0 /* ty=int32 */, 0.0320987f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 128), uint8] */
  };
  %275 = %256(%255) /* ty=Tensor[(1, 7, 7, 384), uint8] */;
  %276 = %263(%262) /* ty=Tensor[(1, 7, 7, 384), uint8] */;
  %277 = %270(%269) /* ty=Tensor[(1, 7, 7, 128), uint8] */;
  %278 = %274(%273) /* ty=Tensor[(1, 7, 7, 128), uint8] */;
  %279 = (%275, %276, %277, %278);
  %280 = (0.0470831f /* ty=float32 */, 0.0483342f /* ty=float32 */, 0.0535589f /* ty=float32 */, 0.0320987f /* ty=float32 */);
  %281 = (0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */, 0 /* ty=int32 */);
  %282 = qnn.concatenate(%279, %280, %281, 0.0535589f /* ty=float32 */, 0 /* ty=int32 */, axis=3) /* ty=Tensor[(1, 7, 7, 1024), uint8] */;
  %283 = fn (%FunctionVar_0_02: Tensor[(1, 7, 7, 1024), uint8], PartitionedFromPattern="cast_nn.avg_pool2d_cast_", Composite="vsi_npu.qnn_avgpool2d") -> Tensor[(1, 1, 1, 1024), uint8] {
    %4 = cast(%FunctionVar_0_02, dtype="int32") /* ty=Tensor[(1, 7, 7, 1024), int32] */;
    %5 = nn.avg_pool2d(%4, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 1, 1, 1024), int32] */;
    cast(%5, dtype="uint8") /* ty=Tensor[(1, 1, 1, 1024), uint8] */
  };
  %284 = %283(%282) /* ty=Tensor[(1, 1, 1, 1024), uint8] */;
  %285 = fn (%FunctionVar_0_01: Tensor[(1, 1, 1, 1024), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 1, 1, 1001), uint8] {
    %2 = qnn.conv2d(%FunctionVar_0_01, meta[relay.Constant][0] /* ty=Tensor[(1, 1, 1024, 1001), uint8] */, 0 /* ty=int32 */, 106 /* ty=int32 */, 0.0535589f /* ty=float32 */, 0.00235748f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 1, 1, 1001), int32] */;
    %3 = nn.bias_add(%2, meta[relay.Constant][1] /* ty=Tensor[(1001), int32] */, axis=3) /* ty=Tensor[(1, 1, 1, 1001), int32] */;
    qnn.requantize(%3, 0.000126264f /* ty=float32 */, 0 /* ty=int32 */, 0.0962827f /* ty=float32 */, 60 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 1, 1, 1001), uint8] */
  };
  %286 = %285(%284) /* ty=Tensor[(1, 1, 1, 1001), uint8] */;
  %287 = reshape(%286, newshape=[-1, 1001]) /* ty=Tensor[(1, 1001), uint8] */;
  %288 = fn (%FunctionVar_0_0: Tensor[(1, 1001), uint8], PartitionedFromPattern="qnn.dequantize_nn.softmax_qnn.quantize_", Composite="vsi_npu.qnn_softmax") -> Tensor[(1, 1001), uint8] {
    %0 = qnn.dequantize(%FunctionVar_0_0, 0.0962827f /* ty=float32 */, 60 /* ty=int32 */) /* ty=Tensor[(1, 1001), float32] */;
    %1 = nn.softmax(%0, axis=1) /* ty=Tensor[(1, 1001), float32] */;
    qnn.quantize(%1, 0.00390625f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="uint8") /* ty=Tensor[(1, 1001), uint8] */
  };
  %288(%287) /* ty=Tensor[(1, 1001), uint8] */
}


This is important----> name_node.value() == tvmgen_default_vsi_npu_0
GraphMakerImpl::Create
TensorMakerImpl::InferCall: vsi_npu.qnn_softmax
TensorMakerImpl::InferCall: reshape
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.concatenate
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: nn.max_pool2d
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
GraphMakerImpl::VisitExpr_(TupleNode): 4
W [HandleLayoutInfer:268]Op 162: default layout inference pass.
VsiNpuModule::GetFunction: get_symbol
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
test_vsi_tflite_model_all.py:120: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the new recommended usage.
  graph, lib, params  = relay.build(mod, target, params=params)
VsiNpuModule::SaveToBinary
SaveToBinary: nbg size = 6884160
SaveToBinary: input size = 1
SaveToBinary: output size = 1
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule::SaveToBinary2
Printing device code to device_code.cl...
VsiNpuModule::LoadFromBinary
LoadFromBinary: nbg size = 6884160
LoadFromBinary: input size = 1
LoadFromBinary: output size = 1
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
(1, 224, 224, 3) ############
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0
Process Graph: 7 ms or 7232 us
VsiNpuModule::GetFunction: size: 2
[[  0   0 254   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0]]
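
For reference, the DeprecationWarning earlier in this log comes from the legacy three-value return of relay.build. A minimal sketch of the recommended tvm.contrib.graph_executor usage it points to (variable names are illustrative, not the exact test-script code; mod and params are assumed to come from relay.frontend.from_tflite):

import tvm
from tvm import relay
from tvm.contrib import graph_executor

target = "llvm"   # or the cross-compile target used for the board
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target, params=params)   # single factory module, no (graph, lib, params) tuple

dev = tvm.cpu(0)
m = graph_executor.GraphModule(lib["default"](dev))
m.set_input("input", input_data)                    # "input" matches the tensor name in the Relay IR above
m.run()
out = m.get_output(0).asnumpy()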


@fumao13579
Author

fumao13579 commented May 23, 2023

@sunshinemyson
https://drive.google.com/drive/folders/1iX8GyEWZbzoAdiuW0AcU-4pApwhtldFp?usp=share_link

So far, I have tried running the MobileNetV2 and SqueezeNet TFLite models both via TVM RPC and via local compilation on the Khadas VIM3 Pro, and both approaches produced normal output.
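
For context, the RPC flow I am using is roughly the following (a simplified sketch of what test_vsi_tflite_model_all.py does; the board IP, file names, and the vsi_npu partitioning step are illustrative placeholders):

import numpy as np
import tvm
from tvm import relay, rpc
from tvm.contrib import graph_executor, utils

# mod, params: obtained from relay.frontend.from_tflite on the quantized .tflite model,
# then partitioned for the vsi_npu codegen (the helper lives under
# python/tvm/relay/op/contrib/vsi_npu.py in this branch; omitted here).
target = "llvm -mtriple=aarch64-linux-gnu"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target, params=params)

tmp = utils.tempdir()
lib_path = tmp.relpath("lib.so")
lib.export_library(lib_path, cc="aarch64-linux-gnu-g++")   # cross-compile the host stub for the board

remote = rpc.connect("<board-ip>", 9090)   # board runs python3 -m tvm.exec.rpc_server --port=9090
remote.upload(lib_path)
rlib = remote.load_module("lib.so")

dev = remote.cpu(0)
m = graph_executor.GraphModule(rlib["default"](dev))
image = np.zeros((1, 224, 224, 3), dtype="uint8")   # placeholder for the preprocessed NHWC image
m.set_input("input", tvm.nd.array(image))
m.run()
out = m.get_output(0).asnumpy()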

Here is the output from running test_vsi_pytorch_model_all.py with the quantized MobileNetV2 TFLite model.

x86_64 Host

#productname=VSI SIMULATOR, pid=0x88
1. press any key and continue...
vsi_npu.py --> qnn.dequantize
vsi_npu.py --> nn.softmax
vsi_npu.py --> qnn.quantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.avg_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> reshape
This is important----> name_node.value() == tvmgen_default_vsi_npu_0
GraphMakerImpl::Create
graph gpuCount=1 interConnectRingCount=0
NN ring buffer is disabled
TensorMakerImpl::InferCall: vsi_npu.qnn_softmax
TensorMakerImpl::InferCall: reshape
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
graph gpuCount=1 interConnectRingCount=0
NN ring buffer is disabled
W [HandleLayoutInfer:268]Op 162: default layout inference pass.
---------------------------Begin VerifyTiling -------------------------
AXI-SRAM = 1048320 Bytes VIP-SRAM = 522240 Bytes SWTILING_PHASE_FEATURES[0, 0, 0]
  0 TP [(   3  224  224 1,   150528, 0x0x2bedf70(0x0x2bedf70, 0x(nil)) ->  224  224    3 1,   150528, 0x0x2db3860(0x0x2db3860, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(1 1, 1 1)] C[  1]
  1 TP [( 224  224    3 1,   150528, 0x0x2db3860(0x0x2db3860, 0x(nil)) ->  113  113   12 1,   153228, 0x0x49dcd70(0x0x49dcd70, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(1 1, 1 1)] P[  0] C[  2]
  2 NN [( 113  113   12 1,   153228, 0x0x49dcd70(0x0x49dcd70, 0x(nil)) ->  112  112   32 1,   401408, 0x0x2db4fe0(0x0x2db4fe0, 0x(nil))) k(2 2   12, 1792) pad(0 0) pool(1 1, 1 1)] P[  1] C[  3]
  3 NN [( 112  112   32 1,   401408, 0x0x2db4fe0(0x0x2db4fe0, 0x(nil)) ->  112  112   32 1,   401408, 0x0x2db82b0(0x0x2db82b0, 0x(nil))) k(3 3   32, 11392) pad(1 1) pool(1 1, 1 1)] P[  2] C[  4]
  4 NN [( 112  112   32 1,   401408, 0x0x2db82b0(0x0x2db82b0, 0x(nil)) ->  112  112   16 1,   200704, 0x0x2dbbc10(0x0x2dbbc10, 0x(nil))) k(1 1   32, 640) pad(0 0) pool(1 1, 1 1)] P[  3] C[  5]
  5 NN [( 112  112   16 1,   200704, 0x0x2dbbc10(0x0x2dbbc10, 0x(nil)) ->  112  112   96 1,  1204224, 0x0x2dbf550(0x0x2dbf550, 0x(nil))) k(1 1   16, 2048) pad(0 0) pool(1 1, 1 1)] P[  4] C[  6]
  6 NN [( 112  112   96 1,  1204224, 0x0x2dbf550(0x0x2dbf550, 0x(nil)) ->   56   56   96 1,   301056, 0x0x2dc2ed0(0x0x2dc2ed0, 0x(nil))) k(3 3   96, 103040) pad(0 0) pool(2 2, 2 2)] P[  5] C[  7]
  7 NN [(  56   56   96 1,   301056, 0x0x2dc2ed0(0x0x2dc2ed0, 0x(nil)) ->   56   56   24 1,    75264, 0x0xa8053b0(0x0x30953b0, 0x0x12600)) k(1 1   96, 2560) pad(0 0) pool(1 1, 1 1)] P[  6] C[  8, 11]
  8 NN [(  56   56   24 1,    75264, 0x0xa8053b0(0x0x30953b0, 0x0x12600) ->   56   56  144 1,   451584, 0x0x2dca7a0(0x0x2dca7a0, 0x(nil))) k(1 1   24, 4352) pad(0 0) pool(1 1, 1 1)] P[  7] C[  9]
  9 NN [(  56   56  144 1,   451584, 0x0x2dca7a0(0x0x2dca7a0, 0x(nil)) ->   56   56  144 1,   451584, 0x0x2dce100(0x0x2dce100, 0x(nil))) k(3 3  144, 231936) pad(1 1) pool(1 1, 1 1)] P[  8] C[ 10]
 10 NN [(  56   56  144 1,   451584, 0x0x2dce100(0x0x2dce100, 0x(nil)) ->   56   56   24 1,    75264, 0x0x30953b0(0x0x30953b0, 0x(nil))) k(1 1  144, 3840) pad(0 0) pool(1 1, 1 1)] P[  9] C[ 11]
 11 NN [( 128  588    2 1,   150528, 0x0x30953b0(0x0x30953b0, 0x(nil)) ->  128  588    1 1,    75264, 0x0x2dd53b0(0x0x2dd53b0, 0x(nil))) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[  7, 10] C[ 12]
 12 NN [(  56   56   24 1,    75264, 0x0x2dd53b0(0x0x2dd53b0, 0x(nil)) ->   56   56  144 1,   451584, 0x0x2dd63c0(0x0x2dd63c0, 0x(nil))) k(1 1   24, 4352) pad(0 0) pool(1 1, 1 1)] P[ 11] C[ 13]
 13 NN [(  56   56  144 1,   451584, 0x0x2dd63c0(0x0x2dd63c0, 0x(nil)) ->   28   28  144 1,   112896, 0x0x2dd9e30(0x0x2dd9e30, 0x(nil))) k(3 3  144, 231808) pad(0 0) pool(2 2, 2 2)] P[ 12] C[ 14]
 14 NN [(  28   28  144 1,   112896, 0x0x2dd9e30(0x0x2dd9e30, 0x(nil)) ->   28   28   32 1,    25088, 0x0x586b1b0(0x0x309b1b0, 0x0x6200)) k(1 1  144, 4992) pad(0 0) pool(1 1, 1 1)] P[ 13] C[ 15, 18]
 15 NN [(  28   28   32 1,    25088, 0x0x586b1b0(0x0x309b1b0, 0x0x6200) ->   28   28  192 1,   150528, 0x0x2de1370(0x0x2de1370, 0x(nil))) k(1 1   32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 14] C[ 16]
 16 NN [(  28   28  192 1,   150528, 0x0x2de1370(0x0x2de1370, 0x(nil)) ->   28   28  192 1,   150528, 0x0x2de4e20(0x0x2de4e20, 0x(nil))) k(3 3  192, 412544) pad(1 1) pool(1 1, 1 1)] P[ 15] C[ 17]
 17 NN [(  28   28  192 1,   150528, 0x0x2de4e20(0x0x2de4e20, 0x(nil)) ->   28   28   32 1,    25088, 0x0x309b1b0(0x0x309b1b0, 0x(nil))) k(1 1  192, 6656) pad(0 0) pool(1 1, 1 1)] P[ 16] C[ 18]
 18 NN [( 128  196    2 1,    50176, 0x0x309b1b0(0x0x309b1b0, 0x(nil)) ->  128  196    1 1,    25088, 0x0x5871110(0x0x30a1110, 0x0x6200)) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 14, 17] C[ 19, 22]
 19 NN [(  28   28   32 1,    25088, 0x0x5871110(0x0x30a1110, 0x0x6200) ->   28   28  192 1,   150528, 0x0x2ded700(0x0x2ded700, 0x(nil))) k(1 1   32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 18] C[ 20]
 20 NN [(  28   28  192 1,   150528, 0x0x2ded700(0x0x2ded700, 0x(nil)) ->   28   28  192 1,   150528, 0x0x2df0d50(0x0x2df0d50, 0x(nil))) k(3 3  192, 412544) pad(1 1) pool(1 1, 1 1)] P[ 19] C[ 21]
 21 NN [(  28   28  192 1,   150528, 0x0x2df0d50(0x0x2df0d50, 0x(nil)) ->   28   28   32 1,    25088, 0x0x30a1110(0x0x30a1110, 0x(nil))) k(1 1  192, 6656) pad(0 0) pool(1 1, 1 1)] P[ 20] C[ 22]
 22 NN [( 128  196    2 1,    50176, 0x0x30a1110(0x0x30a1110, 0x(nil)) ->  128  196    1 1,    25088, 0x0x2df7bd0(0x0x2df7bd0, 0x(nil))) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 18, 21] C[ 23]
 23 NN [(  28   28   32 1,    25088, 0x0x2df7bd0(0x0x2df7bd0, 0x(nil)) ->   28   28  192 1,   150528, 0x0x2df8cc0(0x0x2df8cc0, 0x(nil))) k(1 1   32, 7296) pad(0 0) pool(1 1, 1 1)] P[ 22] C[ 24]
 24 NN [(  28   28  192 1,   150528, 0x0x2df8cc0(0x0x2df8cc0, 0x(nil)) ->   14   14  192 1,    37632, 0x0x2dfcd30(0x0x2dfcd30, 0x(nil))) k(3 3  192, 412544) pad(0 0) pool(2 2, 2 2)] P[ 23] C[ 25]
 25 NN [(  14   14  192 1,    37632, 0x0x2dfcd30(0x0x2dfcd30, 0x(nil)) ->   14   14   64 1,    12544, 0x0x448f070(0x0x30a7070, 0x0x3100)) k(1 1  192, 13184) pad(0 0) pool(1 1, 1 1)] P[ 24] C[ 26, 29]
 26 NN [(  14   14   64 1,    12544, 0x0x448f070(0x0x30a7070, 0x0x3100) ->   14   14  384 1,    75264, 0x0x2e04ac0(0x0x2e04ac0, 0x(nil))) k(1 1   64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 25] C[ 27]
 27 NN [(  14   14  384 1,    75264, 0x0x2e04ac0(0x0x2e04ac0, 0x(nil)) ->   14   14  384 1,    75264, 0x0x2e08a20(0x0x2e08a20, 0x(nil))) k(3 3  384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 26] C[ 28]
 28 NN [(  14   14  384 1,    75264, 0x0x2e08a20(0x0x2e08a20, 0x(nil)) ->   14   14   64 1,    12544, 0x0x30a7070(0x0x30a7070, 0x(nil))) k(1 1  384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 27] C[ 29]
 29 NN [( 128   98    2 1,    25088, 0x0x30a7070(0x0x30a7070, 0x(nil)) ->  128   98    1 1,    12544, 0x0x4494fd0(0x0x30acfd0, 0x0x3100)) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 25, 28] C[ 30, 33]
 30 NN [(  14   14   64 1,    12544, 0x0x4494fd0(0x0x30acfd0, 0x0x3100) ->   14   14  384 1,    75264, 0x0x2e11970(0x0x2e11970, 0x(nil))) k(1 1   64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 29] C[ 31]
 31 NN [(  14   14  384 1,    75264, 0x0x2e11970(0x0x2e11970, 0x(nil)) ->   14   14  384 1,    75264, 0x0x2e15b30(0x0x2e15b30, 0x(nil))) k(3 3  384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 30] C[ 32]
 32 NN [(  14   14  384 1,    75264, 0x0x2e15b30(0x0x2e15b30, 0x(nil)) ->   14   14   64 1,    12544, 0x0x30acfd0(0x0x30acfd0, 0x(nil))) k(1 1  384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 31] C[ 33]
 33 NN [( 128   98    2 1,    25088, 0x0x30acfd0(0x0x30acfd0, 0x(nil)) ->  128   98    1 1,    12544, 0x0x449af30(0x0x30b2f30, 0x0x3100)) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 29, 32] C[ 34, 37]
 34 NN [(  14   14   64 1,    12544, 0x0x449af30(0x0x30b2f30, 0x0x3100) ->   14   14  384 1,    75264, 0x0x2efd220(0x0x2efd220, 0x(nil))) k(1 1   64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 33] C[ 35]
 35 NN [(  14   14  384 1,    75264, 0x0x2efd220(0x0x2efd220, 0x(nil)) ->   14   14  384 1,    75264, 0x0x2f013b0(0x0x2f013b0, 0x(nil))) k(3 3  384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 34] C[ 36]
 36 NN [(  14   14  384 1,    75264, 0x0x2f013b0(0x0x2f013b0, 0x(nil)) ->   14   14   64 1,    12544, 0x0x30b2f30(0x0x30b2f30, 0x(nil))) k(1 1  384, 26112) pad(0 0) pool(1 1, 1 1)] P[ 35] C[ 37]
 37 NN [( 128   98    2 1,    25088, 0x0x30b2f30(0x0x30b2f30, 0x(nil)) ->  128   98    1 1,    12544, 0x0x2f09240(0x0x2f09240, 0x(nil))) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 33, 36] C[ 38]
 38 NN [(  14   14   64 1,    12544, 0x0x2f09240(0x0x2f09240, 0x(nil)) ->   14   14  384 1,    75264, 0x0x2f0a330(0x0x2f0a330, 0x(nil))) k(1 1   64, 27520) pad(0 0) pool(1 1, 1 1)] P[ 37] C[ 39]
 39 NN [(  14   14  384 1,    75264, 0x0x2f0a330(0x0x2f0a330, 0x(nil)) ->   14   14  384 1,    75264, 0x0x2f0e4c0(0x0x2f0e4c0, 0x(nil))) k(3 3  384, 1650176) pad(1 1) pool(1 1, 1 1)] P[ 38] C[ 40]
 40 NN [(  14   14  384 1,    75264, 0x0x2f0e4c0(0x0x2f0e4c0, 0x(nil)) ->   14   14   96 1,    18816, 0x0x4e94eb0(0x0x30b8eb0, 0x0x4980)) k(1 1  384, 39168) pad(0 0) pool(1 1, 1 1)] P[ 39] C[ 41, 44]
 41 NN [(  14   14   96 1,    18816, 0x0x4e94eb0(0x0x30b8eb0, 0x0x4980) ->   14   14  576 1,   112896, 0x0x2f16370(0x0x2f16370, 0x(nil))) k(1 1   96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 40] C[ 42]
 42 NN [(  14   14  576 1,   112896, 0x0x2f16370(0x0x2f16370, 0x(nil)) ->   14   14  576 1,   112896, 0x0x2f1a2a0(0x0x2f1a2a0, 0x(nil))) k(3 3  576, 3712896) pad(1 1) pool(1 1, 1 1)] P[ 41] C[ 43]
 43 NN [(  14   14  576 1,   112896, 0x0x2f1a2a0(0x0x2f1a2a0, 0x(nil)) ->   14   14   96 1,    18816, 0x0x30b8eb0(0x0x30b8eb0, 0x(nil))) k(1 1  576, 58496) pad(0 0) pool(1 1, 1 1)] P[ 42] C[ 44]
 44 NN [( 128  147    2 1,    37632, 0x0x30b8eb0(0x0x30b8eb0, 0x(nil)) ->  128  147    1 1,    18816, 0x0x4e9ae30(0x0x30bee30, 0x0x4980)) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 40, 43] C[ 45, 48]
 45 NN [(  14   14   96 1,    18816, 0x0x4e9ae30(0x0x30bee30, 0x0x4980) ->   14   14  576 1,   112896, 0x0x2f23140(0x0x2f23140, 0x(nil))) k(1 1   96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 44] C[ 46]
 46 NN [(  14   14  576 1,   112896, 0x0x2f23140(0x0x2f23140, 0x(nil)) ->   14   14  576 1,   112896, 0x0x2f273a0(0x0x2f273a0, 0x(nil))) k(3 3  576, 3712896) pad(1 1) pool(1 1, 1 1)] P[ 45] C[ 47]
 47 NN [(  14   14  576 1,   112896, 0x0x2f273a0(0x0x2f273a0, 0x(nil)) ->   14   14   96 1,    18816, 0x0x30bee30(0x0x30bee30, 0x(nil))) k(1 1  576, 58496) pad(0 0) pool(1 1, 1 1)] P[ 46] C[ 48]
 48 NN [( 128  147    2 1,    37632, 0x0x30bee30(0x0x30bee30, 0x(nil)) ->  128  147    1 1,    18816, 0x0x2f2f230(0x0x2f2f230, 0x(nil))) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 44, 47] C[ 49]
 49 NN [(  14   14   96 1,    18816, 0x0x2f2f230(0x0x2f2f230, 0x(nil)) ->   14   14  576 1,   112896, 0x0x2f30240(0x0x2f30240, 0x(nil))) k(1 1   96, 60544) pad(0 0) pool(1 1, 1 1)] P[ 48] C[ 50]
 50 NN [(  14   14  576 1,   112896, 0x0x2f30240(0x0x2f30240, 0x(nil)) ->    7    7  576 1,    28224, 0x0x2f344c0(0x0x2f344c0, 0x(nil))) k(3 3  576, 3712768) pad(0 0) pool(2 2, 2 2)] P[ 49] C[ 51]
 51 NN [(   7    7  576 1,    28224, 0x0x2f344c0(0x0x2f344c0, 0x(nil)) ->    7    7  160 1,     7840, 0x0x3d35db0(0x0x30c4db0, 0x0x1ea0)) k(1 1  576, 97536) pad(0 0) pool(1 1, 1 1)] P[ 50] C[ 52, 55]
 52 NN [(   7    7  160 1,     7840, 0x0x3d35db0(0x0x30c4db0, 0x0x1ea0) ->    7    7  960 1,    47040, 0x0x2f3c370(0x0x2f3c370, 0x(nil))) k(1 1  160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 51] C[ 53]
 53 NN [(   7    7  960 1,    47040, 0x0x2f3c370(0x0x2f3c370, 0x(nil)) ->    7    7  960 1,    47040, 0x0x2f402d0(0x0x2f402d0, 0x(nil))) k(3 3  960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 52] C[ 54]
 54 NN [(   7    7  960 1,    47040, 0x0x2f402d0(0x0x2f402d0, 0x(nil)) ->    7    7  160 1,     7840, 0x0x30c4db0(0x0x30c4db0, 0x(nil))) k(1 1  960, 162048) pad(0 0) pool(1 1, 1 1)] P[ 53] C[ 55]
 55 NN [(  32  245    2 1,    15680, 0x0x30c4db0(0x0x30c4db0, 0x(nil)) ->   32  245    1 1,     7840, 0x0x3d3bd20(0x0x30cad20, 0x0x1ea0)) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 51, 54] C[ 56, 59]
 56 NN [(   7    7  160 1,     7840, 0x0x3d3bd20(0x0x30cad20, 0x0x1ea0) ->    7    7  960 1,    47040, 0x0x2f49260(0x0x2f49260, 0x(nil))) k(1 1  160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 55] C[ 57]
 57 NN [(   7    7  960 1,    47040, 0x0x2f49260(0x0x2f49260, 0x(nil)) ->    7    7  960 1,    47040, 0x0x2f4d400(0x0x2f4d400, 0x(nil))) k(3 3  960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 56] C[ 58]
 58 NN [(   7    7  960 1,    47040, 0x0x2f4d400(0x0x2f4d400, 0x(nil)) ->    7    7  160 1,     7840, 0x0x30cad20(0x0x30cad20, 0x(nil))) k(1 1  960, 162048) pad(0 0) pool(1 1, 1 1)] P[ 57] C[ 59]
 59 NN [(  32  245    2 1,    15680, 0x0x30cad20(0x0x30cad20, 0x(nil)) ->   32  245    1 1,     7840, 0x0x2f552b0(0x0x2f552b0, 0x(nil))) k(1 1    2, 128) pad(0 0) pool(1 1, 1 1)] P[ 55, 58] C[ 60]
 60 NN [(   7    7  160 1,     7840, 0x0x2f552b0(0x0x2f552b0, 0x(nil)) ->    7    7  960 1,    47040, 0x0x2f562c0(0x0x2f562c0, 0x(nil))) k(1 1  160, 165376) pad(0 0) pool(1 1, 1 1)] P[ 59] C[ 61]
 61 NN [(   7    7  960 1,    47040, 0x0x2f562c0(0x0x2f562c0, 0x(nil)) ->    7    7  960 1,    47040, 0x0x2f5a550(0x0x2f5a550, 0x(nil))) k(3 3  960, 10315648) pad(1 1) pool(1 1, 1 1)] P[ 60] C[ 62]
 62 NN [(   7    7  960 1,    47040, 0x0x2f5a550(0x0x2f5a550, 0x(nil)) ->    7    7  320 1,    15680, 0x0x2f5e550(0x0x2f5e550, 0x(nil))) k(1 1  960, 323968) pad(0 0) pool(1 1, 1 1)] P[ 61] C[ 63]
 63 NN [(   7    7  320 1,    15680, 0x0x2f5e550(0x0x2f5e550, 0x(nil)) ->    7    7 1280 1,    62720, 0x0x2f62400(0x0x2f62400, 0x(nil))) k(1 1  320, 435456) pad(0 0) pool(1 1, 1 1)] P[ 62] C[ 64]
 64 SH [(   7    7 1280 1,    62720, 0x0x2f62400(0x0x2f62400, 0x(nil)) ->    1    1 1280 1,     1280, 0x0x2f66360(0x0x2f66360, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 63] C[ 65]
 65 TP [(1280    1    1 1,     1280, 0x0x2f66360(0x0x2f66360, 0x(nil)) -> 1001    1    1 1,     1001, 0x0x2db2990(0x0x2db2990, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(1 1, 1 1)] P[ 64] C[ 66]
 66 SH [(1001    1    1 1,     1001, 0x0x2db2990(0x0x2db2990, 0x(nil)) -> 1001    1    1 1,     1001, 0x0x2db1ae0(0x0x2db1ae0, 0x(nil))) k(0 0    0, 0) pad(0 0) pool(0 0, 1 1)] P[ 65]

 id IN [ x  y  w   h ]   OUT  [ x  y  w  h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out)

 id | opid IN [ x  y  w   h ]   OUT  [ x  y  w  h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type) NNT(in, out)
  0 |   0 TP DD 0x0 [   0    0        3      224] -> DD 0x0 [   0    0      224      224] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  1 |   1 TP DD 0x0 [   0    0      224      224] -> DD 0x0 [   0    0      113      113] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
  2 |   2 NN DD 0x0 [   0    0      113      113] -> DD 0x0 [   0    0      112      112] ( 56,   2,   4) (    2176,     1536, 100.00%, 85.71%, DD) (       0,        0)
  3 |   3 NN DD 0x0 [   0    0      112      112] -> DD 0x0 [   0    0      112      112] ( 56,  10,   4) (   22528,     1536, 100.00%, 13.48%, DD) (       0,        0)
  4 |   4 NN DD 0x0 [   0    0      112      112] -> DD 0x0 [   0    0      112      112] ( 56,   2,   2) (    3584,     1024, 100.00%, 160.00%, DD) (       0,        0)
  5 |   5 NN DD 0x0 [   0    0      112      112] -> DD 0x0 [   0    0      112      112] ( 32,   1,  12) (     512,     2560, 100.00%, 125.00%, DD) (       0,        0)
  6 |   6 NN DD 0x0 [   0    0      112      112] -> DD 0x0 [   0    0       56       56] ( 56,  10,   6) (   67584,     6656, 100.00%, 6.46%, DD) (       0,        0)
  7 |   7 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   3) (   43008,     2560, 100.00%, 100.00%, DD) (       0,        0)
  8 |   8 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   6) (   10752,     5120, 100.00%, 117.65%, DD) (       0,        0)
  9 |   9 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   6) (   85248,    12800, 100.00%, 5.52%, DD) (       0,        0)
 10 |  10 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   3) (   64512,     4096, 100.00%, 106.67%, DD) (       0,        0)
 11 |  11 NN DD 0x0 [   0    0      128      588] -> DD 0x0 [   0    0      128      588] ( 64,  12,   1) (    1536,      512, 100.00%, 400.00%, DD) (       0,        0)
 12 |  12 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       56       56] ( 56,   8,   6) (   10752,     5120, 100.00%, 117.65%, DD) (       0,        0)
 13 |  13 NN DD 0x0 [   0    0       56       56] -> DD 0x0 [   0    0       28       28] ( 56,  10,   6) (  101376,    13312, 100.00%, 5.74%, DD) (       0,        0)
 14 |  14 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (   57600,     5120, 100.00%, 102.56%, DD) (       0,        0)
 15 |  15 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   12800,     8192, 100.00%, 112.28%, DD) (       0,        0)
 16 |  16 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   92160,    21504, 100.00%, 5.23%, DD) (       0,        0)
 17 |  17 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (   76800,     6656, 100.00%, 100.00%, DD) (       0,        0)
 18 |  18 NN DD 0x0 [   0    0      128      196] -> DD 0x0 [   0    0      128      196] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 19 |  19 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   12800,     8192, 100.00%, 112.28%, DD) (       0,        0)
 20 |  20 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   92160,    21504, 100.00%, 5.22%, DD) (       0,        0)
 21 |  21 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   4) (   76800,     6656, 100.00%, 100.00%, DD) (       0,        0)
 22 |  22 NN DD 0x0 [   0    0      128      196] -> DD 0x0 [   0    0      128      196] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 23 |  23 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       28       28] ( 28,  14,   8) (   12800,     8192, 100.00%, 112.28%, DD) (       0,        0)
 24 |  24 NN DD 0x0 [   0    0       28       28] -> DD 0x0 [   0    0       14       14] ( 28,  16,   8) (  104448,    21504, 100.00%, 5.23%, DD) (       0,        0)
 25 |  25 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (   39936,    13312, 100.00%, 100.97%, DD) (       0,        0)
 26 |  26 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   13312,    28160, 100.00%, 102.33%, DD) (       0,        0)
 27 |  27 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   98304,    76800, 100.00%, 4.66%, DD) (       0,        0)
 28 |  28 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (   79872,    25600, 100.00%, 98.04%, DD) (       0,        0)
 29 |  29 NN DD 0x0 [   0    0      128       98] -> DD 0x0 [   0    0      128       98] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 30 |  30 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   13312,    28160, 100.00%, 102.33%, DD) (       0,        0)
 31 |  31 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   98304,    76800, 100.00%, 4.66%, DD) (       0,        0)
 32 |  32 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (   79872,    25600, 100.00%, 98.04%, DD) (       0,        0)
 33 |  33 NN DD 0x0 [   0    0      128       98] -> DD 0x0 [   0    0      128       98] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 34 |  34 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   13312,    28160, 100.00%, 102.33%, DD) (       0,        0)
 35 |  35 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   98304,    76800, 100.00%, 4.66%, DD) (       0,        0)
 36 |  36 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,   8) (   79872,    25600, 100.00%, 98.04%, DD) (       0,        0)
 37 |  37 NN DD 0x0 [   0    0      128       98] -> DD 0x0 [   0    0      128       98] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 38 |  38 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   13312,    28160, 100.00%, 102.33%, DD) (       0,        0)
 39 |  39 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  16) (   98304,    76288, 100.00%, 4.63%, DD) (       0,        0)
 40 |  40 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  12) (   79872,    37888, 100.00%, 96.73%, DD) (       0,        0)
 41 |  41 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (   19968,    60416, 100.00%, 99.79%, DD) (       0,        0)
 42 |  42 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (  147456,   165376, 100.00%, 4.46%, DD) (       0,        0)
 43 |  43 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  12) (  119808,    56320, 100.00%, 96.28%, DD) (       0,        0)
 44 |  44 NN DD 0x0 [   0    0      128      147] -> DD 0x0 [   0    0      128      147] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 45 |  45 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (   19968,    60416, 100.00%, 99.79%, DD) (       0,        0)
 46 |  46 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (  147456,   165888, 100.00%, 4.47%, DD) (       0,        0)
 47 |  47 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  12) (  119808,    56320, 100.00%, 96.28%, DD) (       0,        0)
 48 |  48 NN DD 0x0 [   0    0      128      147] -> DD 0x0 [   0    0      128      147] ( 64,   7,   1) (     896,      512, 100.00%, 400.00%, DD) (       0,        0)
 49 |  49 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0       14       14] ( 14,  14,  18) (   19968,    60416, 100.00%, 99.79%, DD) (       0,        0)
 50 |  50 NN DD 0x0 [   0    0       14       14] -> DD 0x0 [   0    0        7        7] ( 14,  14,  18) (  147456,   165888, 100.00%, 4.47%, DD) (       0,        0)
 51 |  51 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   36864,    93696, 100.00%, 96.06%, DD) (       0,        0)
 52 |  52 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   10240,   161792, 100.00%, 97.83%, DD) (       0,        0)
 53 |  53 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   92160,   430080, 96.89%, 4.31%, DD) (       0,        0)
 54 |  54 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   61440,   155136, 100.00%, 95.73%, DD) (       0,        0)
 55 |  55 NN DD 0x0 [   0    0       32      245] -> DD 0x0 [   0    0       32      245] ( 32,  18,   1) (    1152,      512, 100.00%, 400.00%, DD) (       0,        0)
 56 |  56 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   10240,   161792, 100.00%, 97.83%, DD) (       0,        0)
 57 |  57 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   92160,   430080, 96.89%, 4.31%, DD) (       0,        0)
 58 |  58 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   61440,   155136, 100.00%, 95.73%, DD) (       0,        0)
 59 |  59 NN DD 0x0 [   0    0       32      245] -> DD 0x0 [   0    0       32      245] ( 32,  18,   1) (    1152,      512, 100.00%, 400.00%, DD) (       0,        0)
 60 |  60 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   10240,   161792, 100.00%, 97.83%, DD) (       0,        0)
 61 |  61 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  30) (   92160,   430080, 96.77%, 4.31%, DD) (       0,        0)
 62 |  62 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  20) (   61440,   310272, 100.00%, 95.77%, DD) (       0,        0)
 63 |  63 NN DD 0x0 [   0    0        7        7] -> DD 0x0 [   0    0        7        7] (  7,   7,  32) (   20480,   420352, 100.00%, 96.53%, DD) (       0,        0)
 64 |  64 SH DD 0x0 [   0    0        0        0] -> DD 0x0 [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 65 |  65 TP DD 0x0 [   0    0     1280        1] -> DD 0x0 [   0    0     1001        1] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)
 66 |  66 SH DD 0x0 [   0    0        0        0] -> DD 0x0 [   0    0        0        0] (  0,   0,   0) (       0,        0, 0.00%, 0.00%, NONE) (       0,        0)

PreLoadWeightBiases = 1048320  100.000000%
---------------------------End VerifyTiling -------------------------
KernelStreamSize: 0x583500, statesSize: 0x1c00, shShareMemSize: 0x0, shIntrSize: 0x700, shParaSize: 0x440, swParaSize: 0x0, lcdTensorSize: 0x0, shaderStatesSize: 0x9c0, tensorStatic: 0x0
NBG: operationSize: 0x86c, nnSize: 0x1f80, tpSize: 0x780, shSize: 0x10, swSize: 0x0, layerParamSize: 0x0, lcdtSize: 0x4a8, patchSize: 0x3de0, icdtSize: 0xe8 hwInitOpSize: 0x24, lcdSize 0x585d40
NBG: entranceSize: 0x208, nbIOSize: 0xe8, layeSize: 0x1398, sectionsSize: 0x7310, inputoutput size: 0x24fe9, InitCommands size: 0x1104
NBG: lcdSize: 0x585d40, headerSize : 0x8998
Calculate NBG size : 5830876 bytes
generate NBG into memory start.
vxoBinaryGraph_SaveBinaryEntrance[20461]: collect input count=0, output count=0
vxoBinaryGraph_SaveBinaryEntrance[20531]: total operation count=67
generate NBG, device count=1, core count per-device: 1,
vxoBinaryGraph_RefineInputOutput:11143 input table address: 0x15469c0
vxoBinaryGraph_RefineInputOutput:11149 output table address: 0x18d4a80
vxoBinaryGraph_SaveBinaryEntranceExt[19524]: graph->inputCount=1, graph->outputCount=1, refine inputCount=1, outputCount=1
NBG network name field : dummy_network_name
vxoBinaryGraph_SaveBinaryEntranceExt[20127]: header input count=1, output count=1
generate NBG, save initialize commands
vxoBinaryGraph_ReSaveInputAndPatchTable[17202]: re-save operation count=74
Generate NBG in memory Actual NBG size : 5822976 bytes
generate NBG into memory successfully.
VsiNpuModule::GetFunction: get_symbol
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
=======imported_modules======== [Module(llvm, 4fb2c68), Module(vsi_npu, 5233ee8)]
=======imported_modules[0]======== ; ModuleID = 'empty_module'
source_filename = "empty_module"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-linux-gnu"

VsiNpuModule::SaveToBinary
SaveToBinary: nbg size = 5822976
SaveToBinary: input size = 1
SaveToBinary: output size = 1
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule::SaveToBinary2
[[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0
    0   0   0   0   0   0   0   0   1   1   0   0 112  66   7   1  24   0
   10   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0]]

Khadas VIM3 pro

khadas@Khadas:~$ python3 -m tvm.exec.rpc_server --host 0.0.0.0 --port=9090
INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork```
INFO:RPCServer:bind to 0.0.0.0:9090
INFO:RPCServer:connection from ('192.168.137.177', 34272)
VsiNpuModule::LoadFromBinary
LoadFromBinary: nbg size = 5822976
LoadFromBinary: input size = 1
LoadFromBinary: output size = 1
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
INFO:RPCServer:load_module /tmp/tmpum5rchg2/lib.so
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0
[     1] PLS isn't existed
Process Graph: 7 ms or 7670 us
VsiNpuModule::GetFunction: size: 2
INFO:RPCServer:Finish serving ('192.168.137.177', 34272)

This is the output of local compilation on the Khadas VIM3 Pro, using the quantized MobileNetV2 TFLite model.
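
(The full log is in the Details block below. The local run itself is roughly the following; names are illustrative, and the last lines are just a quick check for the empty-output symptom:)

import numpy as np
import tvm
from tvm.contrib import graph_executor

lib = tvm.runtime.load_module("lib.so")   # exported with lib.export_library after relay.build on the board
dev = tvm.cpu(0)
m = graph_executor.GraphModule(lib["default"](dev))
image = np.zeros((1, 224, 224, 3), dtype="uint8")   # placeholder for the preprocessed NHWC image
m.set_input("input", tvm.nd.array(image))
m.run()
out = m.get_output(0).asnumpy()   # (1, 1001) uint8 for these models

# A valid (de)quantized softmax over 1001 classes should not be all zeros:
print("non-zero entries:", np.count_nonzero(out))
print("top-5 classes:", out[0].argsort()[-5:][::-1])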

Details

khadas@Khadas:~/$ python3 test_vsi_tflite_model_all.py 
#[version = "0.0.5"]
def @main(%input: Tensor[(1, 224, 224, 3), uint8], %v_param_1: Tensor[(3, 3, 3, 32), uint8], %v_param_2: Tensor[(32), int32], %v_param_3: Tensor[(3, 3, 32, 1), uint8], %v_param_4: Tensor[(32), int32], %v_param_5: Tensor[(1, 1, 32, 16), uint8], %v_param_6: Tensor[(16), int32], %v_param_7: Tensor[(1, 1, 16, 96), uint8], %v_param_8: Tensor[(96), int32], %v_param_9: Tensor[(3, 3, 96, 1), uint8], %v_param_10: Tensor[(96), int32], %v_param_11: Tensor[(1, 1, 96, 24), uint8], %v_param_12: Tensor[(24), int32], %v_param_13: Tensor[(1, 1, 24, 144), uint8], %v_param_14: Tensor[(144), int32], %v_param_15: Tensor[(3, 3, 144, 1), uint8], %v_param_16: Tensor[(144), int32], %v_param_17: Tensor[(1, 1, 144, 24), uint8], %v_param_18: Tensor[(24), int32], %v_param_19: Tensor[(1, 1, 24, 144), uint8], %v_param_20: Tensor[(144), int32], %v_param_21: Tensor[(3, 3, 144, 1), uint8], %v_param_22: Tensor[(144), int32], %v_param_23: Tensor[(1, 1, 144, 32), uint8], %v_param_24: Tensor[(32), int32], %v_param_25: Tensor[(1, 1, 32, 192), uint8], %v_param_26: Tensor[(192), int32], %v_param_27: Tensor[(3, 3, 192, 1), uint8], %v_param_28: Tensor[(192), int32], %v_param_29: Tensor[(1, 1, 192, 32), uint8], %v_param_30: Tensor[(32), int32], %v_param_31: Tensor[(1, 1, 32, 192), uint8], %v_param_32: Tensor[(192), int32], %v_param_33: Tensor[(3, 3, 192, 1), uint8], %v_param_34: Tensor[(192), int32], %v_param_35: Tensor[(1, 1, 192, 32), uint8], %v_param_36: Tensor[(32), int32], %v_param_37: Tensor[(1, 1, 32, 192), uint8], %v_param_38: Tensor[(192), int32], %v_param_39: Tensor[(3, 3, 192, 1), uint8], %v_param_40: Tensor[(192), int32], %v_param_41: Tensor[(1, 1, 192, 64), uint8], %v_param_42: Tensor[(64), int32], %v_param_43: Tensor[(1, 1, 64, 384), uint8], %v_param_44: Tensor[(384), int32], %v_param_45: Tensor[(3, 3, 384, 1), uint8], %v_param_46: Tensor[(384), int32], %v_param_47: Tensor[(1, 1, 384, 64), uint8], %v_param_48: Tensor[(64), int32], %v_param_49: Tensor[(1, 1, 64, 384), uint8], %v_param_50: Tensor[(384), int32], %v_param_51: Tensor[(3, 3, 384, 1), uint8], %v_param_52: Tensor[(384), int32], %v_param_53: Tensor[(1, 1, 384, 64), uint8], %v_param_54: Tensor[(64), int32], %v_param_55: Tensor[(1, 1, 64, 384), uint8], %v_param_56: Tensor[(384), int32], %v_param_57: Tensor[(3, 3, 384, 1), uint8], %v_param_58: Tensor[(384), int32], %v_param_59: Tensor[(1, 1, 384, 64), uint8], %v_param_60: Tensor[(64), int32], %v_param_61: Tensor[(1, 1, 64, 384), uint8], %v_param_62: Tensor[(384), int32], %v_param_63: Tensor[(3, 3, 384, 1), uint8], %v_param_64: Tensor[(384), int32], %v_param_65: Tensor[(1, 1, 384, 96), uint8], %v_param_66: Tensor[(96), int32], %v_param_67: Tensor[(1, 1, 96, 576), uint8], %v_param_68: Tensor[(576), int32], %v_param_69: Tensor[(3, 3, 576, 1), uint8], %v_param_70: Tensor[(576), int32], %v_param_71: Tensor[(1, 1, 576, 96), uint8], %v_param_72: Tensor[(96), int32], %v_param_73: Tensor[(1, 1, 96, 576), uint8], %v_param_74: Tensor[(576), int32], %v_param_75: Tensor[(3, 3, 576, 1), uint8], %v_param_76: Tensor[(576), int32], %v_param_77: Tensor[(1, 1, 576, 96), uint8], %v_param_78: Tensor[(96), int32], %v_param_79: Tensor[(1, 1, 96, 576), uint8], %v_param_80: Tensor[(576), int32], %v_param_81: Tensor[(3, 3, 576, 1), uint8], %v_param_82: Tensor[(576), int32], %v_param_83: Tensor[(1, 1, 576, 160), uint8], %v_param_84: Tensor[(160), int32], %v_param_85: Tensor[(1, 1, 160, 960), uint8], %v_param_86: Tensor[(960), int32], %v_param_87: Tensor[(3, 3, 960, 1), uint8], %v_param_88: Tensor[(960), int32], %v_param_89: Tensor[(1, 1, 960, 160), uint8], %v_param_90: Tensor[(160), int32], %v_param_91: Tensor[(1, 1, 160, 960), uint8], %v_param_92: Tensor[(960), int32], %v_param_93: Tensor[(3, 3, 960, 1), uint8], %v_param_94: Tensor[(960), int32], %v_param_95: Tensor[(1, 1, 960, 160), uint8], %v_param_96: Tensor[(160), int32], %v_param_97: Tensor[(1, 1, 160, 960), uint8], %v_param_98: Tensor[(960), int32], %v_param_99: Tensor[(3, 3, 960, 1), uint8], %v_param_100: Tensor[(960), int32], %v_param_101: Tensor[(1, 1, 960, 320), uint8], %v_param_102: Tensor[(320), int32], %v_param_103: Tensor[(1, 1, 320, 1280), uint8], %v_param_104: Tensor[(1280), int32], %v_param_105: Tensor[(1, 1, 1280, 1001), uint8], %v_param_106: Tensor[(1001), int32]) {
  %0 = qnn.conv2d(%input, %v_param_1, 128, 115, 0.00787402f, 0.0287749f, strides=[2, 2], padding=[0, 0, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %1 = nn.bias_add(%0, %v_param_2, axis=3);
  %2 = qnn.requantize(%1, 0.000226574f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %3 = qnn.conv2d(%2, %v_param_3, 0, 165, 0.0235285f, 0.343696f, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %4 = nn.bias_add(%3, %v_param_4, axis=3);
  %5 = qnn.requantize(%4, 0.00808663f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %6 = qnn.conv2d(%5, %v_param_5, 0, 141, 0.0235285f, 0.0381986f, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %7 = nn.bias_add(%6, %v_param_6, axis=3);
  %8 = qnn.requantize(%7, 0.000898756f, 0, 0.362873f, 122, axis=3, out_dtype="uint8");
  %9 = qnn.conv2d(%8, %v_param_7, 122, 127, 0.362873f, 0.00954309f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %10 = nn.bias_add(%9, %v_param_8, axis=3);
  %11 = qnn.requantize(%10, 0.00346293f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %12 = qnn.conv2d(%11, %v_param_9, 0, 109, 0.0235285f, 0.0194444f, strides=[2, 2], padding=[0, 0, 1, 1], groups=96, channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %13 = nn.bias_add(%12, %v_param_10, axis=3);
  %14 = qnn.requantize(%13, 0.000457496f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %15 = qnn.conv2d(%14, %v_param_11, 0, 152, 0.0235285f, 0.0225397f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %16 = nn.bias_add(%15, %v_param_12, axis=3);
  %17 = qnn.requantize(%16, 0.000530324f, 0, 0.282426f, 122, axis=3, out_dtype="uint8");
  %18 = qnn.conv2d(%17, %v_param_13, 122, 145, 0.282426f, 0.00369501f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %19 = nn.bias_add(%18, %v_param_14, axis=3);
  %20 = qnn.requantize(%19, 0.00104357f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %21 = qnn.conv2d(%20, %v_param_15, 0, 52, 0.0235285f, 0.169819f, padding=[1, 1, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %22 = nn.bias_add(%21, %v_param_16, axis=3);
  %23 = qnn.requantize(%22, 0.00399559f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %24 = qnn.conv2d(%23, %v_param_17, 0, 122, 0.0235285f, 0.026759f, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %25 = nn.bias_add(%24, %v_param_18, axis=3);
  %26 = qnn.requantize(%25, 0.000629599f, 0, 0.410429f, 137, axis=3, out_dtype="uint8");
  %27 = qnn.add(%26, %17, 0.410429f, 137, 0.282426f, 122, 0.448443f, 130);
  %28 = qnn.conv2d(%27, %v_param_19, 130, 104, 0.448443f, 0.0029434f, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %29 = nn.bias_add(%28, %v_param_20, axis=3);
  %30 = qnn.requantize(%29, 0.00131995f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %31 = qnn.conv2d(%30, %v_param_21, 0, 144, 0.0235285f, 0.0171147f, strides=[2, 2], padding=[0, 0, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %32 = nn.bias_add(%31, %v_param_22, axis=3);
  %33 = qnn.requantize(%32, 0.000402683f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %34 = qnn.conv2d(%33, %v_param_23, 0, 114, 0.0235285f, 0.016776f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %35 = nn.bias_add(%34, %v_param_24, axis=3);
  %36 = qnn.requantize(%35, 0.000394715f, 0, 0.224783f, 128, axis=3, out_dtype="uint8");
  %37 = qnn.conv2d(%36, %v_param_25, 128, 122, 0.224783f, 0.00210703f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %38 = nn.bias_add(%37, %v_param_26, axis=3);
  %39 = qnn.requantize(%38, 0.000473626f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %40 = qnn.conv2d(%39, %v_param_27, 0, 111, 0.0235285f, 0.0671548f, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %41 = nn.bias_add(%40, %v_param_28, axis=3);
  %42 = qnn.requantize(%41, 0.00158005f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %43 = qnn.conv2d(%42, %v_param_29, 0, 148, 0.0235285f, 0.0199821f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %44 = nn.bias_add(%43, %v_param_30, axis=3);
  %45 = qnn.requantize(%44, 0.000470149f, 0, 0.231107f, 120, axis=3, out_dtype="uint8");
  %46 = qnn.add(%45, %36, 0.231107f, 120, 0.224783f, 128, 0.271938f, 130);
  %47 = qnn.conv2d(%46, %v_param_31, 130, 119, 0.271938f, 0.00149126f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %48 = nn.bias_add(%47, %v_param_32, axis=3);
  %49 = qnn.requantize(%48, 0.00040553f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %50 = qnn.conv2d(%49, %v_param_33, 0, 89, 0.0235285f, 0.0805961f, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %51 = nn.bias_add(%50, %v_param_34, axis=3);
  %52 = qnn.requantize(%51, 0.0018963f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %53 = qnn.conv2d(%52, %v_param_35, 0, 127, 0.0235285f, 0.018966f, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %54 = nn.bias_add(%53, %v_param_36, axis=3);
  %55 = qnn.requantize(%54, 0.00044624f, 0, 0.268485f, 124, axis=3, out_dtype="uint8");
  %56 = qnn.add(%55, %46, 0.268485f, 124, 0.271938f, 130, 0.349583f, 124);
  %57 = qnn.conv2d(%56, %v_param_37, 124, 129, 0.349583f, 0.00188541f, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %58 = nn.bias_add(%57, %v_param_38, axis=3);
  %59 = qnn.requantize(%58, 0.000659109f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %60 = qnn.conv2d(%59, %v_param_39, 0, 129, 0.0235285f, 0.00993869f, strides=[2, 2], padding=[0, 0, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %61 = nn.bias_add(%60, %v_param_40, axis=3);
  %62 = qnn.requantize(%61, 0.000233842f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %63 = qnn.conv2d(%62, %v_param_41, 0, 144, 0.0235285f, 0.0145759f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %64 = nn.bias_add(%63, %v_param_42, axis=3);
  %65 = qnn.requantize(%64, 0.000342948f, 0, 0.193133f, 125, axis=3, out_dtype="uint8");
  %66 = qnn.conv2d(%65, %v_param_43, 125, 126, 0.193133f, 0.00157124f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %67 = nn.bias_add(%66, %v_param_44, axis=3);
  %68 = qnn.requantize(%67, 0.000303459f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %69 = qnn.conv2d(%68, %v_param_45, 0, 105, 0.0235285f, 0.0612184f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %70 = nn.bias_add(%69, %v_param_46, axis=3);
  %71 = qnn.requantize(%70, 0.00144038f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %72 = qnn.conv2d(%71, %v_param_47, 0, 127, 0.0235285f, 0.0187498f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %73 = nn.bias_add(%72, %v_param_48, axis=3);
  %74 = qnn.requantize(%73, 0.000441155f, 0, 0.180298f, 108, axis=3, out_dtype="uint8");
  %75 = qnn.add(%74, %65, 0.180298f, 108, 0.193133f, 125, 0.197618f, 120);
  %76 = qnn.conv2d(%75, %v_param_49, 120, 135, 0.197618f, 0.00145681f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %77 = nn.bias_add(%76, %v_param_50, axis=3);
  %78 = qnn.requantize(%77, 0.000287892f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %79 = qnn.conv2d(%78, %v_param_51, 0, 133, 0.0235285f, 0.0509263f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %80 = nn.bias_add(%79, %v_param_52, axis=3);
  %81 = qnn.requantize(%80, 0.00119822f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %82 = qnn.conv2d(%81, %v_param_53, 0, 126, 0.0235285f, 0.0130952f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %83 = nn.bias_add(%82, %v_param_54, axis=3);
  %84 = qnn.requantize(%83, 0.000308111f, 0, 0.152346f, 125, axis=3, out_dtype="uint8");
  %85 = qnn.add(%84, %75, 0.152346f, 125, 0.197618f, 120, 0.209317f, 123);
  %86 = qnn.conv2d(%85, %v_param_55, 123, 127, 0.209317f, 0.00133576f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %87 = nn.bias_add(%86, %v_param_56, axis=3);
  %88 = qnn.requantize(%87, 0.000279598f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %89 = qnn.conv2d(%88, %v_param_57, 0, 156, 0.0235285f, 0.0404159f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %90 = nn.bias_add(%89, %v_param_58, axis=3);
  %91 = qnn.requantize(%90, 0.000950924f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %92 = qnn.conv2d(%91, %v_param_59, 0, 148, 0.0235285f, 0.0192269f, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %93 = nn.bias_add(%92, %v_param_60, axis=3);
  %94 = qnn.requantize(%93, 0.00045238f, 0, 0.16256f, 119, axis=3, out_dtype="uint8");
  %95 = qnn.add(%94, %85, 0.16256f, 119, 0.209317f, 123, 0.227132f, 122);
  %96 = qnn.conv2d(%95, %v_param_61, 122, 132, 0.227132f, 0.00162901f, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %97 = nn.bias_add(%96, %v_param_62, axis=3);
  %98 = qnn.requantize(%97, 0.000370001f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %99 = qnn.conv2d(%98, %v_param_63, 0, 142, 0.0235285f, 0.0308997f, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %100 = nn.bias_add(%99, %v_param_64, axis=3);
  %101 = qnn.requantize(%100, 0.000727024f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %102 = qnn.conv2d(%101, %v_param_65, 0, 128, 0.0235285f, 0.00727967f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %103 = nn.bias_add(%102, %v_param_66, axis=3);
  %104 = qnn.requantize(%103, 0.000171279f, 0, 0.172015f, 128, axis=3, out_dtype="uint8");
  %105 = qnn.conv2d(%104, %v_param_67, 128, 131, 0.172015f, 0.00161979f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %106 = nn.bias_add(%105, %v_param_68, axis=3);
  %107 = qnn.requantize(%106, 0.000278629f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %108 = qnn.conv2d(%107, %v_param_69, 0, 66, 0.0235285f, 0.0708156f, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %109 = nn.bias_add(%108, %v_param_70, axis=3);
  %110 = qnn.requantize(%109, 0.00166618f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %111 = qnn.conv2d(%110, %v_param_71, 0, 135, 0.0235285f, 0.00841983f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %112 = nn.bias_add(%111, %v_param_72, axis=3);
  %113 = qnn.requantize(%112, 0.000198106f, 0, 0.128486f, 127, axis=3, out_dtype="uint8");
  %114 = qnn.add(%113, %104, 0.128486f, 127, 0.172015f, 128, 0.179783f, 126);
  %115 = qnn.conv2d(%114, %v_param_73, 126, 138, 0.179783f, 0.00180177f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %116 = nn.bias_add(%115, %v_param_74, axis=3);
  %117 = qnn.requantize(%116, 0.000323928f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %118 = qnn.conv2d(%117, %v_param_75, 0, 154, 0.0235285f, 0.0698695f, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %119 = nn.bias_add(%118, %v_param_76, axis=3);
  %120 = qnn.requantize(%119, 0.00164392f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %121 = qnn.conv2d(%120, %v_param_77, 0, 155, 0.0235285f, 0.0236749f, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %122 = nn.bias_add(%121, %v_param_78, axis=3);
  %123 = qnn.requantize(%122, 0.000557034f, 0, 0.190479f, 127, axis=3, out_dtype="uint8");
  %124 = qnn.add(%123, %114, 0.190479f, 127, 0.179783f, 126, 0.245143f, 126);
  %125 = qnn.conv2d(%124, %v_param_79, 126, 125, 0.245143f, 0.00139799f, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %126 = nn.bias_add(%125, %v_param_80, axis=3);
  %127 = qnn.requantize(%126, 0.000342707f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %128 = qnn.conv2d(%127, %v_param_81, 0, 92, 0.0235285f, 0.0148872f, strides=[2, 2], padding=[0, 0, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %129 = nn.bias_add(%128, %v_param_82, axis=3);
  %130 = qnn.requantize(%129, 0.000350273f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %131 = qnn.conv2d(%130, %v_param_83, 0, 139, 0.0235285f, 0.00922072f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %132 = nn.bias_add(%131, %v_param_84, axis=3);
  %133 = qnn.requantize(%132, 0.00021695f, 0, 0.131885f, 131, axis=3, out_dtype="uint8");
  %134 = qnn.conv2d(%133, %v_param_85, 131, 141, 0.131885f, 0.00211018f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %135 = nn.bias_add(%134, %v_param_86, axis=3);
  %136 = qnn.requantize(%135, 0.000278301f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %137 = qnn.conv2d(%136, %v_param_87, 0, 146, 0.0235285f, 0.0409658f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %138 = nn.bias_add(%137, %v_param_88, axis=3);
  %139 = qnn.requantize(%138, 0.000963862f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %140 = qnn.conv2d(%139, %v_param_89, 0, 136, 0.0235285f, 0.00783742f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %141 = nn.bias_add(%140, %v_param_90, axis=3);
  %142 = qnn.requantize(%141, 0.000184403f, 0, 0.104162f, 130, axis=3, out_dtype="uint8");
  %143 = qnn.add(%142, %133, 0.104162f, 130, 0.131885f, 131, 0.15034f, 133);
  %144 = qnn.conv2d(%143, %v_param_91, 133, 129, 0.15034f, 0.00163117f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %145 = nn.bias_add(%144, %v_param_92, axis=3);
  %146 = qnn.requantize(%145, 0.00024523f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %147 = qnn.conv2d(%146, %v_param_93, 0, 102, 0.0235285f, 0.0439425f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %148 = nn.bias_add(%147, %v_param_94, axis=3);
  %149 = qnn.requantize(%148, 0.0010339f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %150 = qnn.conv2d(%149, %v_param_95, 0, 132, 0.0235285f, 0.0380282f, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %151 = nn.bias_add(%150, %v_param_96, axis=3);
  %152 = qnn.requantize(%151, 0.000894746f, 0, 0.179058f, 134, axis=3, out_dtype="uint8");
  %153 = qnn.add(%152, %143, 0.179058f, 134, 0.15034f, 133, 0.220417f, 131);
  %154 = qnn.conv2d(%153, %v_param_97, 131, 131, 0.220417f, 0.00206415f, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %155 = nn.bias_add(%154, %v_param_98, axis=3);
  %156 = qnn.requantize(%155, 0.000454974f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %157 = qnn.conv2d(%156, %v_param_99, 0, 201, 0.0235285f, 0.158864f, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32");
  %158 = nn.bias_add(%157, %v_param_100, axis=3);
  %159 = qnn.requantize(%158, 0.00373784f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %160 = qnn.conv2d(%159, %v_param_101, 0, 111, 0.0235285f, 0.00962106f, padding=[0, 0, 0, 0], channels=320, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %161 = nn.bias_add(%160, %v_param_102, axis=3);
  %162 = qnn.requantize(%161, 0.000226369f, 0, 0.131263f, 143, axis=3, out_dtype="uint8");
  %163 = qnn.conv2d(%162, %v_param_103, 143, 128, 0.131263f, 0.00524072f, padding=[0, 0, 0, 0], channels=1280, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %164 = nn.bias_add(%163, %v_param_104, axis=3);
  %165 = qnn.requantize(%164, 0.000687913f, 0, 0.0235285f, 0, axis=3, out_dtype="uint8");
  %166 = cast(%165, dtype="int32");
  %167 = nn.avg_pool2d(%166, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC");
  %168 = cast(%167, dtype="uint8");
  %169 = qnn.conv2d(%168, %v_param_105, 0, 114, 0.0235285f, 0.00168582f, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32");
  %170 = nn.bias_add(%169, %v_param_106, axis=3);
  %171 = qnn.requantize(%170, 3.96648e-05f, 0, 0.0760416f, 72, axis=3, out_dtype="uint8");
  %172 = reshape(%171, newshape=[1, 1001]);
  %173 = qnn.dequantize(%172, 0.0760416f, 72);
  %174 = nn.softmax(%173, axis=1);
  qnn.quantize(%174, 0.00390625f, 0, out_dtype="uint8")
}
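
For reference, the quantized Relay module above is what gets printed right after importing the TFLite model. A minimal sketch of that import step, assuming the standard TVM TFLite frontend and the input tensor name `input` used by this model (the file name below is only illustrative):

```python
# Minimal sketch: import a quantized TFLite model into Relay and print the IR.
# Assumes the standard TVM TFLite frontend; the file name is illustrative.
from tvm import relay

with open("mobilenet_v2_1.0_224_quant.tflite", "rb") as f:
    tflite_model_buf = f.read()

try:
    import tflite
    tflite_model = tflite.Model.GetRootAsModel(tflite_model_buf, 0)
except AttributeError:
    import tflite.Model
    tflite_model = tflite.Model.Model.GetRootAsModel(tflite_model_buf, 0)

mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "uint8"},
)
print(mod["main"])  # yields a quantized Relay function like the one above
```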

vsi_npu.py --> qnn.dequantize
vsi_npu.py --> nn.softmax
vsi_npu.py --> qnn.quantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> nn.avg_pool2d
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.requantize
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> qnn.add
vsi_npu.py --> reshape
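
Each `vsi_npu.py --> <op>` line above is printed as the corresponding operator is checked for NPU offload; after annotation and partitioning the whole graph ends up in the single `tvmgen_default_vsi_npu_0` function shown below. A rough sketch of the standard Relay BYOC partitioning sequence that produces such a module (whether the vsi_npu backend registers a pattern table under exactly this name, or wraps these passes in its own helper, is an assumption):

```python
# Rough sketch of the generic Relay BYOC partitioning flow; the "vsi_npu"
# pattern-table name is an assumption and the backend may use its own helper.
from tvm import relay, transform
from tvm.relay.op.contrib import get_pattern_table

seq = transform.Sequential(
    [
        relay.transform.MergeComposite(get_pattern_table("vsi_npu")),
        relay.transform.AnnotateTarget("vsi_npu"),
        relay.transform.MergeCompilerRegions(),
        relay.transform.PartitionGraph(),
    ]
)
with transform.PassContext(opt_level=3):
    partitioned_mod = seq(mod)
print(partitioned_mod)  # shows @main plus @tvmgen_default_vsi_npu_0
```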
def @main(%input: Tensor[(1, 224, 224, 3), uint8]) -> Tensor[(1, 1001), uint8] {
  @tvmgen_default_vsi_npu_0(%input) /* ty=Tensor[(1, 1001), uint8] */
}

def @tvmgen_default_vsi_npu_0(%vsi_npu_0_i0: Tensor[(1, 224, 224, 3), uint8], Inline=1, Compiler="vsi_npu", global_symbol="tvmgen_default_vsi_npu_0", Primitive=1) -> Tensor[(1, 1001), uint8] {
  %110 = fn (%FunctionVar_52_0: Tensor[(1, 224, 224, 3), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 32), uint8] {
    %108 = qnn.conv2d(%FunctionVar_52_0, meta[relay.Constant][104] /* ty=Tensor[(3, 3, 3, 32), uint8] */, 128 /* ty=int32 */, 115 /* ty=int32 */, 0.00787402f /* ty=float32 */, 0.0287749f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 32), int32] */;
    %109 = nn.bias_add(%108, meta[relay.Constant][105] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 32), int32] */;
    qnn.requantize(%109, 0.000226574f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 32), uint8] */
  };
  %111 = %110(%vsi_npu_0_i0) /* ty=Tensor[(1, 112, 112, 32), uint8] */;
  %112 = fn (%FunctionVar_51_0: Tensor[(1, 112, 112, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 32), uint8] {
    %106 = qnn.conv2d(%FunctionVar_51_0, meta[relay.Constant][102] /* ty=Tensor[(3, 3, 32, 1), uint8] */, 0 /* ty=int32 */, 165 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.343696f /* ty=float32 */, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 32), int32] */;
    %107 = nn.bias_add(%106, meta[relay.Constant][103] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 32), int32] */;
    qnn.requantize(%107, 0.00808663f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 32), uint8] */
  };
  %113 = %112(%111) /* ty=Tensor[(1, 112, 112, 32), uint8] */;
  %114 = fn (%FunctionVar_50_0: Tensor[(1, 112, 112, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 16), uint8] {
    %104 = qnn.conv2d(%FunctionVar_50_0, meta[relay.Constant][100] /* ty=Tensor[(1, 1, 32, 16), uint8] */, 0 /* ty=int32 */, 141 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0381986f /* ty=float32 */, padding=[0, 0, 0, 0], channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 16), int32] */;
    %105 = nn.bias_add(%104, meta[relay.Constant][101] /* ty=Tensor[(16), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 16), int32] */;
    qnn.requantize(%105, 0.000898756f /* ty=float32 */, 0 /* ty=int32 */, 0.362873f /* ty=float32 */, 122 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 16), uint8] */
  };
  %115 = %114(%113) /* ty=Tensor[(1, 112, 112, 16), uint8] */;
  %116 = fn (%FunctionVar_49_0: Tensor[(1, 112, 112, 16), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 112, 112, 96), uint8] {
    %102 = qnn.conv2d(%FunctionVar_49_0, meta[relay.Constant][98] /* ty=Tensor[(1, 1, 16, 96), uint8] */, 122 /* ty=int32 */, 127 /* ty=int32 */, 0.362873f /* ty=float32 */, 0.00954309f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 112, 112, 96), int32] */;
    %103 = nn.bias_add(%102, meta[relay.Constant][99] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 112, 112, 96), int32] */;
    qnn.requantize(%103, 0.00346293f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 112, 112, 96), uint8] */
  };
  %117 = %116(%115) /* ty=Tensor[(1, 112, 112, 96), uint8] */;
  %118 = fn (%FunctionVar_48_0: Tensor[(1, 112, 112, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 96), uint8] {
    %100 = qnn.conv2d(%FunctionVar_48_0, meta[relay.Constant][96] /* ty=Tensor[(3, 3, 96, 1), uint8] */, 0 /* ty=int32 */, 109 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0194444f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=96, channels=96, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 96), int32] */;
    %101 = nn.bias_add(%100, meta[relay.Constant][97] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 96), int32] */;
    qnn.requantize(%101, 0.000457496f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 96), uint8] */
  };
  %119 = %118(%117) /* ty=Tensor[(1, 56, 56, 96), uint8] */;
  %120 = fn (%FunctionVar_47_0: Tensor[(1, 56, 56, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 24), uint8] {
    %98 = qnn.conv2d(%FunctionVar_47_0, meta[relay.Constant][94] /* ty=Tensor[(1, 1, 96, 24), uint8] */, 0 /* ty=int32 */, 152 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0225397f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 24), int32] */;
    %99 = nn.bias_add(%98, meta[relay.Constant][95] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 24), int32] */;
    qnn.requantize(%99, 0.000530324f /* ty=float32 */, 0 /* ty=int32 */, 0.282426f /* ty=float32 */, 122 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 24), uint8] */
  };
  %121 = %120(%119) /* ty=Tensor[(1, 56, 56, 24), uint8] */;
  %122 = fn (%FunctionVar_46_0: Tensor[(1, 56, 56, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] {
    %96 = qnn.conv2d(%FunctionVar_46_0, meta[relay.Constant][92] /* ty=Tensor[(1, 1, 24, 144), uint8] */, 122 /* ty=int32 */, 145 /* ty=int32 */, 0.282426f /* ty=float32 */, 0.00369501f /* ty=float32 */, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */;
    %97 = nn.bias_add(%96, meta[relay.Constant][93] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */;
    qnn.requantize(%97, 0.00104357f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */
  };
  %123 = %122(%121) /* ty=Tensor[(1, 56, 56, 144), uint8] */;
  %124 = fn (%FunctionVar_45_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] {
    %94 = qnn.conv2d(%FunctionVar_45_0, meta[relay.Constant][90] /* ty=Tensor[(3, 3, 144, 1), uint8] */, 0 /* ty=int32 */, 52 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.169819f /* ty=float32 */, padding=[1, 1, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */;
    %95 = nn.bias_add(%94, meta[relay.Constant][91] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */;
    qnn.requantize(%95, 0.00399559f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */
  };
  %125 = %124(%123) /* ty=Tensor[(1, 56, 56, 144), uint8] */;
  %126 = fn (%FunctionVar_44_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 24), uint8] {
    %92 = qnn.conv2d(%FunctionVar_44_0, meta[relay.Constant][88] /* ty=Tensor[(1, 1, 144, 24), uint8] */, 0 /* ty=int32 */, 122 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.026759f /* ty=float32 */, padding=[0, 0, 0, 0], channels=24, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 24), int32] */;
    %93 = nn.bias_add(%92, meta[relay.Constant][89] /* ty=Tensor[(24), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 24), int32] */;
    qnn.requantize(%93, 0.000629599f /* ty=float32 */, 0 /* ty=int32 */, 0.410429f /* ty=float32 */, 137 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 24), uint8] */
  };
  %127 = %126(%125) /* ty=Tensor[(1, 56, 56, 24), uint8] */;
  %128 = qnn.add(%127, %121, 0.410429f /* ty=float32 */, 137 /* ty=int32 */, 0.282426f /* ty=float32 */, 122 /* ty=int32 */, 0.448443f /* ty=float32 */, 130 /* ty=int32 */) /* ty=Tensor[(1, 56, 56, 24), uint8] */;
  %129 = fn (%FunctionVar_43_0: Tensor[(1, 56, 56, 24), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 56, 56, 144), uint8] {
    %90 = qnn.conv2d(%FunctionVar_43_0, meta[relay.Constant][86] /* ty=Tensor[(1, 1, 24, 144), uint8] */, 130 /* ty=int32 */, 104 /* ty=int32 */, 0.448443f /* ty=float32 */, 0.0029434f /* ty=float32 */, padding=[0, 0, 0, 0], channels=144, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 56, 56, 144), int32] */;
    %91 = nn.bias_add(%90, meta[relay.Constant][87] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 56, 56, 144), int32] */;
    qnn.requantize(%91, 0.00131995f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 56, 56, 144), uint8] */
  };
  %130 = %129(%128) /* ty=Tensor[(1, 56, 56, 144), uint8] */;
  %131 = fn (%FunctionVar_42_0: Tensor[(1, 56, 56, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 144), uint8] {
    %88 = qnn.conv2d(%FunctionVar_42_0, meta[relay.Constant][84] /* ty=Tensor[(3, 3, 144, 1), uint8] */, 0 /* ty=int32 */, 144 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0171147f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=144, channels=144, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 144), int32] */;
    %89 = nn.bias_add(%88, meta[relay.Constant][85] /* ty=Tensor[(144), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 144), int32] */;
    qnn.requantize(%89, 0.000402683f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 144), uint8] */
  };
  %132 = %131(%130) /* ty=Tensor[(1, 28, 28, 144), uint8] */;
  %133 = fn (%FunctionVar_41_0: Tensor[(1, 28, 28, 144), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %86 = qnn.conv2d(%FunctionVar_41_0, meta[relay.Constant][82] /* ty=Tensor[(1, 1, 144, 32), uint8] */, 0 /* ty=int32 */, 114 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.016776f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %87 = nn.bias_add(%86, meta[relay.Constant][83] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%87, 0.000394715f /* ty=float32 */, 0 /* ty=int32 */, 0.224783f /* ty=float32 */, 128 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %134 = %133(%132) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %135 = fn (%FunctionVar_40_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %84 = qnn.conv2d(%FunctionVar_40_0, meta[relay.Constant][80] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 128 /* ty=int32 */, 122 /* ty=int32 */, 0.224783f /* ty=float32 */, 0.00210703f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %85 = nn.bias_add(%84, meta[relay.Constant][81] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%85, 0.000473626f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %136 = %135(%134) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %137 = fn (%FunctionVar_39_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %82 = qnn.conv2d(%FunctionVar_39_0, meta[relay.Constant][78] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0671548f /* ty=float32 */, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %83 = nn.bias_add(%82, meta[relay.Constant][79] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%83, 0.00158005f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %138 = %137(%136) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %139 = fn (%FunctionVar_38_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %80 = qnn.conv2d(%FunctionVar_38_0, meta[relay.Constant][76] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 148 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0199821f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %81 = nn.bias_add(%80, meta[relay.Constant][77] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%81, 0.000470149f /* ty=float32 */, 0 /* ty=int32 */, 0.231107f /* ty=float32 */, 120 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %140 = %139(%138) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %141 = qnn.add(%140, %134, 0.231107f /* ty=float32 */, 120 /* ty=int32 */, 0.224783f /* ty=float32 */, 128 /* ty=int32 */, 0.271938f /* ty=float32 */, 130 /* ty=int32 */) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %142 = fn (%FunctionVar_37_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %78 = qnn.conv2d(%FunctionVar_37_0, meta[relay.Constant][74] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 130 /* ty=int32 */, 119 /* ty=int32 */, 0.271938f /* ty=float32 */, 0.00149126f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %79 = nn.bias_add(%78, meta[relay.Constant][75] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%79, 0.00040553f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %143 = %142(%141) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %144 = fn (%FunctionVar_36_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %76 = qnn.conv2d(%FunctionVar_36_0, meta[relay.Constant][72] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 89 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0805961f /* ty=float32 */, padding=[1, 1, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %77 = nn.bias_add(%76, meta[relay.Constant][73] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%77, 0.0018963f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %145 = %144(%143) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %146 = fn (%FunctionVar_35_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 32), uint8] {
    %74 = qnn.conv2d(%FunctionVar_35_0, meta[relay.Constant][70] /* ty=Tensor[(1, 1, 192, 32), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.018966f /* ty=float32 */, padding=[0, 0, 0, 0], channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 32), int32] */;
    %75 = nn.bias_add(%74, meta[relay.Constant][71] /* ty=Tensor[(32), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 32), int32] */;
    qnn.requantize(%75, 0.00044624f /* ty=float32 */, 0 /* ty=int32 */, 0.268485f /* ty=float32 */, 124 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 32), uint8] */
  };
  %147 = %146(%145) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %148 = qnn.add(%147, %141, 0.268485f /* ty=float32 */, 124 /* ty=int32 */, 0.271938f /* ty=float32 */, 130 /* ty=int32 */, 0.349583f /* ty=float32 */, 124 /* ty=int32 */) /* ty=Tensor[(1, 28, 28, 32), uint8] */;
  %149 = fn (%FunctionVar_34_0: Tensor[(1, 28, 28, 32), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 28, 28, 192), uint8] {
    %72 = qnn.conv2d(%FunctionVar_34_0, meta[relay.Constant][68] /* ty=Tensor[(1, 1, 32, 192), uint8] */, 124 /* ty=int32 */, 129 /* ty=int32 */, 0.349583f /* ty=float32 */, 0.00188541f /* ty=float32 */, padding=[0, 0, 0, 0], channels=192, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 28, 28, 192), int32] */;
    %73 = nn.bias_add(%72, meta[relay.Constant][69] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 28, 28, 192), int32] */;
    qnn.requantize(%73, 0.000659109f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 28, 28, 192), uint8] */
  };
  %150 = %149(%148) /* ty=Tensor[(1, 28, 28, 192), uint8] */;
  %151 = fn (%FunctionVar_33_0: Tensor[(1, 28, 28, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 192), uint8] {
    %70 = qnn.conv2d(%FunctionVar_33_0, meta[relay.Constant][66] /* ty=Tensor[(3, 3, 192, 1), uint8] */, 0 /* ty=int32 */, 129 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00993869f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=192, channels=192, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 192), int32] */;
    %71 = nn.bias_add(%70, meta[relay.Constant][67] /* ty=Tensor[(192), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 192), int32] */;
    qnn.requantize(%71, 0.000233842f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 192), uint8] */
  };
  %152 = %151(%150) /* ty=Tensor[(1, 14, 14, 192), uint8] */;
  %153 = fn (%FunctionVar_32_0: Tensor[(1, 14, 14, 192), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %68 = qnn.conv2d(%FunctionVar_32_0, meta[relay.Constant][64] /* ty=Tensor[(1, 1, 192, 64), uint8] */, 0 /* ty=int32 */, 144 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0145759f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %69 = nn.bias_add(%68, meta[relay.Constant][65] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%69, 0.000342948f /* ty=float32 */, 0 /* ty=int32 */, 0.193133f /* ty=float32 */, 125 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %154 = %153(%152) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %155 = fn (%FunctionVar_31_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %66 = qnn.conv2d(%FunctionVar_31_0, meta[relay.Constant][62] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 125 /* ty=int32 */, 126 /* ty=int32 */, 0.193133f /* ty=float32 */, 0.00157124f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %67 = nn.bias_add(%66, meta[relay.Constant][63] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%67, 0.000303459f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %156 = %155(%154) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %157 = fn (%FunctionVar_30_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %64 = qnn.conv2d(%FunctionVar_30_0, meta[relay.Constant][60] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 105 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0612184f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %65 = nn.bias_add(%64, meta[relay.Constant][61] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%65, 0.00144038f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %158 = %157(%156) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %159 = fn (%FunctionVar_29_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %62 = qnn.conv2d(%FunctionVar_29_0, meta[relay.Constant][58] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 127 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0187498f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %63 = nn.bias_add(%62, meta[relay.Constant][59] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%63, 0.000441155f /* ty=float32 */, 0 /* ty=int32 */, 0.180298f /* ty=float32 */, 108 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %160 = %159(%158) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %161 = qnn.add(%160, %154, 0.180298f /* ty=float32 */, 108 /* ty=int32 */, 0.193133f /* ty=float32 */, 125 /* ty=int32 */, 0.197618f /* ty=float32 */, 120 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %162 = fn (%FunctionVar_28_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %60 = qnn.conv2d(%FunctionVar_28_0, meta[relay.Constant][56] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 120 /* ty=int32 */, 135 /* ty=int32 */, 0.197618f /* ty=float32 */, 0.00145681f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %61 = nn.bias_add(%60, meta[relay.Constant][57] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%61, 0.000287892f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %163 = %162(%161) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %164 = fn (%FunctionVar_27_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %58 = qnn.conv2d(%FunctionVar_27_0, meta[relay.Constant][54] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 133 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0509263f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %59 = nn.bias_add(%58, meta[relay.Constant][55] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%59, 0.00119822f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %165 = %164(%163) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %166 = fn (%FunctionVar_26_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %56 = qnn.conv2d(%FunctionVar_26_0, meta[relay.Constant][52] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 126 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0130952f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %57 = nn.bias_add(%56, meta[relay.Constant][53] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%57, 0.000308111f /* ty=float32 */, 0 /* ty=int32 */, 0.152346f /* ty=float32 */, 125 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %167 = %166(%165) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %168 = qnn.add(%167, %161, 0.152346f /* ty=float32 */, 125 /* ty=int32 */, 0.197618f /* ty=float32 */, 120 /* ty=int32 */, 0.209317f /* ty=float32 */, 123 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %169 = fn (%FunctionVar_25_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %54 = qnn.conv2d(%FunctionVar_25_0, meta[relay.Constant][50] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 123 /* ty=int32 */, 127 /* ty=int32 */, 0.209317f /* ty=float32 */, 0.00133576f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %55 = nn.bias_add(%54, meta[relay.Constant][51] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%55, 0.000279598f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %170 = %169(%168) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %171 = fn (%FunctionVar_24_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %52 = qnn.conv2d(%FunctionVar_24_0, meta[relay.Constant][48] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 156 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0404159f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %53 = nn.bias_add(%52, meta[relay.Constant][49] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%53, 0.000950924f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %172 = %171(%170) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %173 = fn (%FunctionVar_23_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 64), uint8] {
    %50 = qnn.conv2d(%FunctionVar_23_0, meta[relay.Constant][46] /* ty=Tensor[(1, 1, 384, 64), uint8] */, 0 /* ty=int32 */, 148 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0192269f /* ty=float32 */, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 64), int32] */;
    %51 = nn.bias_add(%50, meta[relay.Constant][47] /* ty=Tensor[(64), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 64), int32] */;
    qnn.requantize(%51, 0.00045238f /* ty=float32 */, 0 /* ty=int32 */, 0.16256f /* ty=float32 */, 119 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 64), uint8] */
  };
  %174 = %173(%172) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %175 = qnn.add(%174, %168, 0.16256f /* ty=float32 */, 119 /* ty=int32 */, 0.209317f /* ty=float32 */, 123 /* ty=int32 */, 0.227132f /* ty=float32 */, 122 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 64), uint8] */;
  %176 = fn (%FunctionVar_22_0: Tensor[(1, 14, 14, 64), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %48 = qnn.conv2d(%FunctionVar_22_0, meta[relay.Constant][44] /* ty=Tensor[(1, 1, 64, 384), uint8] */, 122 /* ty=int32 */, 132 /* ty=int32 */, 0.227132f /* ty=float32 */, 0.00162901f /* ty=float32 */, padding=[0, 0, 0, 0], channels=384, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %49 = nn.bias_add(%48, meta[relay.Constant][45] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%49, 0.000370001f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %177 = %176(%175) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %178 = fn (%FunctionVar_21_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 384), uint8] {
    %46 = qnn.conv2d(%FunctionVar_21_0, meta[relay.Constant][42] /* ty=Tensor[(3, 3, 384, 1), uint8] */, 0 /* ty=int32 */, 142 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0308997f /* ty=float32 */, padding=[1, 1, 1, 1], groups=384, channels=384, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 384), int32] */;
    %47 = nn.bias_add(%46, meta[relay.Constant][43] /* ty=Tensor[(384), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 384), int32] */;
    qnn.requantize(%47, 0.000727024f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 384), uint8] */
  };
  %179 = %178(%177) /* ty=Tensor[(1, 14, 14, 384), uint8] */;
  %180 = fn (%FunctionVar_20_0: Tensor[(1, 14, 14, 384), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] {
    %44 = qnn.conv2d(%FunctionVar_20_0, meta[relay.Constant][40] /* ty=Tensor[(1, 1, 384, 96), uint8] */, 0 /* ty=int32 */, 128 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00727967f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */;
    %45 = nn.bias_add(%44, meta[relay.Constant][41] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */;
    qnn.requantize(%45, 0.000171279f /* ty=float32 */, 0 /* ty=int32 */, 0.172015f /* ty=float32 */, 128 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */
  };
  %181 = %180(%179) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %182 = fn (%FunctionVar_19_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] {
    %42 = qnn.conv2d(%FunctionVar_19_0, meta[relay.Constant][38] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 128 /* ty=int32 */, 131 /* ty=int32 */, 0.172015f /* ty=float32 */, 0.00161979f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */;
    %43 = nn.bias_add(%42, meta[relay.Constant][39] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */;
    qnn.requantize(%43, 0.000278629f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */
  };
  %183 = %182(%181) /* ty=Tensor[(1, 14, 14, 576), uint8] */;
  %184 = fn (%FunctionVar_18_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] {
    %40 = qnn.conv2d(%FunctionVar_18_0, meta[relay.Constant][36] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 66 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0708156f /* ty=float32 */, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */;
    %41 = nn.bias_add(%40, meta[relay.Constant][37] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */;
    qnn.requantize(%41, 0.00166618f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */
  };
  %185 = %184(%183) /* ty=Tensor[(1, 14, 14, 576), uint8] */;
  %186 = fn (%FunctionVar_17_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] {
    %38 = qnn.conv2d(%FunctionVar_17_0, meta[relay.Constant][34] /* ty=Tensor[(1, 1, 576, 96), uint8] */, 0 /* ty=int32 */, 135 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00841983f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */;
    %39 = nn.bias_add(%38, meta[relay.Constant][35] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */;
    qnn.requantize(%39, 0.000198106f /* ty=float32 */, 0 /* ty=int32 */, 0.128486f /* ty=float32 */, 127 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */
  };
  %187 = %186(%185) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %188 = qnn.add(%187, %181, 0.128486f /* ty=float32 */, 127 /* ty=int32 */, 0.172015f /* ty=float32 */, 128 /* ty=int32 */, 0.179783f /* ty=float32 */, 126 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %189 = fn (%FunctionVar_16_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] {
    %36 = qnn.conv2d(%FunctionVar_16_0, meta[relay.Constant][32] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 126 /* ty=int32 */, 138 /* ty=int32 */, 0.179783f /* ty=float32 */, 0.00180177f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */;
    %37 = nn.bias_add(%36, meta[relay.Constant][33] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */;
    qnn.requantize(%37, 0.000323928f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */
  };
  %190 = %189(%188) /* ty=Tensor[(1, 14, 14, 576), uint8] */;
  %191 = fn (%FunctionVar_15_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] {
    %34 = qnn.conv2d(%FunctionVar_15_0, meta[relay.Constant][30] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 154 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0698695f /* ty=float32 */, padding=[1, 1, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */;
    %35 = nn.bias_add(%34, meta[relay.Constant][31] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */;
    qnn.requantize(%35, 0.00164392f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */
  };
  %192 = %191(%190) /* ty=Tensor[(1, 14, 14, 576), uint8] */;
  %193 = fn (%FunctionVar_14_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 96), uint8] {
    %32 = qnn.conv2d(%FunctionVar_14_0, meta[relay.Constant][28] /* ty=Tensor[(1, 1, 576, 96), uint8] */, 0 /* ty=int32 */, 155 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0236749f /* ty=float32 */, padding=[0, 0, 0, 0], channels=96, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 96), int32] */;
    %33 = nn.bias_add(%32, meta[relay.Constant][29] /* ty=Tensor[(96), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 96), int32] */;
    qnn.requantize(%33, 0.000557034f /* ty=float32 */, 0 /* ty=int32 */, 0.190479f /* ty=float32 */, 127 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 96), uint8] */
  };
  %194 = %193(%192) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %195 = qnn.add(%194, %188, 0.190479f /* ty=float32 */, 127 /* ty=int32 */, 0.179783f /* ty=float32 */, 126 /* ty=int32 */, 0.245143f /* ty=float32 */, 126 /* ty=int32 */) /* ty=Tensor[(1, 14, 14, 96), uint8] */;
  %196 = fn (%FunctionVar_13_0: Tensor[(1, 14, 14, 96), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 14, 14, 576), uint8] {
    %30 = qnn.conv2d(%FunctionVar_13_0, meta[relay.Constant][26] /* ty=Tensor[(1, 1, 96, 576), uint8] */, 126 /* ty=int32 */, 125 /* ty=int32 */, 0.245143f /* ty=float32 */, 0.00139799f /* ty=float32 */, padding=[0, 0, 0, 0], channels=576, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 14, 14, 576), int32] */;
    %31 = nn.bias_add(%30, meta[relay.Constant][27] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 14, 14, 576), int32] */;
    qnn.requantize(%31, 0.000342707f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 14, 14, 576), uint8] */
  };
  %197 = %196(%195) /* ty=Tensor[(1, 14, 14, 576), uint8] */;
  %198 = fn (%FunctionVar_12_0: Tensor[(1, 14, 14, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 576), uint8] {
    %28 = qnn.conv2d(%FunctionVar_12_0, meta[relay.Constant][24] /* ty=Tensor[(3, 3, 576, 1), uint8] */, 0 /* ty=int32 */, 92 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0148872f /* ty=float32 */, strides=[2, 2], padding=[0, 0, 1, 1], groups=576, channels=576, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 576), int32] */;
    %29 = nn.bias_add(%28, meta[relay.Constant][25] /* ty=Tensor[(576), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 576), int32] */;
    qnn.requantize(%29, 0.000350273f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 576), uint8] */
  };
  %199 = %198(%197) /* ty=Tensor[(1, 7, 7, 576), uint8] */;
  %200 = fn (%FunctionVar_11_0: Tensor[(1, 7, 7, 576), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] {
    %26 = qnn.conv2d(%FunctionVar_11_0, meta[relay.Constant][22] /* ty=Tensor[(1, 1, 576, 160), uint8] */, 0 /* ty=int32 */, 139 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00922072f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */;
    %27 = nn.bias_add(%26, meta[relay.Constant][23] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */;
    qnn.requantize(%27, 0.00021695f /* ty=float32 */, 0 /* ty=int32 */, 0.131885f /* ty=float32 */, 131 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */
  };
  %201 = %200(%199) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %202 = fn (%FunctionVar_10_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %24 = qnn.conv2d(%FunctionVar_10_0, meta[relay.Constant][20] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 131 /* ty=int32 */, 141 /* ty=int32 */, 0.131885f /* ty=float32 */, 0.00211018f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %25 = nn.bias_add(%24, meta[relay.Constant][21] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%25, 0.000278301f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %203 = %202(%201) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %204 = fn (%FunctionVar_9_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %22 = qnn.conv2d(%FunctionVar_9_0, meta[relay.Constant][18] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 146 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0409658f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %23 = nn.bias_add(%22, meta[relay.Constant][19] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%23, 0.000963862f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %205 = %204(%203) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %206 = fn (%FunctionVar_8_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] {
    %20 = qnn.conv2d(%FunctionVar_8_0, meta[relay.Constant][16] /* ty=Tensor[(1, 1, 960, 160), uint8] */, 0 /* ty=int32 */, 136 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00783742f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */;
    %21 = nn.bias_add(%20, meta[relay.Constant][17] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */;
    qnn.requantize(%21, 0.000184403f /* ty=float32 */, 0 /* ty=int32 */, 0.104162f /* ty=float32 */, 130 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */
  };
  %207 = %206(%205) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %208 = qnn.add(%207, %201, 0.104162f /* ty=float32 */, 130 /* ty=int32 */, 0.131885f /* ty=float32 */, 131 /* ty=int32 */, 0.15034f /* ty=float32 */, 133 /* ty=int32 */) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %209 = fn (%FunctionVar_7_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %18 = qnn.conv2d(%FunctionVar_7_0, meta[relay.Constant][14] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 133 /* ty=int32 */, 129 /* ty=int32 */, 0.15034f /* ty=float32 */, 0.00163117f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %19 = nn.bias_add(%18, meta[relay.Constant][15] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%19, 0.00024523f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %210 = %209(%208) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %211 = fn (%FunctionVar_6_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %16 = qnn.conv2d(%FunctionVar_6_0, meta[relay.Constant][12] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 102 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0439425f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %17 = nn.bias_add(%16, meta[relay.Constant][13] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%17, 0.0010339f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %212 = %211(%210) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %213 = fn (%FunctionVar_5_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 160), uint8] {
    %14 = qnn.conv2d(%FunctionVar_5_0, meta[relay.Constant][10] /* ty=Tensor[(1, 1, 960, 160), uint8] */, 0 /* ty=int32 */, 132 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.0380282f /* ty=float32 */, padding=[0, 0, 0, 0], channels=160, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 160), int32] */;
    %15 = nn.bias_add(%14, meta[relay.Constant][11] /* ty=Tensor[(160), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 160), int32] */;
    qnn.requantize(%15, 0.000894746f /* ty=float32 */, 0 /* ty=int32 */, 0.179058f /* ty=float32 */, 134 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 160), uint8] */
  };
  %214 = %213(%212) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %215 = qnn.add(%214, %208, 0.179058f /* ty=float32 */, 134 /* ty=int32 */, 0.15034f /* ty=float32 */, 133 /* ty=int32 */, 0.220417f /* ty=float32 */, 131 /* ty=int32 */) /* ty=Tensor[(1, 7, 7, 160), uint8] */;
  %216 = fn (%FunctionVar_4_0: Tensor[(1, 7, 7, 160), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %12 = qnn.conv2d(%FunctionVar_4_0, meta[relay.Constant][8] /* ty=Tensor[(1, 1, 160, 960), uint8] */, 131 /* ty=int32 */, 131 /* ty=int32 */, 0.220417f /* ty=float32 */, 0.00206415f /* ty=float32 */, padding=[0, 0, 0, 0], channels=960, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %13 = nn.bias_add(%12, meta[relay.Constant][9] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%13, 0.000454974f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %217 = %216(%215) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %218 = fn (%FunctionVar_3_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 960), uint8] {
    %10 = qnn.conv2d(%FunctionVar_3_0, meta[relay.Constant][6] /* ty=Tensor[(3, 3, 960, 1), uint8] */, 0 /* ty=int32 */, 201 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.158864f /* ty=float32 */, padding=[1, 1, 1, 1], groups=960, channels=960, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 960), int32] */;
    %11 = nn.bias_add(%10, meta[relay.Constant][7] /* ty=Tensor[(960), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 960), int32] */;
    qnn.requantize(%11, 0.00373784f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 960), uint8] */
  };
  %219 = %218(%217) /* ty=Tensor[(1, 7, 7, 960), uint8] */;
  %220 = fn (%FunctionVar_2_0: Tensor[(1, 7, 7, 960), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 320), uint8] {
    %8 = qnn.conv2d(%FunctionVar_2_0, meta[relay.Constant][4] /* ty=Tensor[(1, 1, 960, 320), uint8] */, 0 /* ty=int32 */, 111 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00962106f /* ty=float32 */, padding=[0, 0, 0, 0], channels=320, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 320), int32] */;
    %9 = nn.bias_add(%8, meta[relay.Constant][5] /* ty=Tensor[(320), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 320), int32] */;
    qnn.requantize(%9, 0.000226369f /* ty=float32 */, 0 /* ty=int32 */, 0.131263f /* ty=float32 */, 143 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 320), uint8] */
  };
  %221 = %220(%219) /* ty=Tensor[(1, 7, 7, 320), uint8] */;
  %222 = fn (%FunctionVar_1_0: Tensor[(1, 7, 7, 320), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 7, 7, 1280), uint8] {
    %6 = qnn.conv2d(%FunctionVar_1_0, meta[relay.Constant][2] /* ty=Tensor[(1, 1, 320, 1280), uint8] */, 143 /* ty=int32 */, 128 /* ty=int32 */, 0.131263f /* ty=float32 */, 0.00524072f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1280, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 7, 7, 1280), int32] */;
    %7 = nn.bias_add(%6, meta[relay.Constant][3] /* ty=Tensor[(1280), int32] */, axis=3) /* ty=Tensor[(1, 7, 7, 1280), int32] */;
    qnn.requantize(%7, 0.000687913f /* ty=float32 */, 0 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 7, 7, 1280), uint8] */
  };
  %223 = %222(%221) /* ty=Tensor[(1, 7, 7, 1280), uint8] */;
  %224 = fn (%FunctionVar_0_02: Tensor[(1, 7, 7, 1280), uint8], PartitionedFromPattern="cast_nn.avg_pool2d_cast_", Composite="vsi_npu.qnn_avgpool2d") -> Tensor[(1, 1, 1, 1280), uint8] {
    %4 = cast(%FunctionVar_0_02, dtype="int32") /* ty=Tensor[(1, 7, 7, 1280), int32] */;
    %5 = nn.avg_pool2d(%4, pool_size=[7, 7], padding=[0, 0, 0, 0], layout="NHWC") /* ty=Tensor[(1, 1, 1, 1280), int32] */;
    cast(%5, dtype="uint8") /* ty=Tensor[(1, 1, 1, 1280), uint8] */
  };
  %225 = %224(%223) /* ty=Tensor[(1, 1, 1, 1280), uint8] */;
  %226 = fn (%FunctionVar_0_01: Tensor[(1, 1, 1, 1280), uint8], PartitionedFromPattern="qnn.conv2d_nn.bias_add_qnn.requantize_", Composite="vsi_npu.qnn_conv2d") -> Tensor[(1, 1, 1, 1001), uint8] {
    %2 = qnn.conv2d(%FunctionVar_0_01, meta[relay.Constant][0] /* ty=Tensor[(1, 1, 1280, 1001), uint8] */, 0 /* ty=int32 */, 114 /* ty=int32 */, 0.0235285f /* ty=float32 */, 0.00168582f /* ty=float32 */, padding=[0, 0, 0, 0], channels=1001, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO", out_dtype="int32") /* ty=Tensor[(1, 1, 1, 1001), int32] */;
    %3 = nn.bias_add(%2, meta[relay.Constant][1] /* ty=Tensor[(1001), int32] */, axis=3) /* ty=Tensor[(1, 1, 1, 1001), int32] */;
    qnn.requantize(%3, 3.96648e-05f /* ty=float32 */, 0 /* ty=int32 */, 0.0760416f /* ty=float32 */, 72 /* ty=int32 */, axis=3, out_dtype="uint8") /* ty=Tensor[(1, 1, 1, 1001), uint8] */
  };
  %227 = %226(%225) /* ty=Tensor[(1, 1, 1, 1001), uint8] */;
  %228 = reshape(%227, newshape=[1, 1001]) /* ty=Tensor[(1, 1001), uint8] */;
  %229 = fn (%FunctionVar_0_0: Tensor[(1, 1001), uint8], PartitionedFromPattern="qnn.dequantize_nn.softmax_qnn.quantize_", Composite="vsi_npu.qnn_softmax") -> Tensor[(1, 1001), uint8] {
    %0 = qnn.dequantize(%FunctionVar_0_0, 0.0760416f /* ty=float32 */, 72 /* ty=int32 */) /* ty=Tensor[(1, 1001), float32] */;
    %1 = nn.softmax(%0, axis=1) /* ty=Tensor[(1, 1001), float32] */;
    qnn.quantize(%1, 0.00390625f /* ty=float32 */, 0 /* ty=int32 */, out_dtype="uint8") /* ty=Tensor[(1, 1001), uint8] */
  };
  %229(%228) /* ty=Tensor[(1, 1001), uint8] */
}
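
For context, a Relay dump in this shape (every conv / bias_add / requantize chain wrapped in a `Composite="vsi_npu.qnn_conv2d"` function) is what the standard BYOC partitioning flow produces. A minimal sketch of how such a module can be partitioned and printed, assuming the fork registers its patterns under the "vsi_npu" name; the exact pipeline inside test_vsi_tflite_model_all.py may differ:

```python
# Hedged sketch of the generic TVM BYOC partitioning passes, not necessarily the
# exact code path used by the vsi_npu test script.
from tvm.relay import transform
from tvm.relay.op.contrib import get_pattern_table

patterns = get_pattern_table("vsi_npu")         # assumes patterns are registered as "vsi_npu"
mod = transform.MergeComposite(patterns)(mod)   # builds the Composite="vsi_npu.*" functions
mod = transform.AnnotateTarget(["vsi_npu"])(mod)
mod = transform.MergeCompilerRegions()(mod)
mod = transform.PartitionGraph()(mod)           # splits annotated regions into external functions
print(mod)                                      # text dump like the listing above
```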


This is the important part ----> name_node.value() == tvmgen_default_vsi_npu_0
GraphMakerImpl::Create
TensorMakerImpl::InferCall: vsi_npu.qnn_softmax
TensorMakerImpl::InferCall: reshape
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_avgpool2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: qnn.add
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
TensorMakerImpl::InferCall: vsi_npu.qnn_conv2d
W [HandleLayoutInfer:268]Op 162: default layout inference pass.
VsiNpuModule::GetFunction: get_symbol
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: get_const_vars
VsiNpuModule::GetFunction: return early
MBNtest_vsi_tflite_model_all.py:120: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the  new recommended usage.
  graph, lib, params  = relay.build(mod, target, params=params)
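
(The warning above refers to the legacy three-value return from relay.build. As a minimal sketch of the newer usage it points to — the input name, data, and device here are placeholders, not values from the actual script:)

```python
# Hedged sketch of the non-legacy graph executor flow the DeprecationWarning mentions.
import tvm
from tvm import relay
from tvm.contrib import graph_executor

lib = relay.build(mod, target, params=params)        # single factory module, no (graph, lib, params) tuple
dev = tvm.cpu(0)                                     # or the remote device when running over RPC
m = graph_executor.GraphModule(lib["default"](dev))  # wrap the runtime module
m.set_input("input", data)                           # "input" and data are placeholders
m.run()
out = m.get_output(0)
```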
VsiNpuModule::SaveToBinary
SaveToBinary: nbg size = 5832768
SaveToBinary: input size = 1
SaveToBinary: output size = 1
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule : SerializeTensorSpec
VsiNpuModule : SerializeTensorSpec2
VsiNpuModule::SaveToBinary2
Printing device code to device_code.cl...
VsiNpuModule::LoadFromBinary
LoadFromBinary: nbg size = 5832768
LoadFromBinary: input size = 1
LoadFromBinary: output size = 1
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
VsiNpuModule : DeSerializeTensorSpec
VsiNpuModule : DeSerializeTensorSpec2
(1, 224, 224, 3) ############
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: _lookup_linked_param
VsiNpuModule::GetFunction: return early
VsiNpuModule::GetFunction: tvmgen_default_vsi_npu_0
Process Graph: 6 ms or 6253 us
VsiNpuModule::GetFunction: size: 2
[[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0
    0   0   0   0   0   0   0   0   1   1   0   0 112  66   7   1  24   0
   10   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0   0   0   0]]
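
One way to tell whether a mostly-zero vector like the one above is a genuine (quantized softmax) distribution or a broken run is to feed the same input to the TFLite interpreter on the host and compare the top-1 index. A minimal sketch, assuming TensorFlow is installed on the host; the model path is a placeholder for the same .tflite file used by the test script:

```python
# Hedged reference-check sketch: run the quantized model on the host TFLite
# interpreter and print the top-1 class for comparison with the NPU output.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_quant.tflite")  # placeholder path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

data = np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8)  # use the same input on both sides
interpreter.set_tensor(inp["index"], data)
interpreter.invoke()
ref = interpreter.get_tensor(out["index"])
print("reference top-1:", int(np.argmax(ref)))
```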


