Add verify test for unpack_int4 with qdq #3523

lakhinderwalia · 2024-10-11T16:03:27Z

Adds two int4-related verify tests that check the results of an ONNX snippet. One case uses block quantization with a block size of 2; the other has no block quantization. The graph contains QDQ nodes plus an identity operator and a transpose, mirroring a Quark-generated model.
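For readers unfamiliar with the pattern, here is a minimal sketch in plain Python (not MIGraphX code) of the arithmetic such a graph performs: unpack two 4-bit values from each byte, then dequantize with one scale and zero point per block of 2. The function names, the low-nibble-first order, and the sample values are assumptions for illustration only; they are not taken from the test files.

```python
def unpack_int4(packed, signed=False):
    """Split each byte into two 4-bit values.

    Assumption: the low nibble comes first, then the high nibble.
    """
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, (byte >> 4) & 0x0F):
            # Sign-extend 4-bit two's complement when treating values as signed.
            if signed and nibble >= 8:
                nibble -= 16
            out.append(nibble)
    return out


def dequantize_blocked(values, scales, zero_points, block_size):
    """Dequantize a flat list where each run of block_size values
    shares one scale and one zero point: (v - zero_point) * scale."""
    if len(values) != len(scales) * block_size:
        raise ValueError("values must contain len(scales) * block_size items")
    if len(scales) != len(zero_points):
        raise ValueError("scales and zero_points must have the same length")
    return [
        (v - zero_points[i // block_size]) * scales[i // block_size]
        for i, v in enumerate(values)
    ]


# Hypothetical sample data: two packed bytes, block size 2.
packed = [0x21, 0x43]
unpacked = unpack_int4(packed)  # [1, 2, 3, 4]
dequantized = dequantize_blocked(
    unpacked, scales=[0.5, 2.0], zero_points=[0, 1], block_size=2
)  # [0.5, 1.0, 4.0, 6.0]
```

The case without block quantization corresponds to a single scale and zero point covering the whole tensor, that is, `block_size` equal to the number of values.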

codecov · 2024-10-11T18:56:31Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.17%. Comparing base (bdbe342) to head (345f7fb).
Report is 13 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3523      +/-   ##
===========================================
+ Coverage    92.08%   92.17%   +0.09%     
===========================================
  Files          510      512       +2     
  Lines        21094    21385     +291     
===========================================
+ Hits         19424    19712     +288     
- Misses        1670     1673       +3

migraphx-bot · 2024-10-12T01:22:40Z

Test	Batch	Rate new 345f7f	Rate old bdbe34	Diff	Compare
torchvision-resnet50	64	3,261.81	3,258.84	0.09%	✅
torchvision-resnet50_fp16	64	6,996.41	6,981.91	0.21%	✅
torchvision-densenet121	32	2,437.84	2,436.52	0.05%	✅
torchvision-densenet121_fp16	32	4,086.24	4,070.78	0.38%	✅
torchvision-inceptionv3	32	1,638.02	1,640.50	-0.15%	✅
torchvision-inceptionv3_fp16	32	2,764.34	2,764.39	-0.00%	✅
cadene-inceptionv4	16	777.15	777.25	-0.01%	✅
cadene-resnext64x4	16	809.46	809.84	-0.05%	✅
slim-mobilenet	64	7,537.63	7,537.85	-0.00%	✅
slim-nasnetalarge	64	211.83	211.79	0.02%	✅
slim-resnet50v2	64	3,506.12	3,504.89	0.04%	✅
bert-mrpc-onnx	8	1,151.33	1,152.66	-0.12%	✅
bert-mrpc-tf	1	468.27	463.93	0.93%	✅
pytorch-examples-wlang-gru	1	416.23	485.39	-14.25%	🔴
pytorch-examples-wlang-lstm	1	376.94	384.58	-1.99%	✅
torchvision-resnet50_1	1	776.53	817.62	-5.03%	🔴
cadene-dpn92_1	1	426.50	399.06	6.88%	🔆
cadene-resnext101_1	1	382.02	382.30	-0.07%	✅
onnx-taau-downsample	1	343.05	343.08	-0.01%	✅
dlrm-criteoterabyte	1	33.37	33.33	0.09%	✅
dlrm-criteoterabyte_fp16	1	52.76	52.74	0.03%	✅
agentmodel	1	8,287.27	8,377.88	-1.08%	✅
unet_fp16	2	58.96	58.79	0.30%	✅
resnet50v1_fp16	1	946.75	937.71	0.96%	✅
resnet50v1_int8	1	983.82	1,041.29	-5.52%	🔴
bert_base_cased_fp16	64	1,172.16	1,170.74	0.12%	✅
bert_large_uncased_fp16	32	363.77	363.58	0.05%	✅
bert_large_fp16	1	199.01	199.03	-0.01%	✅
distilgpt2_fp16	16	2,203.05	2,199.30	0.17%	✅
yolov5s	1	541.13	540.42	0.13%	✅
tinyllama	1	43.46	43.45	0.02%	✅
vicuna-fastchat	1	174.50	169.65	2.86%	✅
whisper-tiny-encoder	1	418.60	418.63	-0.01%	✅
whisper-tiny-decoder	1	430.60	430.21	0.09%	✅

This build is not recommended to merge 🔴

migraphx-bot · 2024-10-12T01:22:41Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

verify test for unpack_int4 with qdq

345f7fb

lakhinderwalia requested a review from causten as a code owner October 11, 2024 16:03

lakhinderwalia self-assigned this Oct 11, 2024

causten requested review from pfultz2 and turneram October 12, 2024 04:24

lakhinderwalia mentioned this pull request Oct 16, 2024

[INT4] Compress model by quantizing weights to int4 #3307

Open

18 tasks

lakhinderwalia requested a review from shivadbhavsar October 16, 2024 16:18

turneram approved these changes Oct 18, 2024

causten merged commit b73defb into develop Oct 20, 2024
30 checks passed

causten deleted the lw/int4_qdq_test branch October 20, 2024 14:09
