[GPU] Support fs_b_yx_fsv32 and int8 case for pooling #27371
5b7e1dd to 0b26855
@@ -9,6 +9,8 @@ ParamsKey PoolingKernelGPURef::GetSupportedKey() const {
     ParamsKey k;
     k.EnableInputDataType(Datatype::F16);
     k.EnableInputDataType(Datatype::F32);
+    k.EnableInputDataType(Datatype::UINT8);
+    k.EnableInputDataType(Datatype::INT8);
Could you explain why we need to support int8 in pooling_ref? The reference kernel does not provide good performance. If fs_b_yx_fsv32 is the problem, could you check why this format is chosen?
The fs_b_yx_fsv32 format with an int8 input type is required by 4 E2E tests: TF_frozen_model_inception_v3_w_sym_ch_a_asym_t_acc1_78_34_INT8, TF_frozen_model_mobilenet_v2_w_sym_ch_a_asym_t_acc1_71_99_INT8, TF_Mobile_Object_Labeler_v1, and TF_AiyVisionClassifier_Plants.
I've updated the change to add fs_b_yx_fsv32 support to the Pooling Int8Ref kernel instead of the ref kernel.
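For readers unfamiliar with the layout names in this thread, here is a minimal sketch of how fs_b_yx_fsv32 differs from b_fs_yx_fsv32 when computing a linear element offset. It assumes the usual reading of the format names (dimensions listed outermost first, "fsv32" meaning 32 features per slice) and ignores padding. The `Dims` struct and both function names are hypothetical and are not taken from the PR or the plugin sources.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: logical tensor sizes, no padding assumed.
struct Dims {
    std::size_t batch, feature, height, width;
};

constexpr std::size_t kSlice = 32;  // "fsv32": 32 features per slice

// Number of feature slices, rounding up for partially filled slices.
inline std::size_t slice_count(const Dims& d) {
    return (d.feature + kSlice - 1) / kSlice;
}

// fs_b_yx_fsv32: feature slice outermost, then batch, y, x, feature-in-slice.
inline std::size_t offset_fs_b_yx_fsv32(const Dims& d, std::size_t b,
                                        std::size_t f, std::size_t y,
                                        std::size_t x) {
    assert(b < d.batch && f < d.feature && y < d.height && x < d.width);
    const std::size_t fs = f / kSlice;   // which slice the feature is in
    const std::size_t fsv = f % kSlice;  // position inside the slice
    return (((fs * d.batch + b) * d.height + y) * d.width + x) * kSlice + fsv;
}

// b_fs_yx_fsv32: batch outermost, then feature slice, y, x, feature-in-slice.
inline std::size_t offset_b_fs_yx_fsv32(const Dims& d, std::size_t b,
                                        std::size_t f, std::size_t y,
                                        std::size_t x) {
    assert(b < d.batch && f < d.feature && y < d.height && x < d.width);
    const std::size_t fs = f / kSlice;
    const std::size_t fsv = f % kSlice;
    return (((b * slice_count(d) + fs) * d.height + y) * d.width + x) * kSlice + fsv;
}
```

The two layouts agree for a single batch and diverge as soon as the batch size exceeds 1, which is why a kernel written for one cannot simply be reused for the other.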
0b26855 to 7d39426
        ASSERT_EQ(ref_data[i], float(output_ptr[i]));
    }
}
I think we should have similar tests for other layouts. It would be better to reuse an existing case, or could you generalize the existing case to cover this layout too?
Added more unit tests.
Don't we already have a similar test case that covers this case?
The added unit tests are copied from the f16/f32 fs_b_yx_fsv32 cases. If any existing tests covered int8/f32 fs_b_yx_fsv32 pooling, they would have failed.
If you have specific cases in mind, please let me know.
My suggestion is to check whether a similar test case already exists. If it does, we can merge this test into the existing one. Have you already checked whether we have a similar case?
Added the fs_b_yx_fsv32 format to the existing low-precision test and deleted my previous tests.
8c7cb62 to b78bef0
@@ -2197,7 +2197,8 @@ INSTANTIATE_TEST_SUITE_P(
         format::bfyx,
         format::b_fs_yx_fsv4,
         format::b_fs_yx_fsv16,
-        format::b_fs_yx_fsv32)),
+        format::b_fs_yx_fsv32,
+        format::fs_b_yx_fsv32)),
How many test cases does this create? I'm afraid this will generate too many cases for a rarely used format. I'd suggest introducing a separate test instantiation.
Updated the tests to use a separate test instantiation. It creates 16 tests and takes 0.2 sec to run.
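The concern about test count follows from how a combined parameter list grows: a GoogleTest `testing::Combine` generator yields the Cartesian product of its axes, so appending one format to a shared instantiation adds one case for every combination of the other axes. The sketch below only illustrates that arithmetic. The helper names are hypothetical, and the axis sizes a caller passes in are made-up inputs, not the actual parameter lists of the pooling tests.

```cpp
#include <cstddef>
#include <vector>

// Number of cases produced by a Cartesian product of parameter axes.
// Each entry of axis_sizes is the number of values on one axis.
inline std::size_t combined_cases(const std::vector<std::size_t>& axis_sizes) {
    std::size_t total = 1;
    for (std::size_t n : axis_sizes) {
        total *= n;
    }
    return total;
}

// Extra cases created by appending one value to the axis at index `axis`.
// Takes the vector by value so the caller's sizes are left unchanged.
inline std::size_t added_cases(std::vector<std::size_t> axis_sizes,
                               std::size_t axis) {
    const std::size_t before = combined_cases(axis_sizes);
    axis_sizes.at(axis) += 1;  // throws std::out_of_range on a bad index
    return combined_cases(axis_sizes) - before;
}
```

With hypothetical axes of sizes 3, 4, 2 and a format axis of 4 values, the shared instantiation has 96 cases, and a fifth format adds 24 more. A separate instantiation with a single-value format axis and trimmed other axes keeps the new layout's cost small and explicit.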
b78bef0 to 247a48a
Details:
Tickets: