[CPU] Add RMSNorm jit implementation #26147
src/common/transformations/src/transformations/common_optimizations/rms_fusion.cpp
LGTM, great work!
@@ -67,8 +67,11 @@ RMSFusion::RMSFusion() {
    auto gamma = wrap_type<ov::op::v0::Constant>(type_matches(element::f32));
    auto mul2 = wrap_type<ov::op::v1::Multiply>({gamma, mul1});

    // compress RMS result
    auto comp = wrap_type<ov::op::v0::Convert>({mul2});
    std::shared_ptr<ov::Node> comp = mul2;
Not a requirement for this PR, but still worth mentioning. The ConvertPrecision (FP32->FP16) pass keeps the normalization subgraph in higher precision to maintain accuracy, which results in an additional Convert op (FP32->FP16) at the end of the pattern. Ideally, even for CPU (with fp16 inference precision), we will need to fuse this Convert into the RMS op for better performance. There are basically two possible solutions:
- Match two different patterns (with and without Convert) within the RMSFusion transformation
- Fuse RMS+Convert in a separate transformation (ideally a generic pass suitable for any parent op)
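For readers following the thread, the normalization subgraph under discussion can be written as a scalar reference. This is only a minimal sketch of the usual RMSNorm formula, `y[i] = gamma[i] * x[i] / sqrt(mean(x^2) + eps)`, computed in FP32. The function name `rms_norm_ref` and the `eps` parameter are illustrative assumptions, not identifiers from this PR, and the trailing FP32->FP16 Convert is not modelled.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar FP32 reference for RMS normalization over one row.
// NOTE: `rms_norm_ref` and `eps` are hypothetical names used for illustration;
// they are not taken from the OpenVINO sources.
std::vector<float> rms_norm_ref(const std::vector<float>& x,
                                const std::vector<float>& gamma,
                                float eps) {
    // Mean of squares, accumulated in FP32.
    float sum_sq = 0.0f;
    for (float v : x) {
        sum_sq += v * v;
    }
    const float mean_sq = sum_sq / static_cast<float>(x.size());

    // Reciprocal of the root mean square.
    const float inv_rms = 1.0f / std::sqrt(mean_sq + eps);

    // Normalize, then scale by gamma (the final Multiply in the pattern).
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = gamma[i] * (x[i] * inv_rms);
    }
    return y;
}
```

Either solution listed above would have to produce the same values as this reference, with the result rounded to FP16 once at the end instead of in a separate Convert node.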
Got it.
Is there any reason to implement this test as a custom test for the CPU plugin? I don't see any plugin-specific details in it.
Logically, it should be a shared test with device-specific instances for CPU and GPU.
src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/x64/rms_norm.cpp
### Changes

As stated in the title.

This PR introduced changes in accuracy: openvinotoolkit/openvino#26147, but the difference comes from floating-point error:

![image](https://github.com/user-attachments/assets/e819a14b-3c74-4568-beef-7b43737128a6)

### Reason for changes

As a result, validation with the OpenVINO nightly failed:

![image](https://github.com/user-attachments/assets/23fe22dd-6ec2-4577-98df-f26732c7962a)

### Related tickets

150613
151260

### Tests

- [x] job/NNCF/job/manual/job/post_training_weight_compression/163/
…ts (#3065)

### Changes

Revert of #2954.
Use [references for the OV 2024.3](https://github.com/ljaljushkin/nncf_pytorch/blob/873fbb80ab07bd41a3b1a2e21f68bc1a7a394810/tests/post_training/data/wc_reference_data.yaml) when running conformance tests with OV 2024.5.

### Reason for changes

According to the CPU team:

> 1) In the 24.4 time frame, the RMS impl was enabled (openvinotoolkit/openvino#26147), which introduced some accuracy change due to vrsqrtss, which computes an approximation.
> 2) In the 24.5 time frame, a fix was applied (openvinotoolkit/openvino#26817) to use vsqrtss instead. With that, we can see the result is recovered to that prior to openvinotoolkit/openvino#26147.
>
> With that (in 24.5), the accuracy result is recovered to that prior to the RMS impl openvinotoolkit/openvino#26147. Let's say that 24.4 may have some accuracy issues in some cases; 24.5 fixes that issue.

### Related tickets

156605
156776
156237
151260

### Tests

- [x] 47 build of weekly/job/openvino-nightly/job/post_training_weight_compression
![image](https://github.com/user-attachments/assets/10271e74-4e0b-4506-917a-306481b500e3)
- [x] 48 build for xfail tests
![image](https://github.com/user-attachments/assets/dc5a9424-35fa-4015-965a-ca08fa46e4b7)
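The accuracy difference attributed to vrsqrtss above can be reproduced in isolation. The following is a minimal sketch, not the code from either PR: it compares the SSE approximate reciprocal square root (`_mm_rsqrt_ss`, the intrinsic for rsqrtss) with `1.0f / std::sqrt(x)`, which is assumed here to stand in for the sqrt-based path. The helper names are hypothetical, and on non-x86 targets the approximate path falls back to the exact one, so no difference is observable there.

```cpp
#include <cmath>

#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>
#define RMS_SKETCH_HAS_RSQRTSS 1
#endif

// Approximate 1/sqrt(x) via the rsqrtss instruction.
// Intel documents its maximum relative error as 1.5 * 2^-12.
// NOTE: helper names are hypothetical, chosen for this illustration only.
float inv_sqrt_approx(float x) {
#ifdef RMS_SKETCH_HAS_RSQRTSS
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    // Fallback for non-x86 targets: no hardware approximation to demonstrate.
    return 1.0f / std::sqrt(x);
#endif
}

// 1/sqrt(x) computed with a correctly rounded square root followed by a
// division (assumed equivalent in spirit to the sqrt-based fix).
float inv_sqrt_exact(float x) {
    return 1.0f / std::sqrt(x);
}

// Relative deviation of the approximate path from the sqrt-based one.
float relative_error(float x) {
    const float exact = inv_sqrt_exact(x);
    return std::fabs(inv_sqrt_approx(x) - exact) / exact;
}
```

Because the reciprocal RMS scales every element of the normalized row, a relative error of this order can propagate to all outputs, which would be consistent with the small accuracy shifts reported for 24.4.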