[GPU] Force sdpa use onednn path for prefill and cl path for generation. #27387
src/plugins/intel_gpu/src/graph/impls/ocl/scaled_dot_product_attention.cpp
Looks good to me, left minor comments
…ttention.cpp Co-authored-by: Sergey Shlyapnikov <[email protected]>
@ceciliapeng2011 LGTM, but is this only effective on ARL-H? How about LNL?
#27889) …on... the backport of pull/27387.
### Details:
- *[GPU] Force SDPA use oneDNN path for prefill and clDNN path for generation on ARL-H platform*
- *backport of [pull/27387](#27387)*
### Tickets:
- *[CVS-158461](https://jira.devtools.intel.com/browse/CVS-158461)*
Yes, it is only for ARL-H. We've also benchmarked on LNL and ARC, but found no strong justification for applying it there: the performance difference varies across shapes and models.
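For readers skimming the thread, the dispatch rule discussed here can be sketched roughly as follows. This is only an illustration, not the code from the PR: the names `SdpaImpl`, `Platform`, `is_prefill`, and `choose_sdpa_impl` are hypothetical, and the sketch assumes that prefill can be recognized by a query sequence length greater than 1, which the thread does not state.

```cpp
#include <cstddef>

// Hypothetical names for illustration; not the intel_gpu plugin's actual types.
enum class SdpaImpl { OneDnn, Ocl };
enum class Platform { ArlH, Lnl, Arc, Other };

// Assumption: a query sequence length > 1 marks the prefill stage,
// and a length of 1 marks token-by-token generation.
inline bool is_prefill(std::size_t query_seq_len) {
    return query_seq_len > 1;
}

// On ARL-H, force oneDNN for prefill and the OpenCL kernel for generation.
// On every other platform, return the caller's default choice unchanged.
inline SdpaImpl choose_sdpa_impl(Platform platform,
                                 std::size_t query_seq_len,
                                 SdpaImpl default_impl) {
    if (platform != Platform::ArlH)
        return default_impl;
    return is_prefill(query_seq_len) ? SdpaImpl::OneDnn : SdpaImpl::Ocl;
}
```

Keeping the platform check first mirrors the answer above: LNL and ARC keep their existing selection because the benchmarks gave no clear reason to override it.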