[GPU] gemm_tile supports block read when leftover with 4byte-size align in dynamic. #24535

Merged
merged 4 commits into from
May 22, 2024

hyunback
Contributor

Tickets:

  • 141032

@hyunback hyunback added category: GPU OpenVINO GPU plugin WIP work in progress labels May 16, 2024
@hyunback hyunback requested review from a team as code owners May 16, 2024 03:00
@hyunback hyunback removed the WIP work in progress label May 17, 2024
@e-ddykim e-ddykim added this to the 2024.2 milestone May 20, 2024
The additional mul op caused some regression in lln.
Minimize the cost of the 4-byte alignment check by using the leftover constant.

Signed-off-by: hyunback <[email protected]>
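
One possible reading of the commit message, sketched in plain C rather than OpenCL: if a full tile spans a multiple of 4 bytes, the alignment of `N` elements can be decided from the leftover (`N % TILE_N`) alone, so the check does not need a multiply by the runtime `N`. The names `TILE_N`, `ELEM_BYTES`, and both helper functions are hypothetical, and the 16-wide tile of 2-byte elements is an assumption, not something stated in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for illustration: a 16-wide tile of 2-byte (fp16) elements. */
enum { TILE_N = 16, ELEM_BYTES = 2 };

/* The shortcut below only holds when a full tile is a multiple of 4 bytes. */
static_assert((TILE_N * ELEM_BYTES) % 4 == 0, "full tile must be 4-byte aligned");

/* Direct check: multiplies the runtime value n by the element size. */
static bool row_is_4byte_aligned(size_t n) {
    return (n * ELEM_BYTES) % 4 == 0;
}

/* Same answer computed from the leftover (n % TILE_N): full tiles contribute
 * a multiple of 4 bytes, so they never change the remainder modulo 4. */
static bool leftover_is_4byte_aligned(size_t leftover) {
    return (leftover * ELEM_BYTES) % 4 == 0;
}
```

Under these assumptions the two functions agree for every `n` when the second is called with `n % TILE_N`; when the leftover is known ahead of time, the second check can be evaluated once instead of per load.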
@@ -281,13 +281,13 @@ KERNEL(gemm_tiled_opt)(
#if B_VEC_SIZE == 1
b_tile[b_load_id] = b_raw_global_id > N - 1 ? 0 : b_ptr[sglid];
#else // B_VEC_SIZE == 1
#if TILE_N_NOT_DIVISIBLE
if (TILE_N_NOT_DIVISIBLE == 0 || N_IS_ALIGNED_4BYTE)
Contributor

How about pre-calculating this somewhere before the loops?

Contributor

But let's apply this next time.
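
The review suggestion above amounts to hoisting a loop-invariant condition. A minimal sketch in plain C, not the kernel itself: `load_stats`, `load_b_tiles`, and all parameters are hypothetical stand-ins, and the counters only record which load path each iteration would take.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's loads: counts the path taken per iteration. */
typedef struct {
    size_t block_reads;
    size_t scalar_reads;
} load_stats;

static load_stats load_b_tiles(size_t k_iterations, bool tile_n_not_divisible,
                               size_t n, size_t elem_bytes) {
    load_stats stats = {0, 0};

    /* Loop-invariant: depends only on n and the element size, so it is
     * evaluated once here instead of on every iteration. */
    const bool use_block_read = !tile_n_not_divisible || (n * elem_bytes) % 4 == 0;

    for (size_t k = 0; k < k_iterations; ++k) {
        if (use_block_read) {
            stats.block_reads++;   /* block read path */
        } else {
            stats.scalar_reads++;  /* element-wise fallback path */
        }
    }
    return stats;
}
```

If both operands of the condition are compile-time constants, a compiler will likely fold the check on its own; the hoist matters mainly when `n` is only known at run time.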

@yeonbok yeonbok added this pull request to the merge queue May 22, 2024
Merged via the queue into openvinotoolkit:master with commit 7f4e766 May 22, 2024
100 checks passed
@hyunback hyunback deleted the gemm_tiled_blockread_dynamic branch May 24, 2024 08:38
Labels
category: GPU OpenVINO GPU plugin Code Freeze
4 participants