
[ROCm] Fix warp and lane calculation in blockReduceSum #3321

Merged 2 commits on Mar 11, 2024

Conversation

@kliuae (Contributor) commented on Mar 11, 2024

The fix in #3262 addressed the issue of blockReduceSum using 32-thread warps on architectures whose warp size is 64. However, discrepancies were found in generation results before and after the fix, and the layernorm unit test still did not pass on ROCm.

This PR fixes these issues. It has been tested on an MI210 and passes the layernorm unit test.

@dllehr-amd (Contributor) left a comment:


Good catch on these. This is correctly calculating the lane and wid.

@WoosukKwon (Collaborator) left a comment:


@kliuae Thanks for submitting the PR!
@dllehr-amd Thanks for the review!

@WoosukKwon WoosukKwon merged commit c9415c1 into vllm-project:main Mar 11, 2024
24 checks passed
starmpcc pushed a commit to starmpcc/vllm that referenced this pull request Mar 14, 2024
dtransposed pushed a commit to afeldman-nm/vllm that referenced this pull request Mar 26, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024

3 participants