Enables GQA support in the prefix prefill kernels #3007

Merged
merged 1 commit into vllm-project:main from ht/prefix-gqa on Feb 27, 2024

Conversation

sighingnow
Contributor

No description provided.

@sighingnow
Contributor Author

sighingnow commented Feb 23, 2024

The failure in CI's "Model Test" shouldn't be caused by this pull request; I have noticed the same failure in other PRs as well as on main.

Collaborator

@WoosukKwon WoosukKwon left a comment

@sighingnow Awesome! Thanks for submitting the PR! Left a minor comment on a variable name.

@WoosukKwon
Collaborator

@sighingnow Thanks for the fix! I will merge the PR once it passes the CI tests.

@sighingnow
Contributor Author

> @sighingnow Thanks for the fix! I will merge the PR once it passes the CI tests.

Thank you!

@sighingnow
Contributor Author

> @sighingnow Thanks for the fix! I will merge the PR once it passes the CI tests.

Hi @WoosukKwon, CI is green now. (Just a polite reminder.)

@WoosukKwon WoosukKwon merged commit 71bcaf9 into vllm-project:main Feb 27, 2024
21 checks passed
@sighingnow sighingnow deleted the ht/prefix-gqa branch February 27, 2024 14:06
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024
@@ -17,12 +18,14 @@


@pytest.mark.parametrize("num_heads", NUM_HEADS)
@pytest.mark.parametrize("num_queries_per_kv", NUM_HEADS)
Contributor

Should this parameter be NUM_QUERIES_PER_KV?

Contributor Author

Fixed in #3246.
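
For readers following this thread, here is a rough sketch of why the choice of list matters. It is not the code from this PR or from #3246. In grouped-query attention, `num_queries_per_kv` is the number of query heads that share one KV head, so it is a group size rather than a head count, and sweeping it over `NUM_HEADS` would only test whichever values happen to be in that list. The list values and both helper names below are assumptions made up for this illustration.

```python
import itertools

# Hypothetical values for illustration only; the real lists in the
# vLLM test file may differ.
NUM_HEADS = [64]                 # number of query heads
NUM_QUERIES_PER_KV = [1, 8, 64]  # query heads sharing one KV head


def kv_head_for_query_head(query_head: int, num_queries_per_kv: int) -> int:
    """Map a query head index to the KV head it reads from under GQA."""
    return query_head // num_queries_per_kv


def gqa_configs(num_heads_list, num_queries_per_kv_list):
    """Enumerate (num_heads, num_kv_heads, num_queries_per_kv) triples.

    Combinations where the group size does not divide the number of
    query heads are skipped, since they do not describe a valid layout.
    """
    configs = []
    for num_heads, q_per_kv in itertools.product(
        num_heads_list, num_queries_per_kv_list
    ):
        if num_heads % q_per_kv != 0:
            continue
        configs.append((num_heads, num_heads // q_per_kv, q_per_kv))
    return configs
```

With these assumed lists, a group size of 1 corresponds to ordinary multi-head attention, 64 to multi-query attention, and 8 to a grouped layout in between. In a pytest file the two lists would typically feed two separate `@pytest.mark.parametrize` decorators.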

Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024
3 participants