
Add document for vllm paged attention kernel. #2978

Merged: 5 commits, Mar 4, 2024

Conversation

@pian13131
Contributor

Hello, I am currently studying the vLLM paged attention kernel, and I've found that the implementation can be quite complex for newcomers. After thoroughly reviewing the primary implementation of the kernel in csrc/attention/attention_kernels.cu, I have written this document to provide a high-level understanding of it. The document explains the memory layout, read patterns, and step-by-step calculations, accompanied by diagrams and pseudo-code, and is intended as a reference for anyone interested in how the paged attention kernel is implemented.

Given that I am still a novice in this subject, there may be some misunderstandings in the document. I welcome any comments and advice to improve its accuracy and clarity. Your feedback is highly appreciated! I hope this document can be merged to help others!
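To give a flavor of the read pattern the document covers, here is a minimal, hypothetical sketch: a per-sequence block table maps logical token positions onto scattered physical cache blocks. The names and the simplified cache layout below are illustrative assumptions only, not the kernel's actual identifiers; the real CUDA kernel in csrc/attention/attention_kernels.cu uses a more elaborate vectorized, warp-oriented layout.

```python
import torch

# Toy sizes for illustration only; not the real kernel's layout.
num_blocks, block_size, num_heads, head_size = 16, 4, 2, 8

# Paged KV cache: token positions are scattered across fixed-size physical blocks.
key_cache = torch.randn(num_blocks, block_size, num_heads, head_size)

# Per-sequence block table: logical block index -> physical block number.
block_table = [3, 7, 1]   # this sequence owns physical blocks 3, 7, and 1
context_len = 10          # tokens written so far (fits in 3 blocks of size 4)

def gather_keys(key_cache, block_table, context_len):
    """Reassemble one sequence's keys in logical order from paged storage."""
    keys = []
    for pos in range(context_len):
        logical_block = pos // block_size     # which logical block holds this token
        block_offset = pos % block_size       # position within that block
        physical_block = block_table[logical_block]
        keys.append(key_cache[physical_block, block_offset])
    return torch.stack(keys)                  # [context_len, num_heads, head_size]

print(gather_keys(key_cache, block_table, context_len).shape)
# torch.Size([10, 2, 8])
```

The actual kernel performs this gather implicitly while computing attention, one warp per block, rather than materializing the contiguous key tensor as this sketch does.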


@simon-mo
Collaborator

Thank you for this great write-up!

@zhaoyang-star
Contributor

This doc is very useful. I hope it gets merged soon.

@esmeetu added the documentation label (Improvements or additions to documentation) on Mar 2, 2024
@LiuXiaoxuanPKU
Collaborator

Looks great! Thanks! Some minor comments.

Five review threads on docs/source/dev/kernel/paged_attention.rst, all resolved.
@LiuXiaoxuanPKU merged commit 27a7b07 into vllm-project:main on Mar 4, 2024
22 checks passed
@WoosukKwon
Collaborator

@pian13131 This is AWESOME! Thanks for your contribution!

dtransposed pushed a commit to afeldman-nm/vllm that referenced this pull request Mar 26, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024