Add document for vllm paged attention kernel. #2978
Conversation
For reviewers, rendered here: https://vllm--2978.org.readthedocs.build/en/2978/dev/kernel/paged_attention.html
Thank you for this great write-up!
This doc is very useful. Hope it gets merged soon.
Looks great! Thanks! Some minor comments.
@pian13131 This is AWESOME! Thanks for your contribution!
Hello, I am currently studying the vLLM paged attention kernel, and I've found that the implementation can be quite complex for newcomers. After thoroughly reviewing the primary implementation of the kernel in `csrc/attention/attention_kernels.cu`, I have composed this document to provide a high-level understanding of the paged attention kernel. The document covers the memory layout, read patterns, and step-by-step calculations, accompanied by diagrams and pseudo-code. It is intended to serve as a useful reference for anyone interested in how the paged attention kernel is implemented.

Since I am still a novice in this subject, the document may contain some misunderstandings. I welcome any comments and advice to improve its accuracy and clarity. Your feedback is highly appreciated, and I hope this document can be merged to help others!
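As a rough illustration of the kind of memory layout and read pattern the document walks through, here is a minimal, hypothetical C++ sketch of how a block table might map a token's logical position to its physical slot in a paged KV cache. The names and constants (`block_table`, `BLOCK_SIZE`, `HEAD_SIZE`) are illustrative assumptions only and do not mirror the actual kernel code; see the rendered document above for the real layout.

```cpp
// Hypothetical sketch: locating one token's key vector in a paged KV cache.
// Layout assumption: the cache is a flat array of physical blocks, each block
// holding BLOCK_SIZE tokens, each token holding HEAD_SIZE elements per head.
#include <cstdio>
#include <vector>

constexpr int BLOCK_SIZE = 16;   // tokens per physical block (illustrative)
constexpr int HEAD_SIZE  = 128;  // elements per head (illustrative)

// Returns the offset (in elements) of token `token_idx` of a sequence, given
// the sequence's block table (logical block index -> physical block index).
size_t key_offset(const std::vector<int>& block_table, int token_idx) {
    int logical_block  = token_idx / BLOCK_SIZE;      // which block the token lives in
    int block_offset   = token_idx % BLOCK_SIZE;      // position inside that block
    int physical_block = block_table[logical_block];  // indirection through the block table
    return (static_cast<size_t>(physical_block) * BLOCK_SIZE + block_offset) * HEAD_SIZE;
}

int main() {
    // A sequence whose logical blocks 0, 1, 2 were allocated to physical blocks 7, 3, 9.
    std::vector<int> block_table = {7, 3, 9};
    // Token 20 falls in logical block 1 (physical block 3), slot 4 within that block.
    std::printf("offset = %zu elements\n", key_offset(block_table, 20));
    return 0;
}
```

The real kernel performs these lookups cooperatively across a thread group and reads the data in a vectorized pattern; the document linked above describes that in detail.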