PagedAttention experimental operation #23837

Merged

@slyalin slyalin commented Apr 3, 2024

The PagedAttention operation is exposed in the Python API only, to simplify the vLLM OpenVINO integration. It is not intended for use outside our integration work in vLLM and similar applications where PagedAttention applies. It is exposed as a hidden part of the API and will not be documented. It is connected to the existing implementation in the CPU plugin.

The operation is not part of any public opset.

@slyalin slyalin requested a review from a team as a code owner April 3, 2024 11:04
@github-actions github-actions bot added the "category: Python API" (OpenVINO Python bindings) label Apr 3, 2024
@moslex moslex added this to the 2024.1 milestone Apr 3, 2024
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Apr 4, 2024
@akuporos akuporos requested a review from mitruska April 4, 2024 07:58
Merged via the queue into openvinotoolkit:master with commit 492699d Apr 4, 2024
108 checks passed
bbielawx pushed a commit to bbielawx/openvino that referenced this pull request Apr 12, 2024
alvoron pushed a commit to alvoron/openvino that referenced this pull request Apr 29, 2024
Labels
category: Python API (OpenVINO Python bindings), Code Freeze
5 participants