Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix integer overflows in attention & cache ops #1514

Merged
merged 3 commits into from
Oct 31, 2023
Merged

Fix integer overflows in attention & cache ops #1514

merged 3 commits into from
Oct 31, 2023

Conversation

WoosukKwon
Copy link
Collaborator

Fixes #1486

This PR fixes the overflows in paged attention & cache ops when the number of blocks is huge.

Copy link
Member

@zhuohan123 zhuohan123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Left a small comment about one potential overflow point.

+ head_offset * block_size
+ block_offset;
const int64_t tgt_key_idx = block_idx * num_heads * (head_size / x) * block_size * x
+ head_idx * (head_size / x) * block_size * x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the numbers in this expression is int: head_idx * (head_size / x) * block_size * x. For safety, should we change all numbers to int64_t? I think this should only bring negligible overhead but I'm not 100% sure.

Copy link
Collaborator Author

@WoosukKwon WoosukKwon Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok because block_idx is int64_t. If I understand correctly, the only expression that can cause overflow is block_idx * num_heads * (head_size / x) * block_size * x. Since block_idx is int64_t, this expression and the subsequent additions to this value are all evaluated as int64_t.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhuohan123 Does this relieve your concern?

@WoosukKwon WoosukKwon merged commit 0ce8647 into main Oct 31, 2023
2 checks passed
@WoosukKwon WoosukKwon deleted the overflow branch October 31, 2023 22:19
@JF-D
Copy link
Contributor

JF-D commented Nov 1, 2023

Thanks for the quick fix!!!

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Array Index Overflow for Large #Blocks
3 participants