Fix integer overflows in attention & cache ops #1514
LGTM! Left a small comment about one potential overflow point.
    + head_offset * block_size
    + block_offset;
const int64_t tgt_key_idx = block_idx * num_heads * (head_size / x) * block_size * x
                            + head_idx * (head_size / x) * block_size * x
All the numbers in this expression are int: head_idx * (head_size / x) * block_size * x. For safety, should we change all of them to int64_t? I think this should only bring negligible overhead, but I'm not 100% sure.
I think this is ok because block_idx is int64_t. If I understand correctly, the only expression that can cause overflow is block_idx * num_heads * (head_size / x) * block_size * x. Since block_idx is int64_t, this expression and the subsequent additions to this value are all evaluated as int64_t.
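The promotion argument above can be checked at compile time. The following is a minimal sketch in plain host-side C++, not the actual CUDA kernel: it assumes block_idx is int64_t and every other operand is a plain int, and the helper name tgt_key_index is hypothetical.

```cpp
#include <cstdint>
#include <type_traits>

// Leading term: block_idx is int64_t and multiplication associates left to
// right, so every intermediate product is already int64_t.
static_assert(std::is_same<
    decltype(int64_t{} * int{} * (int{} / int{}) * int{} * int{}),
    int64_t>::value, "leading term is evaluated in 64 bits");

// Second term: all operands are int, so the product itself is computed in
// int and only widened to int64_t when it is added to the leading term.
static_assert(std::is_same<
    decltype(int{} * (int{} / int{}) * int{} * int{}),
    int>::value, "addend is evaluated in int");

// Hypothetical helper mirroring the index expression from the diff.
inline int64_t tgt_key_index(int64_t block_idx, int num_heads, int head_size,
                             int block_size, int x, int head_idx) {
  return block_idx * num_heads * (head_size / x) * block_size * x
       + head_idx * (head_size / x) * block_size * x;
}
```

Under these assumptions the int addend is bounded by the size of a single block, because head_idx is less than num_heads. It can only overflow if num_heads * head_size * block_size itself exceeds the 32-bit range, which is the case the reviewer's suggestion would guard against.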
@zhuohan123 Does this relieve your concern?
Thanks for the quick fix!!!
Fixes #1486
This PR fixes the overflows in paged attention & cache ops when the number of blocks is huge.
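To give a rough sense of how many blocks count as "huge", the sketch below estimates the first block index whose base offset no longer fits in a 32-bit int. The helper names and the shape values in the note are hypothetical and are not taken from the PR.

```cpp
#include <cstdint>
#include <limits>

// Elements stored per cache block, computed in 64 bits so that this helper
// itself cannot overflow.
inline int64_t elems_per_block(int num_heads, int head_size, int block_size) {
  return static_cast<int64_t>(num_heads) * head_size * block_size;
}

// Smallest block index whose base offset (block_idx * per_block) exceeds
// the largest value a 32-bit signed int can hold.
inline int64_t first_overflowing_block(int64_t per_block) {
  return std::numeric_limits<int32_t>::max() / per_block + 1;
}
```

For example, with the assumed values num_heads = 40, head_size = 128, and block_size = 16, a block holds 81920 elements, so a 32-bit offset would overflow starting at block index 26215.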