Commit

[Core] latency optimization (vllm-project#3890)
youkaichao authored and joerunde committed Apr 11, 2024
1 parent d0bc197 commit 9d9b6c4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion vllm/core/block_manager_v1.py
@@ -328,7 +328,7 @@ def _is_last_block_full(
         self,
         seq: Sequence,
     ) -> bool:
-        token_ids_len = len(seq.data.get_token_ids())
+        token_ids_len = seq.data.get_len()
         return token_ids_len > 0 and token_ids_len % seq.block_size == 0

     def _maybe_promote_last_block(
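
Why the one-line change can reduce latency, as a minimal sketch: it assumes that get_token_ids() builds a fresh list from the prompt and output tokens on every call, while get_len() only adds two lengths. The diff does not show either method body, so this is an assumption about the code, not a description of it. ToySequenceData, its fields, and the free function is_last_block_full are hypothetical stand-ins written for illustration, not vLLM's classes.

```python
from typing import List


class ToySequenceData:
    """Hypothetical stand-in for the object behind seq.data (not vLLM's class)."""

    def __init__(self, prompt_token_ids: List[int]) -> None:
        self.prompt_token_ids = list(prompt_token_ids)
        self.output_token_ids: List[int] = []

    def append_token_id(self, token_id: int) -> None:
        self.output_token_ids.append(token_id)

    def get_token_ids(self) -> List[int]:
        # Assumed behavior: allocates and copies a new list on every call,
        # so the cost grows linearly with the sequence length.
        return self.prompt_token_ids + self.output_token_ids

    def get_len(self) -> int:
        # Two constant-time len() calls; nothing is allocated or copied.
        return len(self.prompt_token_ids) + len(self.output_token_ids)


def is_last_block_full(data: ToySequenceData, block_size: int) -> bool:
    # Same check as in the diff, written against the toy class.
    token_ids_len = data.get_len()
    return token_ids_len > 0 and token_ids_len % block_size == 0
```

Under that assumption both expressions return the same number, so the check's result is unchanged; only the per-call list copy goes away, which matters if the check runs once per sequence on every scheduling step.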
