This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Commit
[Misc] fix docstrings (vllm-project#4191)
Co-authored-by: Zhong Wang <[email protected]>
2 people authored and robertgshaw2-neuralmagic committed Apr 21, 2024
1 parent 50e1d90 commit 58dbe5f
Showing 1 changed file with 3 additions and 6 deletions.
9 changes: 3 additions & 6 deletions vllm/sequence.py
@@ -160,7 +160,7 @@ def reset_state_for_recompute(self) -> None:
         self._stage = SequenceStage.PREFILL

     def get_num_uncomputed_tokens(self) -> int:
-        """Return the number of prefil tokens that are not computed."""
+        """Return the number of prefill tokens that are not computed."""
         # we use `get_len()` which includes prompt_len + output_len instead
         # of prompt_len here. This is because during recompute we need to
         # prefill for both prompt and output.
@@ -345,12 +345,9 @@ def fork(self, new_seq_id: int) -> "Sequence":
     def get_num_new_tokens(self) -> int:
         """Get the number of new tokens to be computed.

-        Args:
-            remainig_token_budget: The remaining token budgets.
         Returns:
-            The new number of tokens to be computed. I.e., 1 for decode, prompt
-            size for prefill. If there's not enough remainig_token_budget, it
-            can return the chunked number of new tokens.
+            The new number of tokens to be computed. I.e., 1 for decode, or
+            the remaining prompt size for prefill.
         """
         if self.data.stage == SequenceStage.DECODE:
             return 1
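
For readers who want to see the behavior the corrected docstrings describe, here is a minimal, self-contained sketch. It is not the vllm implementation: `ToySequenceData` and its `num_computed_tokens` field are hypothetical stand-ins, and the body of the prefill branch is an assumption inferred from the docstrings and the `get_len()` comment in the diff.

```python
import enum
from dataclasses import dataclass, field
from typing import List


class SequenceStage(enum.Enum):
    PREFILL = enum.auto()
    DECODE = enum.auto()


@dataclass
class ToySequenceData:
    """Hypothetical stand-in for the sequence state touched by this diff."""

    prompt_token_ids: List[int]
    output_token_ids: List[int] = field(default_factory=list)
    num_computed_tokens: int = 0  # assumed bookkeeping field
    stage: SequenceStage = SequenceStage.PREFILL

    def get_len(self) -> int:
        # Prompt length plus output length, as the diff's comment describes.
        return len(self.prompt_token_ids) + len(self.output_token_ids)

    def get_num_uncomputed_tokens(self) -> int:
        # Uses get_len() rather than the prompt length alone, so that a
        # recompute prefills both the prompt and any generated output.
        return self.get_len() - self.num_computed_tokens

    def get_num_new_tokens(self) -> int:
        # 1 for decode, or the remaining (uncomputed) size for prefill.
        if self.stage == SequenceStage.DECODE:
            return 1
        return self.get_num_uncomputed_tokens()


if __name__ == "__main__":
    data = ToySequenceData(prompt_token_ids=[1, 2, 3, 4, 5])
    print(data.get_num_new_tokens())  # fresh prefill: whole prompt remains

    data.num_computed_tokens = 3
    print(data.get_num_new_tokens())  # partially prefilled: remainder only

    data.stage = SequenceStage.DECODE
    print(data.get_num_new_tokens())  # decode: always one token
```

Note that the method takes no budget argument, which is why the commit drops the stale `Args:` entry; in this sketch any chunking against a token budget would have to happen in the caller.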
