
[Core] Adding token ranks along with logprobs #3516

Merged: 6 commits merged into vllm-project:main on Mar 25, 2024

Conversation

SwapnilDreams100 (Contributor):
Adds token rank functionality (for both prompt tokens and sampled tokens) within the Logprob object, by adding a rank property to Logprob.
This PR also adds a couple of tests to verify the ranks in greedy and non-greedy settings.

The implementation here is meant to align with the functionality in https://github.com/IBM/text-generation-inference (IBM's fork of HF TGI).
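
For context, a hedged sketch of how the new field could surface to a user once merged (the model name is arbitrary, and the output traversal is an assumption based on vLLM's RequestOutput structure, not part of this diff):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(logprobs=3, prompt_logprobs=3)
out = llm.generate(["Hello"], params)[0]

# Each sampled position maps candidate token ids to Logprob objects;
# with this PR, each Logprob also carries the token's rank within the
# full vocabulary distribution (1 = most likely).
for pos in out.outputs[0].logprobs:
    for token_id, lp in pos.items():
        print(token_id, lp.logprob, lp.rank)
```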

@SwapnilDreams100 SwapnilDreams100 changed the title feat: adding token ranks along with logprobs [CORE] Adding token ranks along with logprobs Mar 20, 2024
@SwapnilDreams100 SwapnilDreams100 changed the title [CORE] Adding token ranks along with logprobs [Core] Adding token ranks along with logprobs Mar 20, 2024
@@ -447,6 +447,9 @@ def _sample(
]
return sample_results

def _get_ranks(x: torch.Tensor, indices: List[int]) -> torch.Tensor:
Collaborator:

Can you add a docstring to this func? E.g., the shape of the x tensor, and what exactly the indices mean?

Contributor (author):

will do!
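
A minimal sketch of what the requested docstring could look like; the shapes and semantics below are inferred from the surrounding diff, not the PR's final code:

```python
from typing import List
import torch

def _get_ranks(x: torch.Tensor, indices: List[int]) -> torch.Tensor:
    """Compute the rank of each chosen token within its distribution.

    Args:
        x: Logprobs tensor of shape [num_tokens, vocab_size].
        indices: Chosen token id per row; len(indices) == num_tokens.

    Returns:
        A 1-D tensor of 1-based ranks, where rank 1 means the chosen
        token had the highest logprob in its row.
    """
    chosen = x[torch.arange(x.size(0)), indices]          # per-row chosen logprob
    return (x > chosen.unsqueeze(-1)).long().sum(-1) + 1  # count strictly greater
```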

@@ -17,9 +17,9 @@
class Logprob:
"""Infos for supporting OpenAI compatible logprobs."""
logprob: float
rank: Optional[int] = None
Collaborator:

Can you add a comment explaining what this means here?

Contributor (author):

will do!
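
A sketch of what the commented field could look like; the comment wording is illustrative, not vLLM's final text:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Logprob:
    """Infos for supporting OpenAI compatible logprobs."""
    logprob: float
    # 1-based rank of this token among all vocabulary tokens when sorted
    # by logprob (rank 1 = the most likely token), or None if ranks were
    # not computed.
    rank: Optional[int] = None
```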

@Yard1 Yard1 self-requested a review March 20, 2024 03:03
njhill (Member) left a comment:

Comment on lines 511 to 513
batched_ranks_query_result = _get_ranks(
logprobs[batched_logprobs_query_seq_indices],
batched_logprobs_query_token_indices)
Collaborator:

Let's move this to after line 526, since we'll have to force a CPU<->GPU sync here anyway.

Contributor (author):

will do!
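
For readers unfamiliar with the concern: reading GPU tensor values from the CPU (e.g. via .cpu() or .tolist()) blocks until the device catches up, so the suggestion is to issue the ranks query alongside the existing logprobs work and cross the device boundary only once. A toy, standalone illustration with stand-in data (not the PR's actual code):

```python
import torch

logprobs = torch.randn(4, 10)   # stand-in for the sampler's logprobs rows
token_indices = [1, 5, 2, 7]    # stand-in for the chosen token per row

# Issue all tensor work first, while everything can stay on the device...
chosen = logprobs[torch.arange(4), token_indices]
ranks = (logprobs > chosen.unsqueeze(-1)).long().sum(-1) + 1

# ...then cross the CPU<->GPU boundary once at the end; on CUDA, this
# read is where the synchronization actually happens.
print(ranks.tolist())
```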

Yard1 (Collaborator) left a comment:

This looks good, please move the invocation as outlined in the comment before merge!

njhill (Member) commented Mar 20, 2024:

One question I guess is whether these should also be exposed in the OpenAI API responses (but that doesn't necessarily need to be addressed in this PR).

@simon-mo simon-mo merged commit 819924e into vllm-project:main Mar 25, 2024
32 checks passed
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 31, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024