Update logprob for prefill step #1

vvchernov
Collaborator

See PR for details

Owner

@zxybazh zxybazh left a comment

Hi, thanks for sending in the PR! The mlc serve part of the fix looks pretty good to me. On the relax model part, could you please explain a bit more about all_logits and the use of last_logits, to help me understand how they are used for logprobs?

@vvchernov
Collaborator Author

I think the misunderstanding arose because this part is not finished yet.
The main idea is that the prefill step should output the logits for all tokens so that logprobs can be computed from them, but at the moment the output is always cut down to the last token on the topology side.
I see two practical scenarios that would use this: 1. the loglikelihood approach in accuracy benchmarks 2. speculative decoding

Definitions:
last_logits are the logits of the new token (the last token in the sequence). This is what is output now, without the fix.
logits are the logits of the whole token sequence; they are output only with the all_logits key.
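
To make the two definitions concrete, here is a minimal NumPy sketch of how they relate. The sizes, the random values, and the (seq_len, vocab_size) layout are assumptions for illustration only; the actual relax models may lay out their output tensors differently.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
seq_len, vocab_size = 4, 8

rng = np.random.default_rng(0)

# "logits": one row of vocabulary scores per position of the prompt.
# Assumed layout: (seq_len, vocab_size).
logits = rng.standard_normal((seq_len, vocab_size))

# "last_logits": only the row for the final position, i.e. the scores
# used to pick the next (new) token.
last_logits = logits[-1]

assert logits.shape == (seq_len, vocab_size)
assert last_logits.shape == (vocab_size,)
```

In this picture, last_logits is simply the final row of logits, so everything available today can be recovered from the full output, but not the other way around.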

I see two options for implementing this; both have their pros and cons:

  1. Return logits instead of last_logits. This is backward-compatible, but we then have to take care of cutting out the last logits (for the last token) on the CPU side, outside of the topology, which can potentially degrade performance.
  2. My current implementation, which requires redoing this for all relax models (not backward-compatible).
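
As a rough sketch of what option 1 implies on the host side, the snippet below takes the full prefill logits, computes logprobs of the prompt tokens (the loglikelihood use case), and cuts out the last row for sampling. The function names `log_softmax` and `split_prefill_output`, the (seq_len, vocab_size) layout, and the use of NumPy are all assumptions for illustration; this is not the code from the PR.

```python
import numpy as np


def log_softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over the vocabulary axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def split_prefill_output(logits: np.ndarray, token_ids):
    """Hypothetical helper: derive prompt logprobs and last_logits.

    logits    -- assumed shape (seq_len, vocab_size), one row per prompt token
    token_ids -- the seq_len prompt token ids
    """
    token_ids = np.asarray(token_ids)
    logprobs = log_softmax(logits)

    # Row i scores the token at position i + 1, so the first prompt token
    # has no logprob and the last row predicts the not-yet-generated token.
    positions = np.arange(len(token_ids) - 1)
    prompt_logprobs = logprobs[positions, token_ids[1:]]

    # The "cut" that option 1 moves outside of the topology.
    last_logits = logits[-1]
    return prompt_logprobs, last_logits


# Usage with made-up values: uniform logits over a 4-token vocabulary.
example_logits = np.zeros((3, 4))
example_prompt_logprobs, example_last = split_prefill_output(example_logits, [1, 2, 3])
```

The extra slice and the transfer of the full (seq_len, vocab_size) array to the host are where the potential performance cost of option 1 would come from.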

Owner

@zxybazh zxybazh left a comment

LGTM, thanks for the quick update following OpenAI's update.

@zxybazh zxybazh merged commit b64db01 into zxybazh:feature/2023-11-22/enable-mlc-server-logprobs Dec 19, 2023
@vvchernov vvchernov deleted the vc/prefill_logprob branch December 20, 2023 07:45