Add logits processors to enable logit_bias in OpenAI server #535
Conversation
```python
if logits_processors is not None:
    for logits_processor in logits_processors:
        logits = logits_processor(logits, output_tokens)
```
In this call, you pass output_tokens to logits_processor(). However, in the LogitsProcessor interface, the output_tokens parameter does not exist:

```python
def __call__(self, logits: torch.tensor) -> torch.tensor:
```

How does it work?
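For reference, a processor matching the two-argument call site above might look like the following. This is a minimal sketch assuming that interface; the base-class shape and the NoRepeatProcessor example are illustrative, not code from the PR:

```python
# Minimal sketch, assuming the two-argument interface implied by the call
# site above; class and parameter names are illustrative, not from the PR.
from typing import List

import torch


class LogitsProcessor:
    def __call__(self, logits: torch.Tensor,
                 output_tokens: List[int]) -> torch.Tensor:
        # Identity processor: return the logits unchanged.
        return logits


class NoRepeatProcessor(LogitsProcessor):
    """Example: forbid sampling any token that was already generated."""

    def __call__(self, logits: torch.Tensor,
                 output_tokens: List[int]) -> torch.Tensor:
        if output_tokens:
            # Set the logits of previously generated token ids to -inf.
            logits[list(set(output_tokens))] = -float("inf")
        return logits
```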
Thank you for your PR! Now that #1469 is merged with the logits_processors API, can you help rebase and use that instead?
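For context, the logits_processors API referenced above can be used roughly like this. This is a hedged sketch assuming a processor is a callable taking the generated token ids and the logits tensor; exact signatures may differ across vLLM versions, and the model name and ban_token helper are illustrative:

```python
# Sketch of the logits_processors API (assumption: a processor is a callable
# taking the generated token ids and the logits tensor; details may vary
# between vLLM versions).
from typing import List

import torch
from vllm import LLM, SamplingParams


def ban_token(token_ids: List[int], logits: torch.Tensor) -> torch.Tensor:
    # Make token id 42 impossible to sample.
    logits[42] = -float("inf")
    return logits


llm = LLM(model="facebook/opt-125m")
params = SamplingParams(max_tokens=16, logits_processors=[ban_token])
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```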
@zacharyblank Any update? This will be great for advanced users. vllm/outlines is nice, but nothing beats a pure code-based state machine when it comes to performance.
Thanks for your contribution. Closing this PR, as this feature is now supported in #3027.
Co-authored-by: sang <[email protected]>
This PR makes it possible to define custom logits processors that alter the probability of token generation based on user-defined code. This also allows the vLLM OpenAI server to accept requests with logit_bias. The BiasLogitsProcessor is included specifically for the OpenAI server to handle requests with logit_bias.
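A minimal sketch of what such a bias processor could look like, assuming the two-argument processor interface used elsewhere in this PR; the PR's actual BiasLogitsProcessor may differ in detail:

```python
# Hypothetical sketch of a bias processor for OpenAI-style logit_bias
# requests; the PR's actual BiasLogitsProcessor may differ.
from typing import Dict, List

import torch


class BiasLogitsProcessor:
    """Adds a fixed additive bias to the logits of selected token ids.

    `biases` maps token id -> bias, typically in [-100, 100] as in the
    OpenAI API's logit_bias parameter.
    """

    def __init__(self, biases: Dict[int, float]):
        self.biases = biases

    def __call__(self, logits: torch.Tensor,
                 output_tokens: List[int]) -> torch.Tensor:
        for token_id, bias in self.biases.items():
            logits[token_id] += bias
        return logits
```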