forked from vllm-project/vllm
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[BugFix][Frontend] Use LoRA tokenizer in OpenAI APIs
Currently the LoRA tokenizers aren't used in the OpenAI APIs, meaning the behaviour won't be correct if adapters are used that have custom added tokens. This PR includes changes to address that. It mostly replaces vllm-project#3512. More work is needed to address remaining inconsistencies in tokenization behaviour between the OpenAI front-end and standalone LLMEngine/AsyncLLMEngine use, including: - Standalone cases don't honor truncation and add_special_tokens request parameters - OpenAI API cases don't make use of TokenizerGroups for possible parallelization of tokenization As well as some other inefficiencies. But these are to be addressed in follow-on PRs.
- Loading branch information
Showing
7 changed files
with
144 additions
and
116 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.