
[Frontend] OpenAI API server: Add add_special_tokens to ChatCompletionRequest (default False) #5278

Merged · 3 commits merged into vllm-project:main on Jun 5, 2024

Conversation

tomeras91 (Contributor)

#4688 changed how messages are formatted into a prompt for the chat endpoint: the prompt is now tokenized with add_special_tokens=False, so a BOS token is not added. The assumption is that the chat template takes care of adding all needed special tokens.
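For context, here is a minimal sketch of the underlying tokenizer behavior; the model name and prompt are only examples and are not taken from this PR:

```python
# Illustration only (not vLLM code): how add_special_tokens affects
# Hugging Face tokenization when the chat template already emits a BOS token.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
prompt = "<s>[INST] Hello [/INST]"  # chat template output that already contains BOS

with_special = tokenizer(prompt, add_special_tokens=True).input_ids
without_special = tokenizer(prompt, add_special_tokens=False).input_ids

# With add_special_tokens=True the tokenizer prepends another BOS,
# duplicating the one already produced by the chat template.
print(with_special[:3])
print(without_special[:3])
```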

This PR makes that behavior configurable instead of hardcoded. Adding add_special_tokens as a field on ChatCompletionRequest (defaulting to False) lets the user control whether a BOS token is added. This is useful because not all chat templates add the BOS token themselves.
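As an illustration, a client of the OpenAI-compatible server could opt in to the new field roughly as follows; the server URL, model name, and the use of extra_body for vLLM-specific request fields are assumptions for the sketch, not part of this PR:

```python
# Sketch: passing add_special_tokens through the OpenAI-compatible chat endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}],
    # Extra request fields accepted by the server can be sent via extra_body.
    # add_special_tokens defaults to False; set it to True to prepend a BOS token.
    extra_body={"add_special_tokens": True},
)
print(response.choices[0].message.content)
```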

@DarkLight1337 (Member)

LGTM, thanks for making it configurable!

simon-mo merged commit f0a5005 into vllm-project:main on Jun 5, 2024
88 of 90 checks passed
blinkbear pushed a commit to blinkbear/vllm that referenced this pull request Jun 6, 2024
robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 11, 2024
joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024
tomeras91 deleted the configurable-bos branch on August 12, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024