While working with the tokenizer output for the llama-2-7b-chat-hf model, I noticed that the `prompt_token_ids` generated in this place contain an extra `<s>` token at the beginning of the sequence.

For example, for the following prompt `<s>[INST] what is the color of the snow? [/INST]`, the HF tokenizer tokenizes it directly without duplicating the BOS token, but for the very same prompt vLLM produces prompt token IDs with an extra token `1` (i.e. `<s>`) at the beginning.

Looking forward to having someone help me confirm whether this is intended behaviour or caused by one of the model options.
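A minimal sketch (not from the original report) of what I suspect is happening, assuming the discrepancy comes from the tokenizer's default `add_special_tokens=True`, which prepends BOS (`<s>`, id 1) on top of the `<s>` already spelled out in the prompt string:

```python
# Sketch only: assumes access to the gated meta-llama/Llama-2-7b-chat-hf
# checkpoint on the Hugging Face Hub.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
prompt = "<s>[INST] what is the color of the snow? [/INST]"

# Default behaviour: BOS is prepended in addition to the literal "<s>"
# in the prompt, so the ids start with two 1s.
with_special = tokenizer.encode(prompt)  # add_special_tokens=True by default

# With special-token insertion disabled, only the "<s>" written in the
# prompt itself is kept, so the ids start with a single 1.
without_special = tokenizer.encode(prompt, add_special_tokens=False)

print(with_special[:3])
print(without_special[:3])
```

If vLLM tokenizes the raw prompt with the default settings while the manual HF comparison used `add_special_tokens=False` (or a prompt without the literal `<s>`), that would explain the extra leading token 1.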