Fix GGUFs with no BOS token present, mainly qwen2 models. #6119
Conversation
Do you have an example of a model without the bos token defined? It's impossible to do anything without knowing what the eos/bos tokens are; the jinja template will generate a wrong output.
In Qwen2's config.json, both "bos_token_id" and "eos_token_id" are 151643 (<|endoftext|>), and in tokenizer_config.json, "bos_token" is null.
Here is an example model link: https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf/tree/main
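To make the situation above concrete, here is a minimal sketch (hypothetical helper, not code from this PR) that inspects the two config files the way the comment describes: config.json declares `bos_token_id == eos_token_id == 151643`, while tokenizer_config.json leaves `bos_token` as null.

```python
import json

# Values mirror the Qwen2 configs described in the comment above.
config = json.loads('{"bos_token_id": 151643, "eos_token_id": 151643}')
tokenizer_config = json.loads('{"bos_token": null, "eos_token": "<|endoftext|>"}')

def describe_bos(config: dict, tokenizer_config: dict) -> str:
    """Summarize whether a usable BOS token string is actually defined."""
    if tokenizer_config.get("bos_token") is None:
        return "no BOS token defined (bos_token is null)"
    return f"BOS token: {tokenizer_config['bos_token']}"

print(describe_bos(config, tokenizer_config))
```

Note that `bos_token_id` being set in config.json is not enough: the tokenizer-level `bos_token` string is what templates and frontends actually prepend.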
I'm not sure I agree with setting one. If we put this as a BOS token, the UI will add it to your prompts. Now, was the model trained that way, or did the Qwen team just fill out their configs in a lazy manner? Were the fine-tuned models trained that way? The same way for all? And what does regular llama.cpp do? Is that token 11 really null, or is it a comma? What gets sent to the model? Smells like ambiguity.
The finetune uses the same chat template as Qwen2.
Then they probably tune with no bos token. BOS is null and not put in the template. |
The config in the instruction-tuned Qwen2 model is the same, though.
I see that indeed Qwen2-72B doesn't have the bos token defined: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/tokenizer_config.json#L30 In this case, I think it can be set as "" if missing.
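The fallback suggested here can be sketched as follows (a hypothetical helper, assuming the tokenizer config has already been parsed into a dict; not the actual PR code):

```python
def resolve_bos_token(tokenizer_config: dict) -> str:
    """Return the model's BOS token string, falling back to "" when it is
    missing or null, so nothing spurious gets prepended to prompts."""
    bos = tokenizer_config.get("bos_token")
    return bos if bos is not None else ""

# Qwen2-style config: bos_token is null, so the empty string is used.
print(repr(resolve_bos_token({"bos_token": None})))
```

Using "" rather than inventing a token sidesteps the question of whether the model was trained with a BOS at all.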
I still think this should be fixed by the model creator upstream, not here.
The model creator can't fix it. Qwen has no BOS token. |
Yes... indeed... |
Merged (#6119). Co-authored-by: oobabooga <[email protected]>
Despite the BOS token fix, I still get this error:
Happens on Tess qwen2 or Dolphin qwen2. They are missing that key in the GGUF metadata, and the models refuse to load.
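The load failure described above comes from assuming the BOS metadata key is always present. A minimal sketch of a tolerant lookup (hypothetical helper; `tokenizer.ggml.bos_token_id` is the standard GGUF metadata key, but these Qwen2-based GGUFs omit it):

```python
from typing import Optional

def get_bos_token_id(gguf_metadata: dict) -> Optional[int]:
    """Read the BOS token id from parsed GGUF metadata, returning None
    instead of raising KeyError when the GGUF omits the key entirely."""
    return gguf_metadata.get("tokenizer.ggml.bos_token_id")

# A Qwen2-style GGUF missing the key simply yields None at load time.
print(get_bos_token_id({"tokenizer.ggml.eos_token_id": 151643}))
```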
There's another issue when using plain llama.cpp, where it complains that it's inserting the BOS token twice. It seems llama.cpp automatically interprets a missing BOS token as token "11", which can be "," but is probably some sort of null.
I can't imagine it's able to insert any kind of token if the "," means null.
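The double-insertion warning above can be avoided by only prepending a BOS when one is actually defined and not already present. A minimal sketch under those assumptions (hypothetical helper, not llama.cpp's actual logic):

```python
from typing import List, Optional

def add_bos_once(token_ids: List[int], bos_id: Optional[int]) -> List[int]:
    """Prepend the BOS token id only when the model defines one and the
    prompt doesn't already start with it, avoiding a duplicate BOS."""
    if bos_id is None:
        return token_ids  # model has no BOS token: leave the prompt alone
    if token_ids and token_ids[0] == bos_id:
        return token_ids  # BOS already present: don't insert it twice
    return [bos_id] + token_ids

# With no BOS defined (Qwen2-style), the token list passes through unchanged.
print(add_bos_once([151643, 42], None))
```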