
Fix GGUFs with no BOS token present, mainly qwen2 models. #6119

Merged
2 commits merged into oobabooga:dev on Jun 14, 2024

Conversation

Ph0rk0z
Copy link
Contributor

@Ph0rk0z Ph0rk0z commented Jun 13, 2024

Despite the BOS token fix, I still get this error:

  File "/home/supermicro/ai/text-generation-webui-testing/modules/models_settings.py", line 73, in get_model_metadata
    bos_token = metadata['tokenizer.ggml.tokens'][metadata['tokenizer.ggml.bos_token_id']]
                                                  ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'tokenizer.ggml.bos_token_id'

Happens with tess qwen2 and dolphin qwen2: they are missing that key in the GGUF metadata, and the models refuse to load.
There's another issue when using plain llama.cpp, where it complains that the BOS token is being inserted twice. It seems llama.cpp automatically interprets a missing BOS token as token 11, which displays as "," but is probably some sort of null.

I can't imagine it's able to insert any meaningful token if that "," is really a null.
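
For reference, a minimal sketch of confirming the key is absent, assuming the gguf Python package's GGUFReader (the file name is just an example):

    # Check the GGUF metadata for the key the traceback above blows up on.
    from gguf import GGUFReader

    reader = GGUFReader("dolphin-2.9.2-qwen2-72b.Q4_K_M.gguf")
    print("tokenizer.ggml.bos_token_id" in reader.fields)  # False on these models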

@oobabooga
Copy link
Owner

oobabooga commented Jun 14, 2024

Do you have an example of a model without the BOS token defined? It's impossible to do anything without knowing what the eos/bos tokens are; otherwise the Jinja template will generate a wrong output.

@Touch-Night
Copy link
Contributor

Touch-Night commented Jun 14, 2024

In qwen2's config.json, both "bos_token_id" and "eos_token_id" are 151643 (<|endoftext|>), and in the tokenizer config, "bos_token" is null.
However, in dolphin qwen2's and tess qwen2's config.json there is no bos_token_id at all. So for these two models, bos_token_id should be set to 151643 instead of None.
After reviewing the chat templates in those 3 models, I don't understand why qwen2 uses eos_token_id 151643 instead of 151645 (<|im_end|>), and bos_token null instead of 151643.
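
As a quick illustration, those values can be read back through transformers (a minimal sketch; requires access to the Hub):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-72B-Instruct")
    print(tok.bos_token)  # None: tokenizer_config.json sets "bos_token": null
    print(tok.eos_token)  # "<|im_end|>" in the instruct tokenizer config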

@Ph0rk0z
Copy link
Contributor Author

Ph0rk0z commented Jun 14, 2024

Here is an example model link: https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf/tree/main

be set to 151643 instead of None

I'm not sure I agree with setting one. If we set this as the BOS token, the UI will add it to your prompts. Was the model actually trained that way, or did the Qwen team just fill out their configs in a lazy manner? Were the fine-tuned models trained that way? The same way for all of them?

And what does regular llama.cpp do? Is that token 11 really null, or is it a comma? What actually gets sent to the model? Smells like ambiguity.
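
Token 11 is easy to check directly; a minimal sketch with llama-cpp-python (the model path is hypothetical):

    from llama_cpp import Llama

    # vocab_only loads just the tokenizer, not the weights
    llm = Llama(model_path="dolphin-2.9.2-qwen2-72b.Q4_K_M.gguf", vocab_only=True)
    print(llm.detokenize([11]))                   # what token 11 actually decodes to
    print(llm.tokenize(b"hi", add_bos=True)[:3])  # whether anything gets prepended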

@Touch-Night
Copy link
Contributor

The finetunes use the same chat template as qwen2.

@Ph0rk0z
Copy link
Contributor Author

Ph0rk0z commented Jun 14, 2024

Then they probably tune with no BOS token: BOS is null and isn't put in the template.

@Touch-Night
Copy link
Contributor

The config in the instruction-tuned Qwen2 model is

"bos_token_id": 151643,
"eos_token_id": 151645,

though

@oobabooga
Copy link
Owner

I see that indeed qwen-2-72b doesn't have the bos token defined:

https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/tokenizer_config.json#L30

In this case, I think it can be set to "" if missing.
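
Roughly, the guard in get_model_metadata would then look like this (a sketch of the fallback, not the exact merged diff):

    # Only index into the token list when the id is actually present
    # in the GGUF metadata; otherwise fall back to an empty string.
    if 'tokenizer.ggml.bos_token_id' in metadata:
        bos_token = metadata['tokenizer.ggml.tokens'][metadata['tokenizer.ggml.bos_token_id']]
    else:
        bos_token = ""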

@oobabooga oobabooga merged commit 1576227 into oobabooga:dev Jun 14, 2024
@Touch-Night
Copy link
Contributor

I still think this should be fixed by the model creator in the upstream, not here.

@Ph0rk0z
Copy link
Contributor Author

Ph0rk0z commented Jun 15, 2024

The model creator can't fix it. Qwen has no BOS token.

@Touch-Night
Copy link
Contributor

Yes... indeed...

@Ph0rk0z Ph0rk0z deleted the patch-2 branch August 10, 2024 14:50
PoetOnTheRun pushed a commit to PoetOnTheRun/text-generation-webui that referenced this pull request Oct 22, 2024