
Fix missing bos token #6050

Merged: 2 commits merged on May 27, 2024
Conversation

belladoreai (Contributor) commented May 24, 2024

Expected behavior

When the UI checkbox to add the bos token is checked, the user expects the bos token to be added.

Actual behavior

When the UI checkbox to add the bos token is checked, some models add the bos token and some don't. Whether the bos token gets added depends on the (mis)configuration of the model's tokenizer config file.

Scope

Affects both UI users and API users (the API case was tested with the API parameter to add the bos token rather than the UI checkbox).

Affects at least the model loaders that use the Hugging Face transformers tokenizer (e.g. ExllamaV2_HF).

I checked various models I had on disk, and about half of the Llama 3 models were affected by this issue. I also tried some Llama 2 models, and none of them were affected.

A few examples of affected models:

I don't know if it makes a difference, but I didn't use the downloader script to download the models; I downloaded them manually.

Cause

My understanding is that Meta distributed some misconfigured tokenizer.json files and silently updated them later. Copies of the misconfigured tokenizer.json files are now circulating among fine-tuners and quantizers.

How are the tokenizer.json files misconfigured? They define the bos token in some places, but not in all the places where it is needed. I don't know exactly what these files are supposed to contain, but I looked at one Llama 3 tokenizer file that appeared to work correctly, and it had the following definition, which was missing from the misconfigured tokenizer files:

"special_tokens": {
          "<|begin_of_text|>": {
            "id": "<|begin_of_text|>",
            "ids": [
              128000
            ],
            "tokens": [
              "<|begin_of_text|>"
            ]
          }
        }

The way the bos token is implemented in TGW is that the transformers tokenizer is called with add_special_tokens=True, so if the bos token is not defined under special_tokens, it won't be added.
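
For illustration, here is a minimal way to check whether a given tokenizer actually prepends the bos token when asked. The model path is a placeholder, not a specific affected model:

from transformers import AutoTokenizer

# Hypothetical local model path; substitute any model you want to check.
tokenizer = AutoTokenizer.from_pretrained("models/some-llama-3-finetune")

ids = tokenizer.encode("Hello", add_special_tokens=True)

# With a correctly configured tokenizer.json, the first id is the bos token.
if tokenizer.bos_token_id is not None and ids and ids[0] == tokenizer.bos_token_id:
    print("bos token was added")
else:
    print("bos token is missing despite add_special_tokens=True")

With an affected model, the second branch is what prints.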

Do we need a fix in TGW if it's a model issue?

We need a fix in TGW because we have a UI checkbox and an API parameter that set user expectations.

Also, the quality of many models is severely degraded in TGW if this is not fixed (even if it is technically somebody else's fault).

How to fix this

After the transformers tokenizer encodes the prompt, we check whether the bos token was added and manually prepend it if it was not.
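
A minimal sketch of that check (illustrative only; ensure_bos is a made-up helper name, and the exact code in this PR may differ):

import torch

def ensure_bos(input_ids: torch.Tensor, tokenizer) -> torch.Tensor:
    # input_ids has shape (1, seq_len), as returned by the tokenizer with
    # return_tensors='pt'. If it does not start with the bos token, prepend it.
    if input_ids.shape[-1] == 0 or input_ids[0][0] != tokenizer.bos_token_id:
        bos_tensor = torch.tensor([[tokenizer.bos_token_id]])
        input_ids = torch.cat((bos_tensor, input_ids), dim=1)
    return input_ids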


Ph0rk0z (Contributor) commented May 25, 2024

Oh shit... is this why this was happening to me? I also observed it in some models and not others, and resorted to running in verbose mode to double-check. The missing BOS made me think verbose mode wasn't outputting the entire prompt when enabled.

When I manually added the BOS token, it would often get removed by the code above your PR.

oobabooga (Owner) commented

That's a very important fix, thank you.

oobabooga merged commit a363cdf into oobabooga:dev on May 27, 2024
TheLounger (Contributor) commented May 28, 2024

This breaks everything in some situations (e.g. using alfred-40B-1023-GGUF with llamacpp_HF); checkbox on or off, it doesn't matter. I'm sure it's something small...

Traceback (most recent call last):
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/queueing.py", line 566, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1786, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1350, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 583, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 576, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 559, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 742, in gen_wrapper
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/modules/chat.py", line 406, in generate_chat_reply_wrapper
    for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True, for_ui=True)):
  File "/home/lounger/ai/text/webui/modules/chat.py", line 374, in generate_chat_reply
    for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message, for_ui=for_ui):
  File "/home/lounger/ai/text/webui/modules/chat.py", line 318, in chatbot_wrapper
    prompt = generate_chat_prompt(text, state, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/modules/chat.py", line 187, in generate_chat_prompt
    encoded_length = get_encoded_length(prompt)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/modules/text_generation.py", line 189, in get_encoded_length
    return len(encode(prompt)[0])
               ^^^^^^^^^^^^^^
  File "/home/lounger/ai/text/webui/modules/text_generation.py", line 146, in encode
    bos_tensor = torch.tensor([[shared.tokenizer.bos_token_id]])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Could not infer dtype of NoneType

belladoreai (Contributor, Author) commented

Thanks for reporting, @TheLounger!

Based on your error log it seems that there are situations where bos_token_id exists but is None? I added a quick fix here: #6061
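
The traceback suggests the new code builds a BOS tensor even when the tokenizer has no bos token id, so the guard presumably just skips that step. A sketch under that assumption (not the actual diff from #6061):

import torch

def ensure_bos(input_ids: torch.Tensor, tokenizer) -> torch.Tensor:
    # Skip BOS handling entirely when the tokenizer defines no bos token id;
    # torch.tensor([[None]]) is what raises "Could not infer dtype of NoneType".
    if tokenizer.bos_token_id is None:
        return input_ids
    if input_ids.shape[-1] == 0 or input_ids[0][0] != tokenizer.bos_token_id:
        bos_tensor = torch.tensor([[tokenizer.bos_token_id]])
        input_ids = torch.cat((bos_tensor, input_ids), dim=1)
    return input_ids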

PoetOnTheRun pushed a commit to PoetOnTheRun/text-generation-webui that referenced this pull request Oct 22, 2024