
question of chat templates #16

Open
trueRosun opened this issue Jun 13, 2024 · 6 comments

Comments

@trueRosun

Nice work! Starred already.
Sorry for asking, but why replace the bos_token with an empty string?

sample['positive'] = tokenizer.apply_chat_template(
    sample['chosen'], tokenize=False, add_generation_prompt=False
).replace(tokenizer.bos_token, "")
sample['negative'] = tokenizer.apply_chat_template(
    sample['rejected'], tokenize=False, add_generation_prompt=False
).replace(tokenizer.bos_token, "")
@WeiXiongUST
Collaborator

Because when we serve the Bradley-Terry RM with the pipeline, the pipeline automatically adds a bos_token when tokenizing.

For the pair-wise preference model, it is because we trained the model without a bos_token (this was indeed an issue with llama3 at the time). But the influence of the bos token is generally mild.
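
For concreteness, here is a minimal sketch (mine, not from this repo) of the double-bos behavior described above, assuming a Llama-3-style tokenizer whose chat template embeds the bos token in the rendered string:

# Minimal sketch of the double-bos issue; the model name is just an example.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [{"role": "user", "content": "hello"}]

# apply_chat_template(tokenize=False) already puts bos at the front of the string.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text.startswith(tokenizer.bos_token))  # True

# Tokenizing that string again (as the serving pipeline does) prepends bos once
# more, so without the .replace you would end up with two leading bos ids.
ids = tokenizer(text).input_ids
print(ids[:2] == [tokenizer.bos_token_id] * 2)  # True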

@trueRosun
Author

Thank you for answering!

I will further check the outputs after tokenization.
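
For reference, one quick way to do that check (my own sketch; tokenizer and sample as in the snippet above):

# Inspect the first few tokens by name, so a duplicated bos such as
# '<|begin_of_text|>' is easy to spot at the front of the sequence.
ids = tokenizer(sample['positive']).input_ids
print(tokenizer.convert_ids_to_tokens(ids[:5]))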

@hunterlang

> Because when we serve the Bradley-Terry RM with the pipeline, the pipeline automatically adds a bos_token when tokenizing.

I don't fully understand...if the inference-time pipeline adds the bos_token automatically, doesn't that mean we should train with the bos token?

@WeiXiongUST
Collaborator

> > Because when we serve the Bradley-Terry RM with the pipeline, the pipeline automatically adds a bos_token when tokenizing.
>
> I don't fully understand... if the inference-time pipeline adds the bos_token automatically, doesn't that mean we should train with the bos token?

Yes, you are correct. Unfortunately, when we trained the model, there was a bug in the llama3 tokenizer, so the model was trained WITHOUT a bos token.

We have tested with and without bos; it can lead to a ~1% difference in RewardBench accuracy. To fix the issue, I guess you could modify the tokenizer to prevent it from adding a bos token automatically...
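
A hedged sketch of that suggestion (whether the flag exists depends on the tokenizer class; for Llama-3 fast tokenizers the bos comes from the post-processor, so opting out per call is the portable route):

# Stop the tokenizer from prepending bos so inference matches a model
# trained without it.
if hasattr(tokenizer, "add_bos_token"):
    # Slow SentencePiece-style tokenizers expose this flag directly.
    tokenizer.add_bos_token = False

# The portable alternative: opt out of special tokens per call.
ids = tokenizer("hello", add_special_tokens=False).input_ids
print(ids[:1] != [tokenizer.bos_token_id])  # True: no bos prepended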

@hunterlang

Thanks for the reply! Just to clarify:

1. If I remove those .replace(tokenizer.bos_token, "") calls, then training should match inference, because the inference pipeline adds BOS automatically.
2. If I modify the tokenizer, then the inference pipeline will match the off-the-shelf models you already released, which were trained without BOS?

@WeiXiongUST
Collaborator

We get a bos token in the string when we apply the chat template. Then, inside the pipeline, we get another bos token when tokenizing.

If you keep .replace(tokenizer.bos_token, "") (stripping the template's bos), you still get the one bos token added inside the pipeline. If you drop .replace(tokenizer.bos_token, ""), you will get two bos tokens.

If we modify the tokenizer so that it does not add a bos token, then we never get a bos token at all, which matches the training (no bos token).
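
Putting the three cases together in one sketch (same assumptions and tokenizer as above):

def leading_bos(ids):
    # Count how many consecutive bos ids start the sequence.
    n = 0
    while n < len(ids) and ids[n] == tokenizer.bos_token_id:
        n += 1
    return n

text = tokenizer.apply_chat_template(messages, tokenize=False)

# Keep the template's bos and let the pipeline add another: two bos tokens.
print(leading_bos(tokenizer(text).input_ids))  # 2
# Strip bos from the string (the .replace above): one bos, from the pipeline.
stripped = text.replace(tokenizer.bos_token, "")
print(leading_bos(tokenizer(stripped).input_ids))  # 1
# Also stop the tokenizer from adding bos: zero, matching the training setup.
print(leading_bos(tokenizer(stripped, add_special_tokens=False).input_ids))  # 0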
