Question about chat templates #16
Comments
Because when we serve the Bradley-Terry RM with the pipeline, it automatically adds a bos_token when tokenizing. For the pair-wise preference model, it is because we trained the model without a bos_token (this was indeed an issue with llama3 at the time). But the influence of the bos token is generally mild.
Thank you for answering! I will further check the outputs after tokenization.
I don't fully understand... if the inference-time pipeline adds the bos_token automatically, doesn't that mean we should train with the bos token?
Yes, you are correct. Unfortunately, when we trained the model there was a bug in the llama3 tokenizer, so the model was trained WITHOUT the bos token. We have tested with and without bos; it can lead to a ~1% difference in RewardBench accuracy. You may modify the tokenizer to prevent it from adding a bos token automatically to fix the issue, I guess...
Thanks for the reply! Just to clarify: if I modify the tokenizer, will the inference pipeline then match the off-the-shelf models you already released, which were trained without BOS?
We get a bos token when we tokenize by applying the chat template. Then, inside the pipeline, we get another bos token. If you remove `.replace(tokenizer.bos_token, "")`, you still get one bos token inside the pipeline. If you do not remove `.replace(tokenizer.bos_token, "")`, you will get two bos tokens. If we modify the tokenizer so that it does not add a bos token, then we never get a bos token, which matches the training (no bos token).
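The BOS bookkeeping above can be sketched with a toy simulation. This is not the repo's or transformers' actual code: `apply_chat_template` and `pipeline_tokenize` are illustrative stand-ins that only mimic the "template prepends BOS, pipeline prepends BOS again" behavior being described, with `<|begin_of_text|>` standing in for llama3's BOS token.

```python
# Toy simulation of the three BOS cases described above (illustrative names only).
BOS = "<|begin_of_text|>"

def apply_chat_template(msg: str) -> str:
    # A Llama-3-style chat template prepends BOS itself.
    return BOS + " " + msg

def pipeline_tokenize(text: str, add_bos: bool = True) -> list[str]:
    # The pipeline's tokenizer also prepends BOS by default.
    return ([BOS] if add_bos else []) + text.split()

msg = "hello"

# Keep .replace(...): strip the template's BOS, the pipeline re-adds one -> 1 BOS.
one = pipeline_tokenize(apply_chat_template(msg).replace(BOS, ""))
print(one.count(BOS))  # 1

# Drop .replace(...): template BOS + pipeline BOS -> 2 BOS tokens.
two = pipeline_tokenize(apply_chat_template(msg))
print(two.count(BOS))  # 2

# Modify the tokenizer so it never adds BOS (keeping .replace) -> 0 BOS,
# matching how the model was trained.
zero = pipeline_tokenize(apply_chat_template(msg).replace(BOS, ""), add_bos=False)
print(zero.count(BOS))  # 0
```

So the released models expect the zero-BOS setup: the `.replace` strips the template's BOS, and the only remaining question is whether the pipeline's tokenizer adds one back.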
Nice work! Starred already.
Sorry for asking, but why replace the bos_token with an empty string?