Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Small Refactor of Chat Template registration for shareGPT #1903

alpayariyak · 2024-09-07T01:40:59Z

Description

Add Llama 3.1 Instruct Chat template, which mainly differs in these ways:
- it always has system prompt regardless of whether the conversation does
- In the system message there's always a knowledge cutoff and today's date, user system prompt content is added after it
To use it, specify chat_template: llama31 in the config.
Ensure correctness of Llama 3 Chat template by removing the default assignment You are a helpful assistant system message.
Refactor some redundancy out

Motivation and Context

Important for those who want to fine-tune the instruct versions, as well as anyone who just wants to have the same prompt template as instruct versions, even if they are fine-tuning from base.

How has this been tested?

Preprocessing combinations of the following 2 variables in a multi-turn setting:

using llama3 vs llama31
dataset with system prompts vs without
With the debug flag, making sure that the loss mask is correct and that the tokens match using tokenizer.apply_chat_template

Screenshots (if appropriate)

Example Llama 3.1 tokenized output:

When there is a system message in the conversation
When there isn't

winglian · 2024-09-10T14:40:28Z

@alpayariyak pushed up some fixes to this branch. should probably re-test the functionality

alpayariyak · 2024-09-11T00:59:30Z

Just re-tested, looks good!

alpayariyak · 2024-09-27T18:30:48Z

There's a weird bug - if assistant content is only 1 token, it's masked out

…ors for Chat Templates

alpayariyak and others added 3 commits October 14, 2024 15:42

Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Refact…

ff8db56

…ors for Chat Templates

lint, fix templating, default version

ab41470

ensure the default is llama3

1f5cd4e

winglian force-pushed the llama31_chat branch from 9bf8a20 to 1f5cd4e Compare October 14, 2024 19:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Small Refactor of Chat Template registration for shareGPT #1903

Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Small Refactor of Chat Template registration for shareGPT #1903

alpayariyak commented Sep 7, 2024

winglian commented Sep 10, 2024

alpayariyak commented Sep 11, 2024

alpayariyak commented Sep 27, 2024

Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Small Refactor of Chat Template registration for shareGPT #1903

Are you sure you want to change the base?

Add Llama 3.1 Instruct Chat template, Ensure Correctness of 3, Small Refactor of Chat Template registration for shareGPT #1903

Conversation

alpayariyak commented Sep 7, 2024

Description

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

winglian commented Sep 10, 2024

alpayariyak commented Sep 11, 2024

alpayariyak commented Sep 27, 2024