fix: mistral nemo does not recognize token_type_ids in forward #2233

Merged
merged 1 commit into from
Jan 9, 2025

NanoCode012
Collaborator

Description

Closes #2225

The issue does not appear for Mistral 7B v0.3, even though both models use the same source model type, mistral.

Mistral Nemo uses PreTrainedTokenizerFast as its tokenizer class, which returns token_type_ids (https://github.com/huggingface/transformers/blob/42865860ec6dc135972d9555753cb7ee17f51fb4/src/transformers/tokenization_utils_base.py#L1397), whereas Mistral 7B v0.3 uses LlamaTokenizer, which does not (https://github.com/huggingface/transformers/blob/42865860ec6dc135972d9555753cb7ee17f51fb4/src/transformers/models/llama/tokenization_llama.py#L128).

A more future-proof method could be to follow LlamaFactory, which checks the model's .forward signature and drops token_type_ids if the parameter is not found.
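A minimal sketch of that signature-check idea, for illustration only. This is not the code merged in this PR and not LlamaFactory's implementation; the helper name drop_unsupported_keys and the two stand-in model classes are hypothetical. It assumes that a forward method accepting **kwargs should keep the key, since passing it would not raise the "unexpected keyword argument" error at that level.

```python
import inspect


def drop_unsupported_keys(forward_fn, batch, keys=("token_type_ids",)):
    """Return a copy of `batch` without any of `keys` that `forward_fn` cannot accept.

    Hypothetical helper: only the entries named in `keys` are ever dropped.
    """
    params = inspect.signature(forward_fn).parameters
    # Assumption: a forward that takes **kwargs will not raise on extra keys.
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_var_kwargs:
        return dict(batch)
    return {k: v for k, v in batch.items() if k not in keys or k in params}


# Stand-in classes (hypothetical) that mimic the two forward signatures.
class ModelWithoutTokenTypeIds:
    def forward(self, input_ids, attention_mask=None, labels=None):
        return input_ids


class ModelWithTokenTypeIds:
    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        return input_ids


batch = {"input_ids": [1, 2], "attention_mask": [1, 1], "token_type_ids": [0, 0]}

# token_type_ids is removed only for the model whose forward lacks that parameter.
filtered = drop_unsupported_keys(ModelWithoutTokenTypeIds().forward, batch)
unchanged = drop_unsupported_keys(ModelWithTokenTypeIds().forward, batch)
```

Compared with hard-coding a per-model exception, a check like this would adapt to any model and tokenizer pairing, at the cost of relying on the forward signature being explicit.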

How has this been tested?

Confirmed that this fixes Mistral Nemo with packing enabled.

In limited testing, the issue did not appear without packing.

@winglian winglian added this pull request to the merge queue Jan 9, 2025
Merged via the queue into main with commit 2e8d7c1 Jan 9, 2025
11 checks passed
Development

Successfully merging this pull request may close these issues.

Mistral Nemo 12B Completion Training Fails from unexpected keyword argument 'token_type_ids'