
Fix jetmoe model #31279

Merged
merged 2 commits into huggingface:main from fix-moe on Jun 7, 2024

Conversation

@Cyrilvallez Cyrilvallez (Member) commented Jun 6, 2024

What does this PR do?

Fixes #31266
cc @ArthurZucker @amyeroberts

I am sorry, I did not notice that JetMoePreTrainedModel had gained _supports_cache_class = True while working on #30536.

To avoid the same bug, any model that gets _supports_cache_class = True in the future should change the line:

if inputs_embeds is not None and past_key_values is None:
    model_inputs = {"inputs_embeds": inputs_embeds}

to

if inputs_embeds is not None and past_length == 0:
    model_inputs = {"inputs_embeds": inputs_embeds}

in prepare_inputs_for_generation(), because checking past_key_values is None is no longer correct: an empty but initialized DynamicCache can now be passed.
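
For illustration, a minimal sketch of how past_length could be derived near the top of prepare_inputs_for_generation() so that the check above handles both legacy tuple caches and Cache objects (variable names follow the snippets above; the actual transformers code differs in its details):

# Compute how many tokens the cache has already seen.
past_length = 0
if past_key_values is not None:
    if isinstance(past_key_values, Cache):
        # An empty but initialized DynamicCache reports a seen length of 0.
        past_length = past_key_values.get_seq_length()
    else:
        # Legacy tuple cache: key states have shape
        # (batch, num_heads, seq_len, head_dim).
        past_length = past_key_values[0][0].shape[2]

# inputs_embeds is only forwarded on the first generation step.
if inputs_embeds is not None and past_length == 0:
    model_inputs = {"inputs_embeds": inputs_embeds}
else:
    model_inputs = {"input_ids": input_ids}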

Also, to minimize code complexity, the if-else that checks isinstance(past_key_values, Cache) (still in prepare_inputs_for_generation()) can be safely removed in favor of the True branch: when _supports_cache_class = True, generate() will only pass proper Cache classes. Similar if-else blocks could also be removed in other model architecture classes, as sketched below.
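
Concretely, that simplification would look roughly like this (a sketch based on the snippets above, not the exact transformers code):

# Before: branch on the cache type.
if isinstance(past_key_values, Cache):
    past_length = past_key_values.get_seq_length()
else:
    past_length = past_key_values[0][0].shape[2]

# After: with _supports_cache_class = True, generate() always passes a
# Cache instance, so the legacy tuple branch can be dropped.
past_length = past_key_values.get_seq_length() if past_key_values is not None else 0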

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts amyeroberts (Collaborator) left a comment

Thanks for fixing, @Cyrilvallez!

Could you also remove the skips from the model tests which were added in #31266 as a patch?

I think the changes look OK, but let's have a second opinion from @ArthurZucker, as I'm not very familiar with the recent cache changes.

@ArthurZucker ArthurZucker (Collaborator) left a comment

We merged this for other models as well, so LGTM.

@ArthurZucker ArthurZucker merged commit 8bcf9c8 into huggingface:main Jun 7, 2024
20 checks passed
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024
* Fix jetmoe model

* Remove skip-tests
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 14, 2024
* Fix jetmoe model

* Remove skip-tests
itazap pushed a commit that referenced this pull request Jun 17, 2024
* Fix jetmoe model

* Remove skip-tests
itazap pushed a commit that referenced this pull request Jun 17, 2024
* Fix jetmoe model

* Remove skip-tests
itazap pushed a commit that referenced this pull request Jun 17, 2024
* Fix jetmoe model

* Remove skip-tests
@Cyrilvallez Cyrilvallez deleted the fix-moe branch August 30, 2024 08:37