[Lazy init] Force fall back to slow init for composite models #11705

patrickvonplaten · 2021-05-12T14:20:22Z

What does this PR do?

Thanks to the great issue #11704, it was discovered that fast initialization currently breaks for all models whose XXXPreTrainedModel does not implement an _init_weights function and for which some of the weights are missing when calling .from_pretrained(...). This includes essentially all composite models, namely Rag and EncoderDecoder.

This PR applies the simple fix of forcing those models to fall back on _slow_init, since a better fix requires a careful redesign, which is left for a future PR.
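To illustrate the failure mode and the fallback described above, here is a minimal toy sketch in plain Python. It does not use the transformers API: `ToyPreTrainedModel`, `ToyCompositeModel`, and the `fast_init` and `init_on_construct` parameters are hypothetical names chosen for illustration, and weights are modelled as plain floats (with `None` standing for "never initialized").

```python
class ToyPreTrainedModel:
    """Hypothetical stand-in for a pretrained-model base class."""

    def __init__(self, keys, init_on_construct=True):
        # Slow init: every weight gets a default value at construction time.
        # Fast init: that step is skipped, so weights start out as None.
        self.weights = {k: (0.0 if init_on_construct else None) for k in keys}

    def _init_weights(self, keys):
        # No-op by default, mimicking a class that does not implement it.
        pass

    @classmethod
    def from_pretrained(cls, keys, checkpoint, fast_init=True):
        model = cls(keys, init_on_construct=not fast_init)
        missing = [k for k in keys if k not in checkpoint]
        if fast_init:
            # The fast path relies on _init_weights to fill in missing keys.
            model._init_weights(missing)
        for k in keys:
            if k in checkpoint:
                model.weights[k] = checkpoint[k]
        return model


class ToyCompositeModel(ToyPreTrainedModel):
    """Composite model that has no _init_weights of its own."""

    @classmethod
    def from_pretrained(cls, keys, checkpoint, fast_init=True):
        # Hotfix idea: ignore the caller's request and always use the slow path.
        return super().from_pretrained(keys, checkpoint, fast_init=False)


keys = ["encoder.w", "decoder.w", "decoder.cross_attn.w"]
checkpoint = {"encoder.w": 1.0, "decoder.w": 2.0}  # cross-attention weight missing

broken = ToyPreTrainedModel.from_pretrained(keys, checkpoint, fast_init=True)
fixed = ToyCompositeModel.from_pretrained(keys, checkpoint, fast_init=True)
```

In this sketch, `broken` ends up with an uninitialized `decoder.cross_attn.w`, while `fixed` has a default value for it because the slow path initialized every weight before the checkpoint was loaded.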

Future PR

Remove hacky from_pretrained(...) methods in RAG and EncoderDecoder
Refactor the way "fast_init" calls model._init_weights for composite models. For composite models, _init_weights has to be called on each part directly =>

model.encoder._init_weights(all_missing_keys_of_encoder)
model.decoder._init_weights(all_missing_keys_of_decoder)

Add more tests for RAG & EncoderDecoder
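One possible shape for the per-part dispatch mentioned in the list above, as a toy sketch only. It follows the pseudocode's convention of passing missing key names to `_init_weights`; the helper `split_missing_keys`, the classes `ToyPart` and `ToyComposite`, and the method `init_missing` are hypothetical and are not part of transformers.

```python
def split_missing_keys(missing_keys, prefixes):
    """Group missing keys by sub-model prefix and strip that prefix."""
    grouped = {prefix: [] for prefix in prefixes}
    for key in missing_keys:
        for prefix in prefixes:
            if key.startswith(prefix + "."):
                grouped[prefix].append(key[len(prefix) + 1:])
                break
    return grouped


class ToyPart:
    """Hypothetical sub-model that records which keys it was asked to initialize."""

    def __init__(self):
        self.initialized = []

    def _init_weights(self, keys):
        self.initialized.extend(keys)


class ToyComposite:
    """Hypothetical composite model made of an encoder and a decoder."""

    def __init__(self):
        self.encoder = ToyPart()
        self.decoder = ToyPart()

    def init_missing(self, missing_keys):
        # Each part only sees its own missing keys, without the prefix.
        grouped = split_missing_keys(missing_keys, ["encoder", "decoder"])
        for name, keys in grouped.items():
            getattr(self, name)._init_weights(keys)


model = ToyComposite()
model.init_missing(["encoder.layer.0.w", "decoder.cross_attn.w", "decoder.lm_head.w"])
```

The design choice here is that the composite owns no initialization scheme itself; it only routes keys to the part that knows how to initialize them.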

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

LysandreJik

Eventually would be nice to add a test for this, as this is an issue that should have been caught by the tests!

Thanks for the hotfix.

src/transformers/models/encoder_decoder/modeling_encoder_decoder.py

src/transformers/models/rag/modeling_rag.py

sgugger

LGTM for a hotfix.

…gface#11705)

* fix encoder-decoder & RAG
* finalize
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
  Co-authored-by: Lysandre Debut <[email protected]>
* Update src/transformers/models/rag/modeling_rag.py
  Co-authored-by: Lysandre Debut <[email protected]>

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Lysandre Debut <[email protected]>

Patrick von Platen added 3 commits May 12, 2021 13:56

fix encoder-decoder & RAG

39f6aa1

Merge branch 'master' of https://github.com/huggingface/transformers …

2d461c5

…into fix_init

finalize

549b2e4

patrickvonplaten linked an issue May 12, 2021 that may be closed by this pull request

[RAG] official facebook example code for RAG is not working anymore. #11704

Closed

patrickvonplaten requested review from LysandreJik and sgugger May 12, 2021 14:28

LysandreJik approved these changes May 12, 2021

sgugger approved these changes May 12, 2021

patrickvonplaten and others added 2 commits May 12, 2021 15:34

Update src/transformers/models/encoder_decoder/modeling_encoder_decod…

a4b586c

…er.py Co-authored-by: Lysandre Debut <[email protected]>

Update src/transformers/models/rag/modeling_rag.py

6ac7f5f

Co-authored-by: Lysandre Debut <[email protected]>

LysandreJik merged commit fd6204b into huggingface:master May 12, 2021
