Improve encoder decoder model docs #17815
Conversation
The documentation is not available anymore as the PR was closed or merged.
>>> model = EncoderDecoderModel(config=config)
```
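The `config`-based initialization shown in the snippet above can be exercised end to end with tiny, randomly initialized sub-model configs. This is a sketch, not from the original docs: the small sizes and `vocab_size=100` are illustrative assumptions chosen so the example runs quickly without downloading any pretrained weights.

```python
# Sketch: build an EncoderDecoderModel purely from configs -- no pretrained
# weights are downloaded, and every parameter is randomly initialized.
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Deliberately tiny sizes (assumption, for speed); real models are far larger.
encoder_config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)
decoder_config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)

# from_encoder_decoder_configs marks the decoder config as a decoder
# and enables cross-attention automatically.
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
model = EncoderDecoderModel(config=config)
print(model.config.is_encoder_decoder)
```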
## Initialising [`EncoderDecoderModel`] from a pretrained encoder and a pretrained decoder.
Above you use initializing, here initialising.
Thanks for improving this! Would be great to also improve the docs of VisionEncoderDecoderModel and SpeechEncoderDecoderModel.
>>> # the forward function automatically creates the correct decoder_input_ids
>>> loss = model(input_ids=input_ids, labels=labels).loss
```
Detailed [colab](https://colab.research.google.com/drive/1WIk2bxglElfZewOHboPFNj8H44_VAyKE?usp=sharing#scrollTo=ZwQIEhKOrJpl) for training.
This notebook might already be outdated, cc @patrickvonplaten.
Also cc @ydshieh as we were planning on writing a blog post about them.
@NielsRogge should I remove the Colab link for now?
Think ok to leave it for now :-)
Thank you for the improvement, @Threepointone4!
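The behavior the snippet above documents (the forward pass builds `decoder_input_ids` from `labels` and returns a loss directly) can also be sketched self-contained, without the notebook. The tiny config, the random token ids, and the choice of token id 0 for `decoder_start_token_id`/`pad_token_id` are all illustrative assumptions, not part of the original docs.

```python
# Sketch: a forward pass with labels computes the LM loss automatically.
import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

tiny = dict(vocab_size=100, hidden_size=32, num_hidden_layers=2,
            num_attention_heads=2, intermediate_size=64)  # assumed sizes
config = EncoderDecoderConfig.from_encoder_decoder_configs(
    BertConfig(**tiny), BertConfig(**tiny)
)
model = EncoderDecoderModel(config=config)

# These ids must be set so the model can shift labels into decoder_input_ids.
model.config.decoder_start_token_id = 0  # assumption: reuse id 0
model.config.pad_token_id = 0

input_ids = torch.randint(1, 100, (2, 8))  # random "token" ids, batch of 2
labels = torch.randint(1, 100, (2, 6))

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
print(loss.requires_grad)  # ready for loss.backward()
```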
## Initialising [`EncoderDecoderModel`] from a pretrained encoder and a pretrained decoder.

[`EncoderDecoderModel`] can be initialized from a pretrained encoder checkpoint and a pretrained decoder checkpoint. Note that any pretrained auto-encoding model, *e.g.* BERT, can serve as the encoder and both pretrained auto-encoding models, *e.g.* BERT, pretrained causal language models, *e.g.* GPT2, as well as the pretrained decoder part of sequence-to-sequence models, *e.g.* decoder of BART, can be used as the decoder.
Depending on which architecture you choose as the decoder, the cross-attention layers might be randomly initialized.
I would be more careful and say the auto-encoding models that provide a causal LM implementation.
Also the sentence is super long, it might be a good idea to split it.
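The point under review, that the decoder's cross-attention layers are added and randomly initialized, can be illustrated with a config-only sketch. The tiny sizes below are assumptions for speed, and the `crossattention` attribute path is specific to a BERT decoder.

```python
# Sketch: enabling cross-attention in the decoder config adds a randomly
# initialized cross-attention module to each decoder transformer layer.
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

tiny = dict(vocab_size=100, hidden_size=32, num_hidden_layers=2,
            num_attention_heads=2, intermediate_size=64)  # assumed sizes
config = EncoderDecoderConfig.from_encoder_decoder_configs(
    BertConfig(**tiny), BertConfig(**tiny)
)
model = EncoderDecoderModel(config=config)

# For a BERT decoder, each transformer layer now carries a crossattention
# module; its weights start out randomly initialized.
first_layer = model.decoder.bert.encoder.layer[0]
print(hasattr(first_layer, "crossattention"))
```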
... return_tensors="pt",
... ).input_ids

>>> labels = tokenizer(
(nit) it might be a good idea to add `>>> # Let's use a summary of the above text as the target`
Agree with the reviews of @ydshieh and @NielsRogge! @Threepointone4 do you want to apply them? Think we can merge after :-)
Great job @Threepointone4! Merging :-)
* Copied all the changes from the last PR
* added in documentation_tests.txt
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: Yih-Dar <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)
* Update docs/source/en/model_doc/encoder-decoder.mdx (Co-authored-by: NielsRogge <[email protected]>)

Co-authored-by: vishwaspai <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
What does this PR do?
This PR improves the documentation of the encoder-decoder model.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten