Nezha Pytorch implementation #17776
Conversation
…into vqa_pipeline
Co-authored-by: NielsRogge <[email protected]>
The documentation is not available anymore as the PR was closed or merged.
Ready for review! I'll upload the rest of the pre-trained models later today.
Very nice addition! This PR is in great shape, I just have two comments:

- Remove the capital Z in all `NeZhaXxx` classes (so `NezhaConfig`, `NezhaModel`, etc.).
- Make sure the classes that are perfect duplicates of BERT classes have a `# Copied from` statement like this one for RoBERTa.

Also, if the model's default tokenizer is `BertTokenizer`, consider adding a mapping from the model config to the BERT tokenizers in `tokenization_auto.py`.
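For context, a `# Copied from` statement marks a class as an exact copy of another, and the repo consistency check can apply a name substitution when verifying it. A rough sketch of what one of the renamed, BERT-derived classes could look like (the class body shown is BERT's; treat it as illustrative rather than the PR's exact code):

```python
import torch.nn as nn

# Copied from transformers.models.bert.modeling_bert.BertSelfOutput with Bert->Nezha
class NezhaSelfOutput(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.LayerNorm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)

    def forward(self, hidden_states, input_tensor):
        hidden_states = self.dense(hidden_states)
        hidden_states = self.dropout(hidden_states)
        hidden_states = self.LayerNorm(hidden_states + input_tensor)
        return hidden_states
```

And the `tokenization_auto.py` mapping could look roughly like this, reusing the BERT tokenizers (a hypothetical entry, assuming the config's model type is registered as `nezha`):

```python
# Hypothetical entry in the tokenizer mapping, pointing NEZHA at the BERT tokenizers
("nezha", ("BertTokenizer", "BertTokenizerFast" if is_tokenizers_available() else None)),
```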
# [to be uploaded] "sijunhe/nezha-large-wwm",
# [to be uploaded] "sijunhe/nezha-cn-large",
Flagging those so we don't forget to wait for them to be uploaded before merging :-)
Addressed all the comments from @sgugger and uploaded the two remaining models. Ready for a final round of review.
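With the checkpoints up, a minimal smoke test might look like this (a sketch assuming the final class names from the review above and that the NEZHA checkpoints pair with the stock BERT tokenizer):

```python
import torch
from transformers import BertTokenizer, NezhaModel

# NEZHA reuses the BERT tokenizer, so no dedicated NezhaTokenizer is needed
tokenizer = BertTokenizer.from_pretrained("sijunhe/nezha-cn-large")
model = NezhaModel.from_pretrained("sijunhe/nezha-cn-large")

inputs = tokenizer("我爱北京天安门", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, seq_len, 1024]) for a large model
```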
Great work! The PR looks more or less ready to be merged already. Left some nits, and it'd be nice to add as many `# Copied from` statements from BERT as we can.
Thanks again for your contribution!
* wip
* rebase
* all tests pass
* rebase
* ready for PR
* address comments
* fix styles
* add require_torch to pipeline test
* remove remote image to improve CI consistency
* address comments; fix tf/flax tests
* address comments; fix tf/flax tests
* fix tests; add alias
* repo consistency tests
* Update src/transformers/pipelines/visual_question_answering.py
  Co-authored-by: NielsRogge <[email protected]>
* address comments
* Update src/transformers/pipelines/visual_question_answering.py
  Co-authored-by: NielsRogge <[email protected]>
* merge
* wip
* wip
* wip
* most basic tests passes
* all tests pass now
* relative embedding
* wip
* running make fixup
* remove bert changes
* fix doc
* fix doc
* fix issues
* fix doc
* address comments
* fix CI
* remove redundant copied from
* address comments
* fix broken test

Co-authored-by: Sijun He <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
What does this PR do?
This PR adds a PyTorch implementation of the NEZHA model to Transformers. NEZHA was introduced by Huawei Noah's Ark Lab in late 2019 and is widely used in the Chinese NLP community. This implementation is based on the official PyTorch implementation of NEZHA and the current BERT PyTorch implementation in this repository. The model checkpoints also come from the official implementation.
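NEZHA's main architectural departure from BERT is functional relative positional encoding: attention incorporates fixed sinusoidal embeddings of the clipped relative distance between token positions instead of learned absolute position embeddings. A minimal sketch of how such a table can be built (the function name and the clipping distance of 64 are illustrative, not the PR's exact code):

```python
import math
import torch

def relative_position_table(length: int, depth: int, max_distance: int = 64) -> torch.Tensor:
    """Sinusoidal embeddings of the clipped relative distance (j - i) between positions."""
    positions = torch.arange(length)
    distance = positions[None, :] - positions[:, None]  # (length, length), entry [i, j] = j - i
    distance = distance.clamp(-max_distance, max_distance) + max_distance  # shift into [0, 2*max_distance]

    vocab = 2 * max_distance + 1
    table = torch.zeros(vocab, depth)
    pos = torch.arange(vocab, dtype=torch.float).unsqueeze(1)
    div = torch.exp(torch.arange(0, depth, 2, dtype=torch.float) * -(math.log(10000.0) / depth))
    table[:, 0::2] = torch.sin(pos * div)  # depth is assumed even here
    table[:, 1::2] = torch.cos(pos * div)
    return table[distance]  # (length, length, depth)
```

Each attention layer can then add scores derived from this table to the usual query-key scores, which is what lets relative-position models generalize to sequence positions beyond those seen in pretraining.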
Fixes # (issue)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
Since the model is quite similar to BERT, maybe @LysandreJik?