Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nezha Pytorch implementation #17776

Merged
merged 43 commits into from
Jun 23, 2022
Merged

Nezha Pytorch implementation #17776

merged 43 commits into from
Jun 23, 2022

Conversation

sijunhe
Copy link
Contributor

@sijunhe sijunhe commented Jun 19, 2022

What does this PR do?

This PR adds a pytorch implementation of the NEZHA model to transformers. NEZHA was introduced by Huawei Noah's Ark Lab in late 2019 and it is widely used in the Chinese NLP community. This implementation is based on the official pytorch implementation of NEZHA and the current BERT pytorch implementation . The model checkpoints are also from the official implementation.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Since the model is quite similar to bert, maybe @LysandreJik?

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jun 19, 2022

The documentation is not available anymore as the PR was closed or merged.

@sijunhe sijunhe changed the title [WIP] Nezha Pytorch implementation Nezha Pytorch implementation Jun 21, 2022
@sijunhe
Copy link
Contributor Author

sijunhe commented Jun 21, 2022

ready for review! I'll upload the rest of the pre-trained models later today

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice addition! This PR is in great shape, I just have two comments.

  1. Remove the capital Z in all NeZhaXxx classes (so NezhaConfig, NezhaModel etc).
  2. Make sure the classes that are perfect duplicates of BERT classes have a # Copied from statement like this one for RoBERTa.

Also, if the model's default tokenizer is BertTokenizer, consider adding a mapping from the model config to the BERT tokenizers in tokenization_auto.py.

README.md Outdated Show resolved Hide resolved
src/transformers/__init__.py Outdated Show resolved Hide resolved
src/transformers/__init__.py Outdated Show resolved Hide resolved
Comment on lines 63 to 64
# [to be uploaded] "sijunhe/nezha-large-wwm",
# [to be uploaded] "sijunhe/nezha-cn-large",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flagging those so we don't forget to wait for them to be uploaded before merging :-)

src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
tests/models/nezha/test_modeling_nezha.py Outdated Show resolved Hide resolved
@sijunhe sijunhe requested a review from sgugger June 22, 2022 07:05
@sijunhe
Copy link
Contributor Author

sijunhe commented Jun 22, 2022

addressed all the comments from @sgugger and uploaded the two remaining models. Ready for a final round of review.

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! The PR looks more or less ready to be merged already. Left some nits and it'd be nice if we try to add a maximum of # Copied from statements ... from BERT

src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
src/transformers/models/nezha/modeling_nezha.py Outdated Show resolved Hide resolved
@sgugger
Copy link
Collaborator

sgugger commented Jun 23, 2022

Thanks again for your contribution!

@sgugger sgugger merged commit 7cf52a4 into huggingface:main Jun 23, 2022
younesbelkada pushed a commit to younesbelkada/transformers that referenced this pull request Jun 25, 2022
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <[email protected]>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <[email protected]>

* merge

* wip

* wip

* wip

* most basic tests passes

* all tests pass now

* relative embedding

* wip

* running make fixup

* remove bert changes

* fix doc

* fix doc

* fix issues

* fix doc

* address comments

* fix CI

* remove redundant copied from

* address comments

* fix broken test

Co-authored-by: Sijun He <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
younesbelkada pushed a commit to younesbelkada/transformers that referenced this pull request Jun 29, 2022
* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <[email protected]>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py

Co-authored-by: NielsRogge <[email protected]>

* merge

* wip

* wip

* wip

* most basic tests passes

* all tests pass now

* relative embedding

* wip

* running make fixup

* remove bert changes

* fix doc

* fix doc

* fix issues

* fix doc

* address comments

* fix CI

* remove redundant copied from

* address comments

* fix broken test

Co-authored-by: Sijun He <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants