Advice needed: Adding more FSMT models #13160

Open
jvamvas opened this issue Aug 18, 2021 · 1 comment

jvamvas commented Aug 18, 2021

🌟 New model addition

Model description

I am planning to contribute a series of FSMT models to the model hub. The models have been trained for a paper that is currently under review.

Before working on a PR I wanted to ask for some advice:

normalize_before

The new models have been trained with Fairseq's option normalize_before=True, while the existing FSMT implementation uses normalize_before=False. I understand that copy-pasting model code is preferred to extending the configuration. This would mean that a near-duplicate module fsmt_prenorm needs to be created. Is this correct?
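For readers unfamiliar with the option, the difference between the two settings can be sketched as follows. This is a minimal, framework-free illustration of the general pre-norm versus post-norm residual pattern, not the Fairseq or FSMT code; `layer_norm`, `residual_block`, and `toy_sublayer` are hypothetical names chosen for this example, and the layer norm omits the learned scale and bias.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def residual_block(
    x: Vector,
    sublayer: Callable[[Vector], Vector],
    normalize_before: bool,
) -> Vector:
    """Apply one residual sublayer in either pre-norm or post-norm style."""
    if normalize_before:
        # Pre-norm: normalize the sublayer input; the residual path is left untouched.
        return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
    # Post-norm: add the residual first, then normalize the sum.
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for self-attention or a feed-forward network."""
    return [2.0 * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
post = residual_block(x, toy_sublayer, normalize_before=False)
pre = residual_block(x, toy_sublayer, normalize_before=True)
```

Because the two variants place the normalization at different points, weights trained with one setting presumably cannot be loaded into an implementation of the other without changing the forward pass, which is why the question of a separate module arises.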

Adequate base branch

The FSMT module is currently being refactored (#11218). Do you recommend that I start from the master branch or from the PR's feature branch, which is nearly completed?

jvamvas commented Sep 17, 2021

@patil-suraj I am still very motivated to work on the pull request :) Just let me know if you need more information to answer my question.

In case you're interested, the paper describing our models is now public (https://openreview.net/forum?id=RvO9DqoWI9V). I believe the models could be of value to others in the community.
