Pre-training code? #30
Comments
I would like to do the same but with bigbird-roberta-en-base if possible...
If the hidden size is the same as roberta-base, you can probably use the weight generation script in the repo.
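As a side note, a quick way to check whether the hidden sizes match is to compare the configs. The checkpoint name `google/bigbird-roberta-base` below is an assumption about which BigBird variant is meant:

```python
from transformers import AutoConfig

# Sanity check before reusing the weight-generation script:
# compare the text model's hidden size with roberta-base's.
bigbird_cfg = AutoConfig.from_pretrained("google/bigbird-roberta-base")
roberta_cfg = AutoConfig.from_pretrained("roberta-base")
print(bigbird_cfg.hidden_size, roberta_cfg.hidden_size)  # both report 768
```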
@logan-markewich, @jordanparker6 I am coding up the collator for the masking in the three pretraining strategies. Maybe we can work together and share it here afterwards for everyone else to use?
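For reference, here is a rough, unofficial sketch of what the text-masking (MVLM-style) part of such a collator could look like. The class name, the field names, and the assumption that features arrive already padded with `input_ids`, `attention_mask`, and `bbox` are all mine, not code from this repo:

```python
import torch
from dataclasses import dataclass
from transformers import PreTrainedTokenizerBase


@dataclass
class MaskedVisualLanguageModelingCollator:
    """Sketch of the text-masking part of a pretraining collator.

    Assumes each feature is already padded/truncated to a fixed length and
    contains "input_ids", "attention_mask", and "bbox"; layout boxes are left
    untouched and only token ids are masked.
    """

    tokenizer: PreTrainedTokenizerBase
    mlm_probability: float = 0.15

    def __call__(self, features):
        batch = {key: torch.tensor([f[key] for f in features]) for key in features[0]}
        input_ids = batch["input_ids"].clone()
        labels = input_ids.clone()

        # Choose positions to mask, skipping special tokens.
        prob = torch.full(labels.shape, self.mlm_probability)
        special = torch.tensor(
            [
                self.tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
                for ids in labels.tolist()
            ],
            dtype=torch.bool,
        )
        prob.masked_fill_(special, 0.0)
        masked = torch.bernoulli(prob).bool()
        labels[~masked] = -100  # loss is only computed on masked positions

        # BERT-style corruption: 80% mask token, 10% random token, 10% unchanged.
        use_mask_token = torch.bernoulli(torch.full(labels.shape, 0.8)).bool() & masked
        input_ids[use_mask_token] = self.tokenizer.mask_token_id
        use_random = (
            torch.bernoulli(torch.full(labels.shape, 0.5)).bool() & masked & ~use_mask_token
        )
        random_ids = torch.randint(len(self.tokenizer), labels.shape, dtype=torch.long)
        input_ids[use_random] = random_ids[use_random]

        batch["input_ids"] = input_ids
        batch["labels"] = labels
        return batch
```

The other pretraining objectives would need their own targets on top of this, so treat it only as a starting point for the masking logic.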
Happy to help out as needed.
I don't think it is... I posted my error message in a separate issue. I was able to use the provided script to create a lilt-roberta-base-en using the following:
BigBird uses the same tokenizer as RoBERTa, so there is no issue with tokenization. However, the following error occurs when loading the model.
I think this error is caused when the PyTorch state dicts are fused with the following line.
The lilt_model dim changes the incoming BigBird dim. Would it be problematic to switch this:
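The snippet referred to above didn't survive the copy, but for context here is a minimal sketch of the kind of state-dict merge being discussed. The file paths and variable names are placeholders, not the repo's actual weight-generation script:

```python
import torch

# Placeholder paths; substitute the actual checkpoint files.
text_state = torch.load("bigbird-roberta-base/pytorch_model.bin", map_location="cpu")
lilt_state = torch.load("lilt-only-base/pytorch_model.bin", map_location="cpu")

# In a dict merge, keys present in both dicts take the value from the
# right-hand operand, so any shared key ends up with the LiLT weight here:
fused = {**text_state, **lilt_state}

# Switching the order would keep the text model's version of shared keys
# instead, which only matters if the two checkpoints actually overlap and
# may leave tensors with shapes the LiLT side does not expect:
# fused = {**lilt_state, **text_state}

torch.save(fused, "lilt-bigbird-base/pytorch_model.bin")
```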
Are you able to provide the pre-training code?
I would like to try and pre-train using roberta-large, or a similar language model :)