Example of how to pretrain an LM + introduction of config_name #57
base: master
Conversation
So one can pretrain a language model from the command line. The limit was added to support quick tests.
```diff
@@ -280,7 +288,7 @@ def train_(self, dataset_or_path, tokenizer=None, **train_config):
         print("Language model saved to", self.experiment_path)

     def validate(self):
-        raise NotImplementedError("The validation on the language model is not implemented.")
+        return "not implemented"
```
Do we really just want to return a string here?
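For reference, the question presumably points back at the pre-change behaviour, where unimplemented validation fails loudly instead of returning a string the caller can silently ignore. A minimal sketch of that variant:

```python
def validate(self):
    # Raising keeps the gap visible: callers get an exception rather
    # than a "not implemented" string they might treat as a result.
    raise NotImplementedError(
        "The validation on the language model is not implemented.")
```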
From the command line:
```
$ bash prepare_wiki.sh de
$ python -W ignore -m multifit new multifit_paper_version replace_ --name my_lm - train_ --pretrain-dataset data/wiki/de-100
```
Looks like there's a superfluous space between `-` and `train_`. Why do we use `train_` here? What is the difference between `train_` and `train`?
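For background: the ` - ` token resembles python-fire-style command chaining, where one command builds an object and the next command is invoked on its result, with `-` as fire's default separator; under that reading the space is intentional, and the trailing underscore would mark a chainable method that returns `self`. A sketch of that pattern under those assumptions (illustrative only, not multifit's actual CLI code):

```python
import fire


class Experiment:
    def __init__(self, name):
        self.name = name

    def train_(self, pretrain_dataset=None):
        # Trailing-underscore convention (assumed here): do the work,
        # then return self so further commands can be chained.
        print(f"training {self.name} on {pretrain_dataset}")
        return self


if __name__ == "__main__":
    fire.Fire({"new": Experiment})
```

With this, `python cli.py new my_lm - train_ --pretrain-dataset data/wiki/de-100` constructs an `Experiment` and then calls `train_` on it.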
Hi Piotr, thanks for adding this. Looks good in general. I've added a few comments about minor things. In general, do you think it'd be possible to add a few short docstrings to explain things like `bs`, `bptt`, `limit` in `load_lm_databunch` for people not familiar with the library?
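Something along these lines might do; the signature below is a sketch, since the real `load_lm_databunch` may take different arguments:

```python
def load_lm_databunch(self, bs, bptt, limit=None):
    """Build the DataBunch used for language-model training.

    Args:
        bs: batch size, the number of sequences per training batch.
        bptt: back-propagation-through-time window, i.e. how many
            tokens each training sequence unrolls over.
        limit: optional cap on the number of training examples,
            mainly useful for quick smoke-test configurations.
    """
    ...
```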
I've added the ability to limit the training set so we can use a test configuration `multifit_mini_test` that executes in ~20 secs to check that the scripts are working.
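Roughly, the limit would be applied as below (a hypothetical sketch; where exactly the training set is truncated in multifit may differ):

```python
# Hypothetical: cap the training split before building the DataBunch
# so a smoke-test configuration finishes in seconds.
if limit is not None:
    train_df = train_df[:limit]
```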
Why `config_name`?
I've added it so we know which training parameters to load for the fine-tune-LM and classifier steps. These parameters aren't stored along with a language model; only the parameters used to build the model are saved.
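To make that concrete, the idea amounts to a registry of named training configurations that the later stages look up by name (names and fields below are illustrative, not the actual multifit configs):

```python
# Illustrative only: the checkpoint stores model-building parameters,
# while training parameters for later stages (fine-tune LM, classifier)
# are recovered through the config name.
TRAIN_CONFIGS = {
    "multifit_paper_version": dict(bs=64, bptt=70, num_epochs=10),
    "multifit_mini_test": dict(bs=4, bptt=10, num_epochs=1, limit=1000),
}


def get_train_config(config_name):
    return TRAIN_CONFIGS[config_name]
```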