Missing key(s) in state_dict when testing using predict_downstream_condition.py #17
Hi, did you train the model with DDP? If so, the state dict keys may differ.
I met this problem too.
The keys in sd2 look like module.roberta.... while sd1's look like roberta...., so they don't match; it runs after I change them. I'd also like to know how long you trained and on what hardware. I find the model often stalls at each eval_step to save a checkpoint, which takes up most of the training time.
Same problem.
Thank you for your great work. I have the same problem. As explained in this GitHub repo, I executed run.sh, which ran DDP_main.py. However, I'm confused because @Hzfinfdu said the state dict keys of a DDP-trained model may differ. Thank you.
I found that this was because the keys in the checkpoint saved after training carry an extra prefix, so removing it works. Change model.load_state_dict(ckpt['model']) to:

ckpt_model = ckpt['model']
new_ckpt = {}
for key, value in ckpt_model.items():
    new_ckpt[key[7:]] = value  # drop the leading 'module.' (7 characters)
model.load_state_dict(new_ckpt)

I don't know if this is correct, but the program runs and outputs the results correctly.
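A slightly safer variant of the fix above, as a self-contained sketch (the helper name strip_ddp_prefix is mine, not from the repo): it only strips keys that actually start with "module.", so it also works unchanged on checkpoints that were saved without DDP.

```python
def strip_ddp_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with the DDP wrapper prefix removed.

    DistributedDataParallel (and DataParallel) hold the model in a
    `.module` attribute, so every key in the saved state dict gains a
    'module.' prefix; the bare model expects keys without it.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```

Then loading becomes model.load_state_dict(strip_ddp_prefix(ckpt['model'])).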
Thank you for your reply. I solved the issue with your suggestion. Could you share the modified code? It would be really helpful for me!
python predict_downstream_condition.py --ckpt_path model_name_roberta-base_taskname_qqp_lr_3e-05_seed_42_numsteps_2000_sample_Categorical_schedule_mutual_hybridlambda_0.0003_wordfreqlambda_0.0_fromscratch_False_timestep_none_ckpts/best(38899).th
using standard schedule with num_steps: 2000.
Traceback (most recent call last):
File "predict_downstream_condition.py", line 101, in
model.load_state_dict(ckpt['model'])
File "/opt/conda/envs/diff/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for RobertaForMaskedLM:
Missing key(s) in state_dict: "roberta.embeddings.position_ids", "roberta.embeddings.word_embeddings.weight", "roberta.embeddings.position_embeddings.weight", "roberta.embeddings.token_type_embeddings.weight", "roberta.embeddings.LayerNorm.weight", "roberta.embeddings.LayerNorm.bias", "roberta.encoder.layer.0.attention.self.query.weight", "roberta.encoder.layer.0.attention.self.query.bias", ...
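For anyone hitting this error, the mismatch is easy to confirm before calling load_state_dict by comparing the keys the model expects against the keys in the checkpoint. A minimal pure-Python sketch (the helper name compare_keys is hypothetical, not part of the repo or of PyTorch):

```python
def compare_keys(model_keys, ckpt_keys):
    """Report keys the model expects but the checkpoint lacks, and vice versa.

    'missing' corresponds to the "Missing key(s)" part of the RuntimeError,
    'unexpected' to the "Unexpected key(s)" part.
    """
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    return missing, unexpected
```

In this issue, every roberta.* key shows up as missing while the matching module.roberta.* key shows up as unexpected, which points directly at the DDP "module." prefix.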