This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

language model load from checkpoint error #295

Closed
omerarshad opened this issue Oct 14, 2022 · 1 comment
Labels
bug / fix Something isn't working help wanted Extra attention is needed

Comments

omerarshad commented Oct 14, 2022

🐛 Bug

Loading the aggregated checkpoint saved for the language modeling transformer gives an error:

RuntimeError: Error(s) in loading state_dict for LanguageModelingTransformer:
	Missing key(s) in state_dict: "model.lm_head.weight". 

To Reproduce

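from pytorch_lightning.utilities.deepspeed import convert_zero_checkpoint_to_fp32_state_dict
# Import path assumed from the lightning-transformers package layout
from lightning_transformers.task.nlp.language_modeling import LanguageModelingTransformer

# Aggregate the sharded DeepSpeed ZeRO checkpoint directory into a single fp32 file
convert_zero_checkpoint_to_fp32_state_dict(
    "./recreate_model/epoch=0-step=363.ckpt/",
    "./recreate_model/pytorch_model.bin"
)

# Load best model from aggregated checkpoint file
best_model = LanguageModelingTransformer.load_from_checkpoint(
    "./recreate_model/pytorch_model.bin"
)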
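
One way to narrow down a "Missing key(s) in state_dict" error is to compare the parameter names the model expects with the names stored in the aggregated file. The sketch below is only an illustration: `diff_state_dict_keys` is a hypothetical helper, and the key names other than `model.lm_head.weight` are made up for the example. In practice the two lists would come from `model.state_dict().keys()` and from the loaded checkpoint (assuming it follows the usual Lightning layout with a `state_dict` entry).

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the two groups
    PyTorch reports when a strict state_dict load fails."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Keys the model expects but the checkpoint does not contain
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint contains but the model does not expect
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative key names only (hypothetical apart from the key in the error).
model_keys = ["model.transformer.wte.weight", "model.lm_head.weight"]
checkpoint_keys = ["model.transformer.wte.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If the only missing key turns out to be an output head whose weights are tied to the input embedding, passing `strict=False` to `load_from_checkpoint` may serve as a workaround. That is an assumption about the cause, not something confirmed in this thread.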
omerarshad added the bug / fix and help wanted labels Oct 14, 2022
Borda (Member) commented Nov 21, 2022

Could you please share the full trace? 🐰
This seems to be a duplicate of #273 (comment), so let's keep only one. 🦦

Borda closed this as completed Nov 21, 2022