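Fine-tuning environment:
8 * Tesla V100
cuda 10.0.130
Using the docker image provided in the documentation
The fp16 script runs without problems, but the fp32 script fails with the following error: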
Traceback (most recent call last):
  File "finetune_chid.py", line 357, in <module>
    main()
  File "finetune_chid.py", line 241, in main
    model, optimizer, lr_scheduler = setup_model_and_optimizer(args)
  File "/CPM/new/CPM-Finetune-main/utils.py", line 510, in setup_model_and_optimizer
    args.iteration = load_checkpoint(model, optimizer, lr_scheduler, args)
  File "/CPM/new/CPM-Finetune-main/utils.py", line 281, in load_checkpoint
    checkpoint_name, sd = model.load_checkpoint(args.load, iteration, load_module_strict=False, load_optimizer_states=False, load_lr_scheduler_states=False)
  File "/usr/local/lib/python3.6/dist-packages/deepspeed/runtime/engine.py", line 1196, in load_checkpoint
    load_lr_scheduler_states=load_lr_scheduler_states)
  File "/usr/local/lib/python3.6/dist-packages/deepspeed/runtime/engine.py", line 1231, in _load_checkpoint
    self.optimizer.load_state_dict(checkpoint['optimizer'])
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/optimizer.py", line 108, in load_state_dict
    saved_groups = state_dict['param_groups']
TypeError: 'NoneType' object is not subscriptable
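Reading the last two frames of the traceback, `checkpoint['optimizer']` appears to be `None` at the point where it is handed to `load_state_dict`, even though `load_optimizer_states=False` was passed. The sketch below reproduces that failure mode in plain Python and shows one possible guard. `TinyOptimizer`, `load_optimizer_if_present`, and the `checkpoint` dict are hypothetical stand-ins for illustration, not DeepSpeed or PyTorch code, and the contents of the real checkpoint are an assumption.

```python
# Hypothetical stand-in for the dict read from the checkpoint file.
# Assumption: the pretrained checkpoint stores no optimizer state.
checkpoint = {"module": {"weight": [0.0]}, "optimizer": None}


class TinyOptimizer:
    """Hypothetical optimizer that, like the failing frame in the traceback,
    indexes state_dict['param_groups'] without checking for None."""

    def __init__(self):
        self.param_groups = []

    def load_state_dict(self, state_dict):
        # Raises TypeError when state_dict is None.
        self.param_groups = state_dict["param_groups"]


def reproduce_error():
    """Return the error message raised when the optimizer entry is None."""
    try:
        TinyOptimizer().load_state_dict(checkpoint["optimizer"])
    except TypeError as exc:
        return str(exc)
    return None


def load_optimizer_if_present(optimizer, ckpt):
    """Load optimizer state only when the checkpoint actually contains it.

    Returns True if state was loaded, False if it was skipped.
    """
    state = ckpt.get("optimizer")
    if state is None:
        return False
    optimizer.load_state_dict(state)
    return True


if __name__ == "__main__":
    print(reproduce_error())
    print(load_optimizer_if_present(TinyOptimizer(), checkpoint))
```

Whether such a guard belongs in `utils.py` or in the DeepSpeed engine depends on the installed DeepSpeed version, which is not stated here.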