RuntimeError: Too many grads were not finite #1810

SSwethaSel0609 · 2024-11-22T08:38:27Z

I'm trying to fine-tune a model using the zipformer recipe, and training fails with the following error:
Traceback (most recent call last):
  File "finetune.py", line 1532, in <module>
    main()
  File "finetune.py", line 1525, in main
    run(rank=0, world_size=1, args=args)
  File "finetune.py", line 1403, in run
    train_one_epoch(
  File "finetune.py", line 1076, in train_one_epoch
    scaler.step(optimizer)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 313, in step
    return optimizer.step(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/optim/optimizer.py", line 140, in wrapper
    out = func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 345, in step
    clipping_scale = self._get_clipping_scale(group, batches)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 473, in _get_clipping_scale
    raise RuntimeError("Too many grads were not finite")
RuntimeError: Too many grads were not finite

log_error.txt
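
For anyone debugging the same error: the message suggests that gradients repeatedly contained inf or NaN values. Below is a minimal sketch, assuming a plain PyTorch `nn.Module`, of one way to list which parameters have non-finite gradients after a backward pass. `find_nonfinite_grads` is a hypothetical helper written for illustration; it is not part of icefall or PyTorch, and the toy model only stands in for the real one.

```python
import torch
from torch import nn


def find_nonfinite_grads(model: nn.Module) -> list:
    """Return the names of parameters whose gradient contains inf or NaN.

    Hypothetical debugging helper, not part of icefall or PyTorch.
    """
    bad = []
    for name, param in model.named_parameters():
        # Parameters that did not take part in the backward pass have no grad.
        if param.grad is not None and not torch.isfinite(param.grad).all():
            bad.append(name)
    return bad


# Toy demonstration: multiplying the loss by inf forces non-finite gradients.
model = nn.Linear(4, 2)
loss = model(torch.ones(1, 4)).sum() * float("inf")
loss.backward()
bad_params = find_nonfinite_grads(model)
print("parameters with non-finite grads:", bad_params)
```

In a mixed-precision training loop, a check like this would presumably go after `scaler.unscale_(optimizer)` and before `scaler.step(optimizer)`, so that the gradients are inspected at their true scale. That placement is an assumption about a typical AMP loop, not something taken from `finetune.py`.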
