RuntimeError: Too many grads were not finite #1810

Open
SSwethaSel0609 opened this issue Nov 22, 2024 · 0 comments

Comments

@SSwethaSel0609

I'm trying to fine-tune a model using the zipformer recipe, and training fails with the following error:
Traceback (most recent call last):
  File "finetune.py", line 1532, in <module>
    main()
  File "finetune.py", line 1525, in main
    run(rank=0, world_size=1, args=args)
  File "finetune.py", line 1403, in run
    train_one_epoch(
  File "finetune.py", line 1076, in train_one_epoch
    scaler.step(optimizer)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 313, in step
    return optimizer.step(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/optim/optimizer.py", line 140, in wrapper
    out = func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall_env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 345, in step
    clipping_scale = self._get_clipping_scale(group, batches)
  File "/mnt/efs/swetha/ms_exp/icefall/egs/librispeech/ASR/zipformer/optim.py", line 473, in _get_clipping_scale
    raise RuntimeError("Too many grads were not finite")
RuntimeError: Too many grads were not finite

log_error.txt
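
One possible way to narrow this down is to count the inf/nan entries in each parameter's gradient before the optimizer step. The snippet below is only a minimal sketch and is not part of the icefall recipe: the helper name `count_nonfinite_grads` and the toy parameter names are hypothetical, and it works on plain Python floats so it runs without PyTorch.

```python
import math
from typing import Dict, Iterable, Tuple


def count_nonfinite_grads(
    named_grads: Iterable[Tuple[str, Iterable[float]]],
) -> Dict[str, int]:
    """Map each gradient name to its count of inf/nan entries.

    Gradients whose values are all finite are left out of the result.
    """
    bad: Dict[str, int] = {}
    for name, values in named_grads:
        n_bad = sum(1 for v in values if not math.isfinite(v))
        if n_bad:
            bad[name] = n_bad
    return bad


# Toy stand-in for real gradients; these parameter names are hypothetical.
example = [
    ("encoder.weight", [0.1, -0.2, 0.3]),
    ("decoder.bias", [float("nan"), 1.0, float("inf")]),
]
report = count_nonfinite_grads(example)
print(report)
```

With a PyTorch model, the input could be built after `loss.backward()` as `((n, p.grad.flatten().tolist()) for n, p in model.named_parameters() if p.grad is not None)`. Note that under mixed precision with a gradient scaler, occasional overflowed gradients are expected, so a report that is non-empty on every batch is the more telling sign.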
