Hi, I encounter a weird NaN error in general.py during training, after multiple epochs. The error message is from torch.autograd.detect_anomaly(). Any idea why this error occurs or how to fix it?

Cheers and many thanks in advance,
Christoph

Hard to say without more info, but my guess at the most likely cause is either 1) the input residual to the loss being extremely large (in which case clipping it should work) or already NaN itself, or 2) alpha or scale becoming extremely large or small, in which case you probably want to manually constrain the range of values they take using the module interface. A sketch of both workarounds follows.
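Assuming this is jonbarron/robust_loss_pytorch (the mention of general.py and the alpha/scale parameters suggests so), a minimal sketch of both workarounds might look like the following. The `alpha_lo`/`alpha_hi`/`scale_lo` keyword names and the `num_dims=4` setup are assumptions based on the library's documented module interface, so verify them against your installed version:

```python
import numpy as np
import torch
from robust_loss_pytorch import adaptive

# Workaround 2: bound alpha and scale at construction time so the
# learned parameters cannot drift to numerically unstable extremes.
# (Keyword names are assumed; check your installed version.)
loss_fn = adaptive.AdaptiveLossFunction(
    num_dims=4, float_dtype=np.float32, device='cpu',
    alpha_lo=0.1,   # keep alpha safely above 0
    alpha_hi=1.9,   # ...and safely below 2
    scale_lo=1e-3)  # keep scale from collapsing toward 0

def robust_loss(pred, target):
    # Workaround 1: sanitize and clip the residual so huge or
    # non-finite values cannot propagate NaNs into the loss.
    residual = torch.nan_to_num(pred - target)
    residual = residual.clamp(min=-1e3, max=1e3)
    return loss_fn.lossfun(residual).mean()
```

Note that the adaptive loss carries its own learnable parameters, so they need to be passed to the optimizer alongside the model's, e.g. `torch.optim.Adam(list(model.parameters()) + list(loss_fn.parameters()))`.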