This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Infinite loss value when training under amp #131

Open
jameslahm opened this issue Nov 3, 2022 · 1 comment

Comments

@jameslahm

Hi, I encounter an infinite loss value assertion failure when training with mixed precision.
The traceback looks like this:

Traceback (most recent call last):
  File "main.py", line 498, in <module>
    main(args)
  File "main.py", line 409, in main
    train_stats = train_one_epoch(
  File "ConvNeXt/engine.py", line 63, in train_one_epoch
    assert math.isfinite(loss_value)
AssertionError

I wonder how I could fix this problem. Thanks very much!
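A common workaround for this kind of AMP instability is to skip the occasional non-finite step instead of asserting, while still aborting if the loss stays non-finite (which usually means training has diverged and the learning rate or loss scaling needs attention). The sketch below is a minimal, hypothetical helper that could replace the bare `assert math.isfinite(loss_value)` in `engine.py`'s `train_one_epoch`; the function name `should_skip_step` and the threshold are assumptions, not part of ConvNeXt's code.

```python
import math


def should_skip_step(loss_value, consecutive_skips, max_consecutive_skips=10):
    """Decide whether to skip the current optimizer step.

    Returns (skip, new_consecutive_skips). A single NaN/inf loss under AMP
    is often transient and can be skipped; many in a row indicates real
    divergence, so we raise instead of looping forever.
    """
    if not math.isfinite(loss_value):
        new_count = consecutive_skips + 1
        if new_count > max_consecutive_skips:
            raise RuntimeError(
                f"Loss was non-finite for {new_count} consecutive steps; "
                "consider lowering the learning rate or disabling AMP."
            )
        return True, new_count
    # Finite loss: take the step and reset the skip counter.
    return False, 0
```

Note that PyTorch's `torch.cuda.amp.GradScaler` already skips the optimizer step when it finds inf/NaN *gradients*, but the assertion in the training loop fires on the loss value itself, before the scaler gets involved, so a guard like this (or reducing the learning rate during warmup) is still needed.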

@uristern123

Hi,
This happened to me as well, did you find a solution to this problem?
