
InvalidArgumentError (see above for traceback): LossTensor is inf or nan : Tensor had NaN values [[Node: train_op/CheckNumerics = CheckNumerics[T=DT_FLOAT, message="LossTensor is inf or nan", _device="/job:localhost/replica:0/task:0/device:CPU:0"](total_loss)]] #89

Open
metaStor opened this issue Mar 8, 2019 · 2 comments

Comments

@metaStor

metaStor commented Mar 8, 2019

Environment: tensorflow-gpu 1.9.0 + cuda9.0
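
For context, the InvalidArgumentError in the title is raised by a CheckNumerics op that the training graph attaches to the total loss. A minimal sketch of how such a check trips, assuming a stand-in tensor rather than the actual graph from this repo:

```python
import tensorflow as tf  # TF 1.x graph mode, matching the reported 1.9.0 environment

# Illustrative stand-in for the model's total_loss tensor.
total_loss = tf.constant(float("nan"))

# The train op wraps the loss in CheckNumerics; this is the node named in the traceback.
checked_loss = tf.check_numerics(total_loss, message="LossTensor is inf or nan")

with tf.Session() as sess:
    # Raises InvalidArgumentError: "LossTensor is inf or nan : Tensor had NaN values"
    sess.run(checked_loss)
```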

@ruyanyinian

Environment: tensorflow-gpu 1.9.0 + cuda9.0

I think it has nothing to do with the CPU/GPU; it is more likely related to your dataset. If the loss is already NaN on the first batch, it indicates that your original dataset fluctuates dramatically, which can drive pixel values toward infinity. Otherwise, try decreasing your learning rate and increasing your batch size.
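
To act on that advice, one might first scan the input data for NaN/inf values and extreme ranges, then retry with a lower learning rate and gradient clipping. A minimal sketch along those lines, where the data path, the tiny stand-in model, and the hyperparameter values are placeholder assumptions rather than anything from this issue:

```python
import numpy as np
import tensorflow as tf  # TF 1.x graph mode, matching the reported 1.9.0 environment

# 1) Sanity-check the input data: NaN/inf pixels or a wildly large value range are a
#    common cause of a NaN loss on the very first batch.
images = np.load("train_images.npy")        # placeholder path for your training data
labels = np.load("train_labels.npy")        # placeholder path for your labels
assert np.all(np.isfinite(images)), "dataset contains NaN or inf values"
images = images.astype(np.float32) / 255.0  # normalize pixels to [0, 1]

# 2) If the data is clean, retry with a smaller learning rate and clip gradients so a
#    single bad step cannot blow the loss up.
x = tf.placeholder(tf.float32, shape=[None, images.shape[1]])
y = tf.placeholder(tf.float32, shape=[None, 1])
logits = tf.layers.dense(x, 1)                           # stand-in for the real model
total_loss = tf.losses.mean_squared_error(y, logits)

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)   # reduced from a typical 1e-3
grads_and_vars = optimizer.compute_gradients(total_loss)
clipped = [(tf.clip_by_norm(g, 5.0), v) for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)
```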

@metaStor
Author

@ruyanyinian
I see. I'll give it a try. Thanks!
