
train_loss is NAN when use resnet-101 #33

Closed
ElbertFang opened this issue Mar 17, 2018 · 1 comment
@ElbertFang

Thanks for your great work, first of all.
I successfully trained SCST using ResNet-152, but problems arise when I use ResNet-101. I saved the features extracted by ResNet-101 in a separate file, load them via an absolute path, and train_loss becomes NaN all the time.
I printed some data to locate the problem:

```python
fc_feats, att_feats, labels, masks = tmp
print(fc_feats)
```

and I get this:

```
Variable containing:
inf inf inf ... 2.3545e-02 4.4336e-01 4.5093e-02
inf inf inf ... 2.3545e-02 4.4336e-01 4.5093e-02
inf inf inf ... 2.3545e-02 4.4336e-01 4.5093e-02
... ⋱ ...
9.7345e-01 1.5666e+00 1.1705e-01 ... 0.0000e+00 2.7455e-01 3.6576e-04
9.7345e-01 1.5666e+00 1.1705e-01 ... 0.0000e+00 2.7455e-01 3.6576e-04
9.7345e-01 1.5666e+00 1.1705e-01 ... 0.0000e+00 2.7455e-01 3.6576e-04
[torch.cuda.FloatTensor of size 50x2048 (GPU 0)]
```
It looks like I didn't get the true fc feats. But I changed nothing in prepro_labels.py, and it works well when extracting the features with ResNet-152.
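A quick way to confirm whether the corruption is already in the saved feature files (rather than introduced by the data loader) is to scan them for non-finite values before training. The sketch below is a minimal, hypothetical check, assuming the features are stored as NumPy arrays; the function name and the synthetic 50x2048 matrix are illustrative, not part of the repo:

```python
import numpy as np

def report_nonfinite(feats):
    """Print which rows of a feature array contain inf or NaN entries."""
    bad = ~np.isfinite(feats)
    if bad.any():
        rows = np.unique(np.nonzero(bad)[0])
        print(f"{int(bad.sum())} non-finite entries in rows: {rows.tolist()}")
    else:
        print("all values finite")

# Synthetic example mimicking the printout above: first rows poisoned with inf
feats = np.random.rand(50, 2048).astype(np.float32)
feats[:3, :5] = np.inf
report_nonfinite(feats)
```

Running this over each saved `.npy` file (e.g. `report_nonfinite(np.load(path))`) would tell you whether the ResNet-101 extraction step itself produced the `inf` rows, which would explain the NaN loss.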

@ruotianluo
Owner

ruotianluo commented Dec 31, 2019

I have no idea; really weird. Closing it for now.
