The results seem to be very low when I re-run the code for 30 epochs? #6
Comments
Hi, may I know what kind of GPU you are using? The original work used Volta Quadro GP100.
@xudi1227 Thanks! My GPU is an RTX 2080 Ti with CUDA 10.0 + TF 1.13, or a GTX 1060 with CUDA 9.0 + TF 1.12. Both give similar results (mAP0.25 is about 4.26% and mAP0.50 is about 1.7%), so I don't think different GPUs can explain such a big gap in mAP.
@happinesslz What is your cuDNN version? I am using an RTX 2060 with CUDA 10.0 and TF 1.13 and it does not work. With TF 1.14 it only runs on the CPU, not the GPU.
@f3rhoodn My cuDNN is cudnn-10.0-linux-x64-v7.4.2.24.tgz (version v7.4.2).
Hey guys, I've updated my code so that it runs much faster now by caching training/testing data on the CPU. Before this commit, all data had to be read from disk and then preprocessed on the CPU side, which was the bottleneck.
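For readers wondering what this kind of caching looks like, here is a minimal sketch, not the code from the commit. It assumes each sample can be identified by a key; `CachedLoader` and `fake_load_and_preprocess` are hypothetical names used only for illustration. The expensive read-and-preprocess step runs once per sample, and later epochs reuse the copy held in host memory.

```python
from typing import Any, Callable, Dict, Hashable


class CachedLoader:
    """Wrap an expensive load function and keep its results in host memory."""

    def __init__(self, load_fn: Callable[[Hashable], Any]) -> None:
        self._load_fn = load_fn
        self._cache: Dict[Hashable, Any] = {}
        self.misses = 0  # how many times load_fn was actually called

    def get(self, key: Hashable) -> Any:
        # Only the first request for a key pays the disk + preprocessing cost.
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._load_fn(key)
        return self._cache[key]


def fake_load_and_preprocess(scene_id: int) -> list:
    # Hypothetical stand-in for reading one scene from disk and preprocessing it.
    return [scene_id * 0.5, scene_id * 2.0]


loader = CachedLoader(fake_load_and_preprocess)
for epoch in range(3):
    for scene_id in range(4):
        sample = loader.get(scene_id)
```

After three epochs over four samples, `loader.misses` is 4 rather than 12. The trade-off is memory: the whole preprocessed dataset has to fit in RAM.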
Hey, I ran into the same situation. I re-ran the run.py file and trained for about 60 epochs. The results seem to be very low. Have you solved this problem yet? Thank you very much.
@happinesslz The total_cost is always nan during training. Did you run into the same problem?
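If it helps anyone narrow this down, one generic approach is to find the first step at which the logged loss stops being finite and then inspect the inputs and learning rate around that step. The sketch below is an assumption for illustration only, not part of this repository; `first_non_finite` is a hypothetical helper and the loss values are made up.

```python
import math


def first_non_finite(values):
    """Return the index of the first NaN/inf value, or None if all are finite."""
    for step, value in enumerate(values):
        if not math.isfinite(value):
            return step
    return None


# Made-up loss history: the cost becomes nan at step 2 and stays nan.
losses = [2.31, 1.87, float("nan"), float("nan")]
bad_step = first_non_finite(losses)
```

If the very first step is already non-finite, the input data or initialization is the more likely suspect. If it only happens later, the learning rate or a division/log of zero in the loss is worth checking.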
mAP0.250000: 0.042591
mAP0.500000: 0.01734
obj_accuracy: 0.86129
The results seem to be very low when I re-run the code for 30 epochs, which took me about 5 days. Why?
The output of the log.log file:
[0705 13:50:41 @monitor.py:467] lr: 0.001
[0705 13:50:41 @monitor.py:467] mAP0.250000: 0.042591
[0705 13:50:41 @monitor.py:467] mAP0.500000: 0.01734
[0705 13:50:41 @monitor.py:467] obj_accuracy: 0.86129
[0705 13:50:41 @monitor.py:467] param-summary/fp1/conv_0/W-rms: 0.24582
[0705 13:50:41 @monitor.py:467] param-summary/fp1/conv_1/W-rms: 0.26173
[0705 13:50:41 @monitor.py:467] param-summary/fp2/conv_0/W-rms: 0.25195
[0705 13:50:41 @monitor.py:467] param-summary/fp2/conv_1/W-rms: 0.28531
[0705 13:50:41 @monitor.py:467] param-summary/proposal/conv0/W-rms: 0.27876
........
.......
PeriodicTrigger-Evaluator: 2 hours 18 minutes 58 seconds
[0705 13:50:41 @base.py:275] Start Epoch 30 ...
[0705 14:20:26 @base.py:285] Epoch 30 (global_step 79260) finished, time:29 minutes 45 seconds.
[0705 14:20:26 @saver.py:79] Model saved to train_log/run/model-79260.
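To compare runs without reading the raw log by eye, the monitor lines in log.log can be parsed into a dictionary. This is a rough sketch that assumes the line layout shown in the output above (`[date time @monitor.py:NNN] name: value`), with or without terminal color codes; `parse_metrics` is a hypothetical helper, not something provided by the training code.

```python
import re

# Terminal color codes such as ESC[32m and ESC[0m; the ESC byte is optional
# because some copies of the log lose it and keep only "[32m".
ANSI_RE = re.compile(r"\x1b?\[\d+m")
# A monitor line ends with "name: value" after the "@monitor.py:NNN]" prefix.
METRIC_RE = re.compile(r"@monitor\.py:\d+\]\s*(?P<name>[^:\s]+):\s*(?P<value>\S+)\s*$")


def parse_metrics(lines):
    """Collect the last numeric value logged for each monitor metric."""
    metrics = {}
    for line in lines:
        clean = ANSI_RE.sub("", line)
        match = METRIC_RE.search(clean)
        if not match:
            continue  # not a monitor line, e.g. "Start Epoch 30 ..."
        try:
            metrics[match.group("name")] = float(match.group("value"))
        except ValueError:
            continue  # value was not a number
    return metrics


sample = [
    "\x1b[32m[0705 13:50:41 @monitor.py:467]\x1b[0m mAP0.250000: 0.042591",
    "\x1b[32m[0705 13:50:41 @monitor.py:467]\x1b[0m mAP0.500000: 0.01734",
    "[32m[0705 13:50:41 @monitor.py:467][0m obj_accuracy: 0.86129",
    "\x1b[32m[0705 13:50:41 @base.py:275]\x1b[0m Start Epoch 30 ...",
]
metrics = parse_metrics(sample)
```

In practice the same function could be applied to the lines of the real log file, for example `parse_metrics(open("log.log"))`, to pull out mAP0.250000 and mAP0.500000 for each run.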