Using the resume function in training does not save new mAP values #5860
Comments
In my opinion, if you want to resume the training process, just fix the config file, at line
Okay, but I have one question:
could you please write out the full command you run for that on Linux?
Or do you set this resume_from in the config file?
I set this resume_from in the config file and run the training command again:
Many thanks, I will try it.
Yes, I did change the epoch.
Great, now I get it. Many thanks. I will try it and see.
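For readers following the exchange above, the config-file approach might look roughly like the fragment below. This is a minimal sketch that assumes an MMDetection 2.x style Python config, where `resume_from` and `runner` are top-level variables. The checkpoint path and the epoch count are taken from the original post; whether the commenter's config used exactly these values is an assumption.

```python
# Sketch of the relevant part of a custom config (assumed MMDetection 2.x layout).

# Checkpoint to resume from; path copied from the original post.
resume_from = "training_data/faster_rcnn_epoch60/latest.pth"

# Raise the epoch limit so training continues past the epoch stored in the checkpoint.
# 120 is the value mentioned in the original post (previously 60).
runner = dict(type="EpochBasedRunner", max_epochs=120)
```

If the epoch limit is left at the value already reached by the checkpoint, there is nothing left to train, which is presumably why the epoch setting comes up in the replies.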
@bommap2810 This is also the new log.json file. It starts from epoch 61, because I resumed the training from there and ran it until epoch 100. You may notice that mAP values from the validation dataset are present both in the terminal output and in the JSON file, but they cannot be plotted, which is strange to me. What do you think? Do you have any idea? Have a look also at the test results for one checkpoint (epoch 90): it gives zeros for all mAP values, which does not make sense. I am totally confused. Any help, please?
I got the same issue with ‘KeyError: 'work_dirs/yolox_tiny_8x8_300e_elevator/20210812_134209.log.json does not contain metric bbox_mAP'’. However, I'm pretty sure the mAP values are included in the JSON file. Has anyone solved this? 😂
@Akazfu If anyone has an idea, please share it with us.
I have read analyze_logs.py and noticed a metric check at line 53:
I will try to check this bug and fix it.
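Independently of the plotting script, one way to see which epochs in a log actually carry a given metric is to scan the file directly. The sketch below assumes the .log.json file holds one JSON object per line and that per-epoch records have an `epoch` key; the function name `metric_coverage` and the sample records are hypothetical and only illustrate the idea.

```python
import json
from collections import defaultdict


def metric_coverage(lines, metric="bbox_mAP"):
    """Map each epoch to True if any record of that epoch contains `metric`."""
    coverage = defaultdict(bool)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "epoch" not in record:
            continue  # skip records without an epoch, e.g. a header line
        coverage[record["epoch"]] |= metric in record
    return dict(coverage)


if __name__ == "__main__":
    # Hypothetical sample records, not taken from a real log.
    sample = [
        '{"env_info": "placeholder"}',
        '{"mode": "train", "epoch": 61, "loss": 0.5}',
        '{"mode": "val", "epoch": 62, "bbox_mAP": 0.4}',
    ]
    for epoch, has_metric in sorted(metric_coverage(sample).items()):
        print(epoch, "has metric" if has_metric else "missing metric")
```

To inspect a real log, pass an open file object instead of the sample list, for example `metric_coverage(open(path))`. Epochs reported as missing the metric are the ones worth comparing against what the plotting script expects.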
Dear all,
I am trying to use the resume function in training mode, as I want to train for more epochs than I originally did.
I first trained Faster R-CNN for only 60 epochs, then plotted the performance and found the mAP values were not good enough, so I went back to the model and ran this command to resume from the latest epoch:
python tools/train.py configs/human/my_custom_config.py --gpus 1 --work-dir training_data/faster_rcnn_epoch60 --resume-from training_data/faster_rcnn_epoch60/latest.pth
My problem is that, when using resume, more epochs are saved on my machine, but the validation mAP values for those epochs are not saved. I need those values to evaluate the training process and to choose the epoch with the highest mAP, so I can use its weights for testing.
When I try to plot the performance using this command with the new (.json) file:
python tools/analysis_tools/analyze_logs.py plot_curve training_data/faster_rcnn_epoch60/20210811_012551.log.json --keys bbox_mAP --legend mAP_bbox
I get this error message:
KeyError: 'training_data/faster_rcnn_epoch60/20210811_012551.log.json does not contain metric bbox_mAP'
This is strange, as I can plot the performance using the JSON file from the first training run, but not the one written after resuming,
even though I did not change anything in the config file except the max_epoch value, which is now 120 instead of 60.
Any help, please?