OSError: ./data/weneed/mask_r50/epoch_12.pth is not a checkpoint file #19
Comments
It seems the file './data/weneed/mask_r50/epoch_12.pth' is missing.
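A quick way to confirm this before launching distributed training is to inspect the path that cfg.load_from points to. The sketch below is only a diagnostic aid, not part of mmcv or this repository: the function name describe_checkpoint_path is hypothetical, and it assumes (from the traceback, where the error is raised in load_checkpoint before any parsing is reported) that the message appears when the path is not a regular file.

```python
import os.path as osp


def describe_checkpoint_path(path):
    """Return a short diagnosis of why `path` may fail to load.

    Hypothetical helper for illustration; it only inspects the filesystem
    and does not try to open or parse the checkpoint.
    """
    if osp.isfile(path):
        return "ok: regular file exists"
    if osp.isdir(path):
        return "problem: path is a directory, not a file"
    if osp.islink(path):
        # A symlink that is neither a file nor a directory is dangling.
        return "problem: broken symlink"
    parent = osp.dirname(path) or "."
    if not osp.isdir(parent):
        return "problem: parent directory {} does not exist".format(parent)
    return "problem: file is missing from {}".format(parent)


if __name__ == "__main__":
    # Path taken from the error message in this issue; relative paths are
    # resolved against the directory the script is run from.
    print(describe_checkpoint_path("./data/weneed/mask_r50/epoch_12.pth"))
```

Run it from the repository root, since the path in the config is relative. If it reports a missing file or directory, the checkpoint has to be placed at that location (or load_from changed) before training can start.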
Hello, I can't download the ./data/download_models/faster_rcnn_r50_fpn_2x_20181010-443129e1.pth file right now. May I ask whether you have already downloaded it?
I followed the steps and tried to train with 2 GPUs and got this error.
./tools/dist_train.sh ./configs/bags/gs_mask_rcnn_r50_fpn_1x_lvis.py 2
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[07/09 11:22:20] root WARNING: The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
Train fc_cls only.
--Dist-train--IS:False--ISout:False
Dist-train --- Not using image sampling.
Train fc_cls only.
--Dist-train--IS:False--ISout:False
Dist-train --- Not using image sampling.
[07/09 11:22:39] mmcv.runner.runner INFO: load checkpoint from ./data/weneed/mask_r50/epoch_12.pth
Traceback (most recent call last):
File "./tools/train.py", line 169, in <module>
main()
File "./tools/train.py", line 165, in main
logger=logger)
File "/home/sy/Desktop/xlhuang/BalancedGroupSoftmax-master/mmdet/apis/train.py", line 58, in train_detector
_dist_train(model, dataset, cfg, validate=validate)
File "/home/sy/Desktop/xlhuang/BalancedGroupSoftmax-master/mmdet/apis/train.py", line 204, in _dist_train
runner.load_checkpoint(cfg.load_from)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/runner.py", line 234, in load_checkpoint
self.logger)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 171, in load_checkpoint
raise IOError('{} is not a checkpoint file'.format(filename))
OSError: ./data/weneed/mask_r50/epoch_12.pth is not a checkpoint file
[07/09 11:22:39] mmcv.runner.runner INFO: load checkpoint from ./data/weneed/mask_r50/epoch_12.pth
Traceback (most recent call last):
File "./tools/train.py", line 169, in <module>
main()
File "./tools/train.py", line 165, in main
logger=logger)
File "/home/sy/Desktop/xlhuang/BalancedGroupSoftmax-master/mmdet/apis/train.py", line 58, in train_detector
_dist_train(model, dataset, cfg, validate=validate)
File "/home/sy/Desktop/xlhuang/BalancedGroupSoftmax-master/mmdet/apis/train.py", line 204, in _dist_train
runner.load_checkpoint(cfg.load_from)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/runner.py", line 234, in load_checkpoint
self.logger)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 171, in load_checkpoint
raise IOError('{} is not a checkpoint file'.format(filename))
OSError: ./data/weneed/mask_r50/epoch_12.pth is not a checkpoint file
Traceback (most recent call last):
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/torch/distributed/launch.py", line 253, in <module>
main()
File "/home/sy/anaconda3/envs/mmdet/lib/python3.7/site-packages/torch/distributed/launch.py", line 249, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/sy/anaconda3/envs/mmdet/bin/python', '-u', './tools/train.py', '--local_rank=1', './configs/bags/gs_mask_rcnn_r50_fpn_1x_lvis.py', '--launcher', 'pytorch']' returned non-zero exit status 1.