
Unable to train IMCTOD on Tiny-DOTA with a ResNet18 backbone or IMCTOD with any backbone on any other dataset. #12

mburges-cvl opened this issue Jul 2, 2024 · 0 comments

Hi,

Basically, I am able to train the standard IMCTOD model with a ResNet50 backbone on Tiny-DOTA, but when I change the backbone to ResNet18 it no longer converges: rbbox_acc_userinput stays at 0.0000 and never changes. Additionally, if I try to train IMCTOD on any other dataset, such as AITOD or SarDet-100K, it either fails to converge in the same way or the loss explodes, as in the log below:


2024-07-02 13:02:28,763 - INFO - workflow: [('train', 1), ('val', 1)], max: 24 epochs
2024-07-02 13:03:03,498 - INFO - Epoch [1][50/11214]    lr: 0.00399, eta: 2 days, 3:31:01, time: 0.689, data_time: 0.124, memory: 2709, loss_rpn_cls: 1.3331, loss_rpn_bbox: 0.9505, rbbox_loss_cls: 143.2558, rbbox_acc: 70.9961, rbbox_loss_bbox: 5.2747, rbbox_loss_userinput: 17.5812, rbbox_acc_userinput: 19.2424, loss: 168.3954
2024-07-02 13:03:32,597 - INFO - Epoch [1][100/11214]   lr: 0.00465, eta: 1 day, 23:29:58, time: 0.582, data_time: 0.038, memory: 2709, loss_rpn_cls: 0.3269, loss_rpn_bbox: 0.0377, rbbox_loss_cls: 1.7075, rbbox_acc: 78.6953, rbbox_loss_bbox: 0.0394, rbbox_loss_userinput: 2.7982, rbbox_acc_userinput: 18.2667, loss: 4.9097
2024-07-02 13:04:02,379 - INFO - Epoch [1][150/11214]   lr: 0.00532, eta: 1 day, 22:29:43, time: 0.596, data_time: 0.048, memory: 2709, loss_rpn_cls: 0.1413, loss_rpn_bbox: 0.0271, rbbox_loss_cls: 1.8033, rbbox_acc: 79.6914, rbbox_loss_bbox: 0.0146, rbbox_loss_userinput: 3.1509, rbbox_acc_userinput: 21.1282, loss: 5.1371
2024-07-02 13:04:32,127 - INFO - Epoch [1][200/11214]   lr: 0.00599, eta: 1 day, 21:58:36, time: 0.595, data_time: 0.040, memory: 2709, loss_rpn_cls: 0.2209, loss_rpn_bbox: 0.0359, rbbox_loss_cls: 1.0497, rbbox_acc: 79.4766, rbbox_loss_bbox: 0.0130, rbbox_loss_userinput: 0.8200, rbbox_acc_userinput: 29.4737, loss: 2.1396
2024-07-02 13:05:08,506 - INFO - Epoch [1][250/11214]   lr: 0.00665, eta: 1 day, 23:38:34, time: 0.728, data_time: 0.126, memory: 2709, loss_rpn_cls: 0.3552, loss_rpn_bbox: 0.0822, rbbox_loss_cls: 2.5401, rbbox_acc: 76.2852, rbbox_loss_bbox: 0.0130, rbbox_loss_userinput: 4.9793, rbbox_acc_userinput: 17.6944, loss: 7.9697
2024-07-02 13:05:41,096 - INFO - Epoch [1][300/11214]   lr: 0.00732, eta: 1 day, 23:48:26, time: 0.652, data_time: 0.070, memory: 2709, loss_rpn_cls: 0.2526, loss_rpn_bbox: 0.0529, rbbox_loss_cls: 3.2947, rbbox_acc: 77.8984, rbbox_loss_bbox: 0.0234, rbbox_loss_userinput: 1.1427, rbbox_acc_userinput: 32.1300, loss: 4.7664
2024-07-02 13:06:16,800 - INFO - Epoch [1][350/11214]   lr: 0.00799, eta: 2 days, 0:35:11, time: 0.714, data_time: 0.084, memory: 2844, loss_rpn_cls: 0.3277, loss_rpn_bbox: 0.0708, rbbox_loss_cls: 1.8398, rbbox_acc: 81.4570, rbbox_loss_bbox: 0.0351, rbbox_loss_userinput: 0.6952, rbbox_acc_userinput: 49.5912, loss: 2.9686
2024-07-02 13:06:49,340 - INFO - Epoch [1][400/11214]   lr: 0.00865, eta: 2 days, 0:34:40, time: 0.651, data_time: 0.076, memory: 2844, loss_rpn_cls: 0.4251, loss_rpn_bbox: 0.1549, rbbox_loss_cls: 96.6881, rbbox_acc: 74.9141, rbbox_loss_bbox: 0.2146, rbbox_loss_userinput: 9.3517, rbbox_acc_userinput: 21.6746, loss: 106.8345
2024-07-02 13:07:22,805 - INFO - Epoch [1][450/11214]   lr: 0.00932, eta: 2 days, 0:43:21, time: 0.669, data_time: 0.074, memory: 2844, loss_rpn_cls: 2.7300, loss_rpn_bbox: 10.9257, rbbox_loss_cls: 9052.8327, rbbox_acc: 63.7472, rbbox_loss_bbox: 875.6252, rbbox_loss_userinput: 148.5355, rbbox_acc_userinput: 19.5682, loss: 10090.6489
2024-07-02 13:07:55,148 - INFO - Epoch [1][500/11214]   lr: 0.00999, eta: 2 days, 0:40:08, time: 0.647, data_time: 0.053, memory: 2844, loss_rpn_cls: 36.9011, loss_rpn_bbox: 17.0154, rbbox_loss_cls: 2973.0151, rbbox_acc: 70.5039, rbbox_loss_bbox: 41.2835, rbbox_loss_userinput: 134.0229, rbbox_acc_userinput: 8.6407, loss: 3202.2379
2024-07-02 13:08:26,255 - INFO - Epoch [1][550/11214]   lr: 0.01000, eta: 2 days, 0:27:21, time: 0.622, data_time: 0.048, memory: 2844, loss_rpn_cls: 1.7885, loss_rpn_bbox: 0.1553, rbbox_loss_cls: 615.6932, rbbox_acc: 59.4062, rbbox_loss_bbox: 2.6073, rbbox_loss_userinput: 18.7153, rbbox_acc_userinput: 16.8000, loss: 638.9595
2024-07-02 13:08:58,828 - INFO - Epoch [1][600/11214]   lr: 0.01000, eta: 2 days, 0:27:33, time: 0.651, data_time: 0.094, memory: 2844, loss_rpn_cls: 2.9955, loss_rpn_bbox: 1.8006, rbbox_loss_cls: 10.8312, rbbox_acc: 66.9727, rbbox_loss_bbox: 0.6435, rbbox_loss_userinput: 8.8099, rbbox_acc_userinput: 28.3333, loss: 25.0808
2024-07-02 13:09:36,414 - INFO - Epoch [1][650/11214]   lr: 0.01000, eta: 2 days, 1:02:08, time: 0.752, data_time: 0.156, memory: 2847, loss_rpn_cls: 274.2519, loss_rpn_bbox: 1321.1128, rbbox_loss_cls: 22386.9440, rbbox_acc: 69.0234, rbbox_loss_bbox: 469.6392, rbbox_loss_userinput: 17.5164, rbbox_acc_userinput: 14.8889, loss: 24469.4650
2024-07-02 13:10:09,988 - INFO - Epoch [1][700/11214]   lr: 0.01000, eta: 2 days, 1:06:03, time: 0.671, data_time: 0.081, memory: 2847, loss_rpn_cls: 8759579.1139, loss_rpn_bbox: 7068485.2591, rbbox_loss_cls: 8792033.1225, rbbox_acc: 65.0234, rbbox_loss_bbox: 1370094.8281, rbbox_loss_userinput: 372.0313, rbbox_acc_userinput: 6.0000, loss: 25990565.5665
2024-07-02 13:10:42,825 - INFO - Epoch [1][750/11214]   lr: 0.01000, eta: 2 days, 1:04:59, time: 0.657, data_time: 0.078, memory: 2896, loss_rpn_cls: 514176.6596, loss_rpn_bbox: 123428.3289, rbbox_loss_cls: 7509783.6664, rbbox_acc: 82.6562, rbbox_loss_bbox: 4102617.4785, rbbox_loss_userinput: 31.4382, rbbox_acc_userinput: 0.0000, loss: 12250037.5878

Overall, except for IMCTOD based on Faster R-CNN with a ResNet50 backbone on Tiny-DOTA, the model does not learn anything. Do you have any idea why that is the case?
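To make the setup concrete, here is roughly what I mean by "changing the backbone", written as a minimal mmdetection-style config sketch. I am assuming the IMCTOD configs follow the usual mmdetection conventions; the field names below are my assumptions and are not copied from the repo:

# Sketch only; field names are assumed, not taken verbatim from the IMCTOD configs.
# The ResNet50 baseline that trains fine on Tiny-DOTA uses depth=50 and
# FPN in_channels=[256, 512, 1024, 2048]; for the ResNet18 run I change both:
backbone = dict(
    type='ResNet',
    depth=18,                 # 50 in the working baseline
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    frozen_stages=1,
    style='pytorch')
neck = dict(
    type='FPN',
    # ResNet18 is built from BasicBlocks, so its stage outputs have
    # 64/128/256/512 channels rather than ResNet50's 256/512/1024/2048
    in_channels=[64, 128, 256, 512],
    out_channels=256,
    num_outs=5)

Given the loss spikes in the log above, I could also try gradient clipping in the optimizer config (again mmdetection-style and untested on my side), e.g. optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)), but I would first like to understand why only the ResNet50 + Tiny-DOTA combination trains stably.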
