- auto-encoder + anomaly detection
- Ensemble
- Use domain knowledge
- Long-tail classification
- metric learning
- self-labeling
- Self-supervised learning
- Early stopping
- construct a binary classifier for (1) classes 11 & 13 and (2) all others (V), as sketched below
- construct a binary classifier separating class 11 from class 13 (V)
- train the rest of the classes
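A minimal sketch of how the first binary stage could be set up; the wrapper class and the base dataset's (image, label) interface are assumptions, not the repo's actual code:

```python
import torch
from torch.utils.data import Dataset

class BinaryRelabelDataset(Dataset):
    """Hypothetical wrapper: maps classes 11 & 13 to label 1 and all other
    classes to label 0, so a standard classifier becomes the binary stage."""
    def __init__(self, base_dataset, positive_classes=(11, 13)):
        self.base = base_dataset
        self.positive = set(positive_classes)

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, label = self.base[idx]
        return img, torch.tensor(1 if int(label) in self.positive else 0)
```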
-
CNNClassifier1113_trial_6 :::info resnet152 epoch 10 82.10 % 204.86 sec :::
CNNClassifier1113_trial_7 :::info resnet152 epoch 15 87.04 % 314.89 sec :::
CNNClassifier1113_trial_8 :::info resnet152 epoch 15 optim Adam 80.86 % 312.98 sec :::
CNNClassifier1113_trial_9 :::info resnet152 epoch 20 87.96 % 415.64 sec :::
-
CNNClassifier1113_trial_1 :::info 1 epoch resnet18 optim SGD StepLR 89.51 % (low minority-class accuracy) 20.16 sec :::
CNNClassifier1113_trial_2 :::info 8 epoch resnet18 optim SGD StepLR 80.56 % 141.13 sec :::
CNNClassifier1113_trial_4 :::info 8 epoch resnet18 optim SGD ReduceLROnPlateau 79.01 % :::
-
CNNClassifier_trial_17 (V) :::info 8 epoch resnet18 optim SGD StepLR 97.53 % 952.68 sec :::
-
Using freeze-unfreeze (this method does not work very well on small models; sketch below)
CNNClassifier_trial_18 freeze :::info 5 epoch resnet18 optim SGD StepLR 92.44 % 597.40 sec :::
CNNClassifier_trial_19 unfreeze :::info 8 epoch resnet18 optim SGD StepLR 97.45 % 949.98 sec :::
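The freeze-unfreeze recipe in these trials presumably follows the standard two-stage pattern; a sketch with torchvision's resnet18, where which layers were actually frozen is an assumption:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)   # new head for the binary task

# Stage 1 (freeze): only the new head is trainable.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")

# ... train a few epochs, then ...

# Stage 2 (unfreeze): make everything trainable and fine-tune;
# the optimizer should be re-created over all parameters with a smaller lr.
for param in model.parameters():
    param.requires_grad = True
```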
ratio of classes 11 & 13 vs. the others is 1:4 -> ~80 % majority baseline
-
Use k-means on the feature maps to separate classes 11 & 13 from the others. :::info 74.94 % :::
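A sketch of the k-means separation check, assuming flattened per-image feature vectors; the file names and the binary-label array are placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans

# features: one flattened feature vector per image, shape (N, D);
# both file names below are assumptions based on the files listed later.
features = np.load("oneDfea_combine_True.npy")
labels_bin = np.load("labels_bin.npy")   # hypothetical: 1 for classes 11 & 13, else 0

cluster = KMeans(n_clusters=2, random_state=0).fit_predict(features)
# cluster ids are arbitrary, so score both possible assignments
acc = max(np.mean(cluster == labels_bin), np.mean(cluster != labels_bin))
print(f"k-means separation accuracy: {acc:.2%}")
```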
-
Using a CNN model
CNNClassifier_trial_3 :::info 1 epoch resnet18 85.51 % 120.92 sec :::
CNNClassifier_trial_4 :::info 4 epoch resnet18 93.03 % 471.69 sec :::
-
Using weighted loss (sketched below)
CNNClassifier_trial_6 :::info 1 epoch resnet18 91.66 % 119.36 sec :::
CNNClassifier_trial_10 :::info 2 epoch resnet152 85.12 % 741.58 sec :::
CNNClassifier_trial_12 :::info 5 epoch resnet18 optim SGD 96.00 % 593.48 sec :::
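"Weighted loss" here is presumably a class-weighted cross-entropy; a minimal sketch, where the inverse-frequency weighting scheme and the counts are assumptions:

```python
import torch
import torch.nn as nn

# class_counts would come from the training set; the values here are placeholders.
class_counts = torch.tensor([120., 80., 300.])
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)   # rarer classes get larger weight
```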
Add model.train() and model.eval() in train.py (see the sketch below).
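For reference, the pattern this refers to: model.train() enables batch-norm updates and dropout during training, and model.eval() switches them off for validation. The loop below is a self-contained toy sketch, not the repo's exact code:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))   # toy stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
train_loader = DataLoader(TensorDataset(torch.randn(32, 3, 8, 8),
                                        torch.randint(0, 2, (32,))), batch_size=8)
val_loader = train_loader   # stand-in for a real val split

for epoch in range(2):
    model.train()                    # batch norm / dropout in training mode
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

    model.eval()                     # running batch-norm stats, no dropout
    with torch.no_grad():
        correct = sum((model(x).argmax(1) == y).sum().item() for x, y in val_loader)
```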
Use trial 68 as the frozen starting point
trial 69
unfreeze
:::info
epoch 20
91.34 %
2621.9 sec
:::
Re-train trial 64 (73.56 %) with fine-tuning
trial 68
freeze
:::info
epoch 6
--lr_name 'StepLR' --optim Adam
73.87 %
:::
Re-train trial 65 (88.84 %) with fine-tuning (reload sketch below)
trial 66 :::info epoch 6 --lr_name 'StepLR' --optim SGD 89.15 % 787.5 sec :::
trial 67
:::info
epoch 6
--lr_name 'StepLR' --optim Adam
89.89 %
790.0 sec
:::
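These re-training trials presumably reload the earlier checkpoint and continue with a fresh optimizer and scheduler; a sketch, where the architecture, checkpoint path, and save format are all assumptions:

```python
import torch
from torchvision import models

model = models.resnet18(num_classes=14)          # same arch as the earlier trial (assumed)
state = torch.load("checkpoints/trial_65.pth")   # hypothetical checkpoint path / format
model.load_state_dict(state)

# fresh optimizer with a small lr for fine-tuning, matching the --optim/--lr_name flags
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)
```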
-
Using the cluster labels
-
- file oneDfea_combine_True contains the merged feature maps of classes 11 and 13
- file oneDfea_lab_combine_True.txt contains the corresponding image names
- file oneDfea_newlab_1113merge contains the new cluster labels after merging classes 11 and 13
- file oneDfea_imgname_1113merge.txt contains the corresponding image names
- file oneDfea_train_label36 contains the continuous labels

trial 64 freeze :::info epoch 10 73.56 % 1189.4 sec :::
trial 65 unfreeze :::info epoch 15 88.84 % 1949.2 sec :::
-
Train the frozen model (trial 56) further
trial 59 freeze :::info epoch 5 76.18 % overfitting at the last epoch :::
trial 60 freeze :::info StepLR epoch 5 76.18 % ::: training longer gives only a minor improvement
Clustering the images should improve the performance.
- It is not a good idea to use an FC layer in the middle of the auto-encoder (auto_encoder_trial_15)
- Changing the feature count in the default auto-encoder model (trial 16, 5 epochs)
- check the loss threshold of class 10 (cannot obtain it with this model)
- Refine the model: add batch normalization (trial 17)
- TODO: Postpone for now
-
Merge classes 11 and 13 into a single class (remapping sketched below).
- use class labels in the range 0–13; the original classes 11 & 13 merge into class 11
- class 14 becomes class 13
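The remapping is a small lookup; a sketch (the function form is an assumption, the mapping itself is as stated above):

```python
# Merge classes 11 & 13 into class 11 and shift class 14 down to 13,
# so labels stay contiguous in the range 0..13.
def remap_label(label: int) -> int:
    if label == 13:
        return 11
    if label == 14:
        return 13
    return label

assert [remap_label(l) for l in (10, 11, 13, 14)] == [10, 11, 11, 13]
```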
-
Discard the cluster classes
- change the model output size
- change the dataset labels
- change the val loop in train.py
trial 56 freeze :::info epoch 10 75.01 % 1177.3 sec :::
trial 57 unfreeze :::info epoch 15 89.31 % 1939.0 sec :::
trial 58 re-train trial 57 :::info StepLR, SGD 5 epoch 89.54 % :::
Class 13 is the worst in trial 47; try adding the cluster method to class 13 -> probably not, the cluster separation is very bad.
auto-encoder
- Default auto-encoder model (v)
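What the "default" auto-encoder might look like as a minimal convolutional sketch; the channel sizes are assumptions, trial 17's refinement would add nn.BatchNorm2d after each conv, and the per-image reconstruction loss would serve as the anomaly score for the class-10 threshold idea above:

```python
import torch.nn as nn

class ConvAutoEncoder(nn.Module):
    """Minimal conv auto-encoder sketch; per-image reconstruction loss
    could act as the anomaly score for the loss-threshold check."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```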
-
Use se_resnext101_32x4d model trial 45 :::info epoch 1 59.42 % 690.6 sec :::
-
Using freeze-unfreeze trial 46 freeze :::info epoch 10 69.68 % 1182.7 sec val loss shows signs of overfitting from epoch 7 :::
trial 47 unfreeze :::info epoch 15 90.21 % 1968.7 sec performance increases while lr is 0.0001 :::
trial 48 unfreeze and re-train trial 47 with a small lr and the SGD optimizer :::info epoch 5 90.80 % 659.4 sec :::
trial 49 unfreeze and re-train trial 47 with a small lr (StepLR, step_size=2, gamma=0.1) and the SGD optimizer :::info epoch 5 90.80 % 651.5 sec :::
----------- Corrected accuracies updated up to here -----------
- re-run trial 43 with SGD+momentum trial 43 :::info epoch 5 82.8 % 635.6 sec the overfitting sign is gone :::
- re-run trial 43 with SGD+momentum trial 44 :::info epoch 10 81.32 % no improvement :::
!!!!!!!!!! IMPORTANT !!!!!!!!!! You have not used augmentation on the train set (V)
trial 42
freeze
:::info
epoch 15, lr fixed
60 %
1770.4 sec
:::
trial 43 (file got wiped @@)
unfreeze
:::info
epoch 20
82.6 %
sign of overfitting around epoch 13
:::
Using the freeze-unfreeze method
trial 39
freeze
:::info
epoch 15
66.27 %
1670.6 sec
:::
trial 40
unfreeze and ReduceLROnPlateau with patience=4
:::info
epoch 20
81.0 %
2508.8 sec
sign of overfitting
:::
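Trial 40's scheduler, as a sketch; patience=4 comes from the log, while mode and factor are PyTorch defaults written out explicitly:

```python
import torch

params = [torch.nn.Parameter(torch.zeros(1))]   # stand-in for model.parameters()
optimizer = torch.optim.SGD(params, lr=0.01)

scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min",
                                                       factor=0.1, patience=4)

val_loss = 1.0
scheduler.step(val_loss)   # unlike StepLR, step with the monitored metric
```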
- cluster2target fixed
- val loss bug fixed
SGD -> Adam, default parameters trial 37 :::info epoch 5 67.02 % 406.9 sec :::
trial 38
:::info
epoch 15
75 %
1897.9 sec
at epoch 6, sign of overfitting
:::
k_numbers = [3,3,2,4,4,4,2,2,3,2,0,4,0,0,0]
- 37 subclasses :::info 3 epochs 26.87 % :::
- SGD -> Adam, default parameters :::info 3 epochs 27.50 % ::: :::info 15 epochs: train loss is underfitting, val loss is overfitting; something is wrong @@ check the labels!! :::
- features from trial 23
- Feature separation is not good.
- k-means does not separate the classes well.
- features from ImageNet (V)
- well separated!
- k-means can produce good separations
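Feature extraction with an ImageNet-pretrained backbone followed by k-means, roughly as described; the backbone choice, input size, and k value are assumptions (per class, k presumably comes from the k_numbers list above):

```python
import torch
from torchvision import models
from sklearn.cluster import KMeans

backbone = models.resnet18(pretrained=True)   # ImageNet weights; actual backbone is an assumption
backbone.fc = torch.nn.Identity()             # drop the head: outputs 512-d features
backbone.eval()

with torch.no_grad():
    feats = backbone(torch.randn(8, 3, 224, 224))   # stand-in for one class's images

# split one class into k sub-classes; k would come from k_numbers for that class
sub_labels = KMeans(n_clusters=3, random_state=0).fit_predict(feats.numpy())
```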
- file oneDfea_train_metal_trained_False contains the feature maps
- file oneDfea_lab_train_metal_trained_False contains their corresponding labels

model_name changed to se_resnet152 as the default; batch_size 16 as the default
- Using the cluster dataset
- Directly use the 66 subclasses :::info --epoch 10 --model_name 'se_resnet152' --batch_size 16 50.08 % :::
-
Using a class-distribution-preserving train/val/test split
-
Pretrain trial: Freeze :::info --epoch 15 --model_name 'se_resnet152' --batch_size 16 64.1 % 1645.0 sec :::
Unfreeze :::info --epoch 20 --model_name 'se_resnet152' --batch_size 16 81.36 % 3100.4 sec :::
- Freeze the pretrained model's parameters
-
freeze the pretrained model's parameters, using the default lr :::info trial 18: lr = 0.01 --epoch 10 --model_name 'se_resnet152' --batch_size 16 59.07 % 1120.5 sec :::
-
Load the previous model checkpoint, then unfreeze the model and lower the lr. :::info trial 19: lr = 0.005 --epoch 10 --model_name 'se_resnet152' --batch_size 16 87.5 % 1258.5 sec :::
-
- CrossEntropy -> WeightFocalLoss (sketched below) :::info 'se_resnet50' batch_size 32 epoch 12 30.1 % :::
- StepLR(step_size=3, gamma=0.1)
- model_name 'se_resnet50' :::info batch_size 32 epoch 12 72.2 % 1374.3 sec :::
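WeightFocalLoss is this project's own class; a common weighted focal-loss formulation it plausibly matches (the gamma default and mean reduction are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFocalLossSketch(nn.Module):
    """FL = alpha_t * (1 - p_t)^gamma * CE; gamma=2.0 is an assumed default."""
    def __init__(self, alpha, gamma=2.0):
        super().__init__()
        self.alpha, self.gamma = alpha, gamma   # alpha: per-class weight tensor

    def forward(self, logits, targets):
        ce = F.cross_entropy(logits, targets, reduction="none")  # -log p_t, unweighted
        pt = torch.exp(-ce)                                      # p_t of the true class
        alpha_t = self.alpha.to(logits.device)[targets]
        return (alpha_t * (1 - pt) ** self.gamma * ce).mean()
```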
- StepLR -> ReduceLROnPlateau
- solving batch_size issue
- model_name 'efficientnet-b7' :::info batch_size = 5 epoch 1 12.14 % 246.9 sec :::
- StepLR -> ReduceLROnPlateau
- solving batch_size issue
- model_name 'se_resnet152' :::info batch_size = 16 epoch 10 77.67 % 1276.7 sec :::
Note that the modifications below are cumulative (collected in the sketch after the list).
- optimizer: add momentum=0.9, nesterov=True :::info 1 epoch: 45.4 % :::
- learning rate: LambdaLR -> StepLR(step_size=1, gamma=0.1)
- optimizer: add weight_decay=0.01 :::info Trial 7 1 epoch: 42.1 % 2 epoch: 47.4 % 3011.4 sec :::
- rewrite dataset
- num_workers
- pin_memory :::info 1/8 of the time saved :::
- learning rate: LambdaLR -> StepLR(step_size=1, gamma=0.1)
- optimizer: add weight_decay=0.01 :::info Trial 7 20 epoch: 48 % 2375.6 sec :::
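Collected, the cumulative modifications amount to roughly this configuration; the model, dataset, and num_workers value are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# stand-in dataset; the rewritten dataset class is not reproduced here
dataset = TensorDataset(torch.randn(64, 3, 224, 224), torch.randint(0, 15, (64,)))
loader = DataLoader(dataset, batch_size=16, shuffle=True,
                    num_workers=4, pin_memory=True)   # the DataLoader speed-ups

model = torch.nn.Linear(10, 15)   # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, nesterov=True, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.1)
```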
In the T2H folder, the images T1H_Q1558-1090326175042092.jpg, T1H_Q1558-1090326175049966.jpg, and T1H_Q1558-1090326175101066.jpg belong to T1H;
I moved them to the T1H folder. :::info accuracy baseline: 56.7 % (trial_3) :::
The model file is saved as trial_3.