
Performance on imagenet100 and imagenet1k #1

Closed
cffan opened this issue Jan 29, 2020 · 23 comments
Labels
good first issue Good for newcomers

Comments

cffan commented Jan 29, 2020

Have you tried your implementation on the imagenet100 dataset? I'm getting an accuracy of around 69.0 with the default config (8 GPUs, lr 0.03, bs 256), which is lower than the MoCo implementation in the CMC repo.

bl0 (Owner) commented Jan 30, 2020

Hi, for imagenet100 we got an accuracy of 73+ (on par with the CMC repo) with the default config except for batch size = 128, which matches the CMC repo.

cffan (Author) commented Jan 30, 2020

Could you share the config you used to get 73+? Thanks!

bl0 (Owner) commented Jan 30, 2020

You can just set the batch size per GPU to 128/ngpu, e.g., 16 if you use 8 GPUs.

bl0 (Owner) commented Jan 30, 2020

BTW, I also tried batch size = 256; the accuracy is 70.540, which is likewise lower than the MoCo implementation in the CMC repo.
But with the same config except for batch size = 128, the accuracy is 73+.

cffan (Author) commented Feb 3, 2020

I tried batch size 128 and got a result around 72.3, which is better than before but still slightly worse than your results. Just to make sure I got everything right, here are my commands:

python -m torch.distributed.launch --nproc_per_node=8 \
    train.py \
    --batch-size 16 \
    --exp-name exp_name\
    --data-root data_folder

python -m torch.distributed.launch --nproc_per_node=4 \
    eval.py \
    --exp-name exp_name \
    --model-path output/exp_name/current.pth \
    --batch-size 64 \
    --data-root data_folder

And I'm running PyTorch 1.4.0 and torchvision 0.5.0.

I think the author mentioned that using alpha=0.99 is slightly better than 0.999. Did you notice the same thing?

bl0 (Owner) commented Feb 4, 2020

Hi, sorry for the inconvenience. I eventually found the full config we used, which shows that we use alpha=0.99 instead of 0.999, as the CMC author suggested.

Pre-training:

alpha=0.99, amp=False, aug='CJ', batch_size=32, beta1=0.5, beta2=0.999,
crop=0.2, data_folder='./data/imagenet100', dataset='imagenet100', epochs=240,
exp_name='MoCo/ddp/k_all-bs_128-all_shuffle_bn', learning_rate=0.03,
local_rank=0, lr_decay_epochs=[120, 160, 200], lr_decay_rate=0.1, moco=True,
model='resnet50',
model_folder='./output/imagenet100/MoCo/ddp/k_all-bs_128-all_shuffle_bn//models',
momentum=0.9, nce_k=16384, nce_m=0.5, nce_t=0.07, num_workers=4,
opt_level='O2', print_freq=10, resume='', save_freq=10, softmax=True,
start_epoch=1,
tb_folder='./output/imagenet100/MoCo/ddp/k_all-bs_128-all_shuffle_bn//tensorboard',
tb_freq=500, warm=False, weight_decay=0.0001

Finetuning:

adam=False, amp=False, aug='CJ', batch_size=256, beta1=0.5, beta2=0.999,
bn=False, cosine=False, crop=0.2, data_folder='./data', dataset='imagenet100',
epochs=60, exp_name='MoCo/ddp/k_all-bs_128-all_shuffle_bn', layer=6,
learning_rate=10.0, lr_decay_epochs=[30, 40, 50], lr_decay_rate=0.2,
model='resnet50',
model_path='./output/imagenet100/MoCo/ddp/k_all-bs_128-all_shuffle_bn/models/current.pth',
model_width=1, momentum=0.9, n_label=100, num_workers=24, opt_level='O2',
print_freq=10, resume='',
save_folder='./output/imagenet100/MoCo/ddp/k_all-bs_128-all_shuffle_bn//linear_models',
save_freq=5, start_epoch=1, syncBN=False,
tb_folder='./output/imagenet100/MoCo/ddp/k_all-bs_128-all_shuffle_bn//linear_tensorboard',
tb_freq=500, warm=False, weight_decay=0
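For context, alpha here is the EMA momentum used to update MoCo's key encoder from the query encoder. A minimal sketch of that update (the function and variable names are illustrative, not the repo's actual code):

import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, alpha=0.99):
    # EMA update: the key encoder slowly tracks the query encoder.
    # A smaller alpha (0.99 vs 0.999) lets the key encoder adapt faster.
    for q_param, k_param in zip(encoder_q.parameters(), encoder_k.parameters()):
        k_param.data.mul_(alpha).add_(q_param.data, alpha=1.0 - alpha)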

The figures of ins_loss and test_acc are also attached for reference.

cffan (Author) commented Feb 5, 2020

Thanks, I'll try these configs.

Could you also share the configs to reproduce the results on imagenet1k?

bl0 (Owner) commented Feb 5, 2020

Hi, I have updated the README and added the pre-trained model. You can get the full configs from the checkpoints like this:

import torch

# Load on CPU so no GPU is needed just to inspect the checkpoint.
ckpt = torch.load('model.pth', map_location='cpu')
print(ckpt['opt'])  # the saved training config

BTW, the figures of ins_loss and test_acc for imagenet1k are also attached for reference.


@bl0 bl0 added the good first issue Good for newcomers label Feb 5, 2020
@bl0 bl0 changed the title Performance on imagenet100 Performance on imagenet100 and imagenet1k Feb 5, 2020
cffan (Author) commented Feb 5, 2020

Thanks! I reproduced 73+ results on imagenet100. Will try to run on imagenet1k.

bl0 (Owner) commented Feb 7, 2020

I will close this issue. If you have any questions, feel free to reopen it.

@bl0 bl0 closed this as completed Feb 7, 2020
@bl0 bl0 pinned this issue Feb 16, 2020
bl0 (Owner) commented Mar 1, 2020

BTW, I got Acc@1 78.140% / Acc@5 94.000% on imagenet100 with batch size 512, lr = 0.8, alpha = 0.99, and K = all.

cffan (Author) commented Mar 1, 2020 via email

bl0 (Owner) commented Mar 1, 2020

Yes.

cffan (Author) commented Mar 5, 2020

I tried your large batch size settings and got Acc@1 around 76. Did you change anything for evaluation? How many GPUs did you use?

@bl0 bl0 unpinned this issue Mar 5, 2020
@bl0 bl0 pinned this issue Mar 5, 2020
bl0 (Owner) commented Mar 5, 2020

Hi, sorry for the misleading message. Actually, the 78+ result was obtained with my internal version, which uses cosine learning rate decay with 5 epochs of warmup.
I will release these modifications as soon as possible.
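For reference, a minimal sketch of such a schedule; the function is illustrative rather than the repo's actual implementation (base_lr=0.4 is taken from the script name mentioned below, total_epochs=240 from the config above):

import math

def lr_at_epoch(epoch, base_lr=0.4, warmup_epochs=5, total_epochs=240):
    # Linear warmup up to base_lr over the first warmup_epochs.
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))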

bl0 (Owner) commented Mar 6, 2020

Hi, I have updated the code and provided a script, scripts/train_eval_imagenet100_baseLR0.4_alpha0.99_crop0.08_k1281166_t0.1_AMPO1.sh, to reproduce the performance.

BTW, today I merged a lot of updates from my internal version, such as the warmup LR scheduler, a logger, and AMP support.

bl0 (Owner) commented Mar 6, 2020

FYI, I have uploaded to OneDrive the checkpoint pretrained on imagenet100 that achieves 78+.

cffan (Author) commented Mar 6, 2020

Is this a typo? The k1281166 in the script name looks like the dataset size of imagenet1k.

bl0 (Owner) commented Mar 6, 2020

Fixed. Thanks for your help.

cffan (Author) commented Mar 7, 2020

Have you tried similar large batch size settings on imagenet1k?

bl0 (Owner) commented Mar 7, 2020

Actually, the key is the large base learning rate.
I use the large batch size just because I use 8 GPUs and don't want the per-GPU batch size to be too small, which may be inefficient. The consequence of the large batch size is linear learning rate scaling and warmup.

For imagenet1k, I also use a large batch size with linear learning rate scaling and warmup, but I have not tuned the base learning rate.
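For concreteness, a sketch of the usual linear scaling rule; the 256-sample reference batch is the common convention (an assumption here, not confirmed by the repo), and the example values just reproduce the numbers quoted earlier in this thread:

def scaled_lr(base_lr, batch_size, reference_batch=256):
    # Linear LR scaling: the learning rate grows in proportion to batch size.
    return base_lr * batch_size / reference_batch

print(scaled_lr(0.4, 512))  # 0.8, matching the batch size 512 / lr 0.8 run above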

bl0 (Owner) commented Mar 7, 2020

From my perspective, imagenet100 is small, so training is not sufficient; that is why the large batch size and small alpha work well.
But for imagenet1k the situation is different, so I don't think a large base learning rate would be much better than the default one.

bastian1209 commented

@bl0
Hi, I have a question about the learning rate for the linear evaluation phase. From the comments above and several other sources, lr=10.0 for ImageNet-100 and lr=30.0 for ImageNet-1K seem to be the standard baselines for MoCo. Does the scale of the CE loss during linear evaluation training seem plausible to you? In my case, the observed CE loss is usually on the order of 1e+2 to 1e+4 while accuracy keeps increasing stably. I would be thankful for your reply!
