Questions about the paper #2
My reproduction results on CIFAR-100 are:
Origin number of parameters: 14761764
=> loading checkpoint 'logs/model_best.pth.tar'
=> loaded checkpoint 'logs/model_best.pth.tar' (epoch 130) Prec1: 0.598200
layer index: 2 total channel: 64 remaining channel: 32
layer index: 4 total channel: 64 remaining channel: 64
layer index: 7 total channel: 128 remaining channel: 128
layer index: 9 total channel: 128 remaining channel: 128
layer index: 12 total channel: 256 remaining channel: 256
layer index: 14 total channel: 256 remaining channel: 256
layer index: 16 total channel: 256 remaining channel: 256
layer index: 19 total channel: 512 remaining channel: 256
layer index: 21 total channel: 512 remaining channel: 256
layer index: 23 total channel: 512 remaining channel: 256
layer index: 26 total channel: 512 remaining channel: 256
layer index: 28 total channel: 512 remaining channel: 256
layer index: 30 total channel: 512 remaining channel: 256
Pre-processing Successful!
[32, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 256, 256, 256, 'M', 256, 256, 256]
New number of parameters: 5279684
Parameter pruning: 0.6423405766411114
In shape: 3, Out shape 32.
In shape: 32, Out shape 64.
In shape: 64, Out shape 128.
In shape: 128, Out shape 128.
In shape: 128, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
vggprune_pruning.py:275: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
data, target = Variable(data, volatile=True), Variable(target)
Test set: Accuracy: 966/10000 (9.660)
Origin model:
Test set: Accuracy: 5982/10000 (59.820)
Pruned model before recover:
Test set: Accuracy: 966/10000 (9.660)
Files already downloaded and verified
Reocver from layer 0 takes 1.4634578227996826s
Reocver from layer 1 takes 2.7621755599975586s
Reocver from layer 2 takes 1.6308107376098633s
Reocver from layer 3 takes 1.6396996974945068s
Reocver from layer 4 takes 0.8461010456085205s
Reocver from layer 5 takes 1.3865971565246582s
Reocver from layer 6 takes 0.8447220325469971s
Reocver from layer 7 takes 0.4805474281311035s
Reocver from layer 8 takes 0.4550657272338867s
Reocver from layer 9 takes 0.4173614978790283s
Reocver from layer 10 takes 0.37091565132141113s
Reocver from layer 11 takes 0.34157323837280273s
Reocver from layer 12 takes 0.3256824016571045s
Pruned model before absorb:
Test set: Accuracy: 598/10000 (5.980)
Total time: 15.025s
CPU time: 9.356s
Pruned model after absorb:
Test set: Accuracy: 598/10000 (5.980)
The commands used were:
CUDA_VISIBLE_DEVICES=0 python main.py --dataset cifar100 --arch vgg --depth 16 --lr 0.01
python vggprune_pruning.py --dataset cifar100 --depth 16 --model logs/model_best.pth.tar --save results --num_sample 500
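As an aside, the UserWarning in the log above comes from the removed `volatile` flag on `Variable`. A minimal sketch of the modern evaluation loop it suggests (the `model`, `test_loader`, and `device` names are placeholders, not the repository's exact code):

```python
import torch

def evaluate(model, test_loader, device="cuda"):
    """Top-1 accuracy without building autograd graphs."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():  # replaces Variable(data, volatile=True)
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            pred = model(data).argmax(dim=1)
            correct += (pred == target).sum().item()
            total += target.size(0)
    return correct / total
```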
Additional results for VGG19 on CIFAR-100:
Origin number of parameters: 20070180
=> loading checkpoint 'logs/model_best.pth.tar'
=> loaded checkpoint 'logs/model_best.pth.tar' (epoch 149) Prec1: 0.640100
layer index: 2 total channel: 64 remaining channel: 32
layer index: 4 total channel: 64 remaining channel: 64
layer index: 7 total channel: 128 remaining channel: 128
layer index: 9 total channel: 128 remaining channel: 128
layer index: 12 total channel: 256 remaining channel: 256
layer index: 14 total channel: 256 remaining channel: 256
layer index: 16 total channel: 256 remaining channel: 256
layer index: 18 total channel: 256 remaining channel: 128
layer index: 21 total channel: 512 remaining channel: 256
layer index: 23 total channel: 512 remaining channel: 256
layer index: 25 total channel: 512 remaining channel: 256
layer index: 27 total channel: 512 remaining channel: 256
layer index: 30 total channel: 512 remaining channel: 256
layer index: 32 total channel: 512 remaining channel: 512
layer index: 34 total channel: 512 remaining channel: 512
layer index: 36 total channel: 512 remaining channel: 512
Pre-processing Successful!
[32, 64, 'M', 128, 128, 'M', 256, 256, 256, 128, 'M', 256, 256, 256, 256, 'M', 256, 512, 512, 512]
New number of parameters: 10613700
Parameter pruning: 0.4711706621465278
In shape: 3, Out shape 32.
In shape: 32, Out shape 64.
In shape: 64, Out shape 128.
In shape: 128, Out shape 128.
In shape: 128, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 128.
In shape: 128, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 256.
In shape: 256, Out shape 512.
In shape: 512, Out shape 512.
In shape: 512, Out shape 512.
vggprune_pruning.py:275: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
data, target = Variable(data, volatile=True), Variable(target)
Test set: Accuracy: 1034/10000 (10.340)
Origin model:
Test set: Accuracy: 6401/10000 (64.010)
Pruned model before recover:
Test set: Accuracy: 1034/10000 (10.340)
Files already downloaded and verified
Reocver from layer 0 takes 1.455902338027954s
Reocver from layer 1 takes 2.7735753059387207s
Reocver from layer 2 takes 1.6602439880371094s
Reocver from layer 3 takes 1.6129875183105469s
Reocver from layer 4 takes 0.83402419090271s
Reocver from layer 5 takes 0.872891902923584s
Reocver from layer 6 takes 0.8671615123748779s
Reocver from layer 7 takes 0.49571967124938965s
Reocver from layer 8 takes 0.4670424461364746s
Reocver from layer 9 takes 0.4791452884674072s
Reocver from layer 10 takes 0.48626708984375s
Reocver from layer 11 takes 0.4427921772003174s
Reocver from layer 12 takes 0.3528428077697754s
Reocver from layer 13 takes 0.5361449718475342s
Reocver from layer 14 takes 0.5488338470458984s
Reocver from layer 15 takes 0.732248067855835s
Pruned model before absorb:
Test set: Accuracy: 1863/10000 (18.630)
Total time: 16.725s
CPU time: 10.112s
Pruned model after absorb:
Test set: Accuracy: 1864/10000 (18.640)
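For reference, the reported "Parameter pruning" values are just the fraction of parameters removed, which can be checked from the counts in the two logs above (a quick sanity check, not code from the repository):

```python
# Fraction of parameters removed: 1 - pruned / original
vgg16_ratio = 1 - 5279684 / 14761764   # ~0.6423, matches the VGG16 log
vgg19_ratio = 1 - 10613700 / 20070180  # ~0.4712, matches the VGG19 log
print(vgg16_ratio, vgg19_ratio)
```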
I have a question here: the pruned model (the remaining layers) learns from the parameters retained by the teacher model (also the remaining layers), so the only difference is an extra 1×1 convolution inserted between two otherwise identical layers. Shouldn't that make no difference?
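For concreteness, here is a minimal sketch of the setup the question describes: two convolution layers with identical shapes, coupled by an extra 1×1 convolution. The layer names and the channel width are illustrative only, not taken from the repository:

```python
import torch
import torch.nn as nn

channels = 256  # illustrative width; matches the retained channel counts above

# Teacher's retained layer and the pruned model's corresponding layer
teacher_layer = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
student_layer = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

# The extra 1x1 convolution placed between the two otherwise identical layers
adapter = nn.Conv2d(channels, channels, kernel_size=1)

x = torch.randn(1, channels, 8, 8)
y = student_layer(adapter(x))  # pruned path with the 1x1 adapter
t = teacher_layer(x)           # teacher path
print(y.shape, t.shape)        # both: torch.Size([1, 256, 8, 8])
```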