This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Commit

update perf. (#10761)
zheng-da authored and piiswrong committed May 2, 2018
1 parent 23934cf commit ebd8a6b
Showing 1 changed file with 119 additions and 112 deletions.
231 changes: 119 additions & 112 deletions docs/faq/perf.md
@@ -29,65 +29,70 @@ Note that _MXNet_ treats all CPUs on a single machine as a single device.
So whether you specify `cpu(0)` or `cpu()`, _MXNet_ will use all CPU cores on the machine.

### Scoring results
The following table shows performance,
The following table shows performance of [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz),
namely number of images that can be predicted per second.
We used [example/image-classification/benchmark_score.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/benchmark_score.py)
to measure the performance on different AWS EC2 machines.
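
As a rough illustration of what "images per second" means in these tables, the sketch below times a batch-processing callable and divides the number of images processed by the elapsed time. This is not the code of `benchmark_score.py`; the function name `images_per_second`, its parameters, and the dummy workload are assumptions made for this example.

```python
import time


def images_per_second(run_batch, batch_size, num_batches=10, warmup=2):
    """Return throughput (images/sec) of a callable that processes one batch.

    `run_batch` is a hypothetical callable. With an asynchronous framework
    it must block until the batch has really finished computing, otherwise
    only the time needed to enqueue the work is measured.
    """
    # Warm-up runs are excluded from timing (lazy initialization, caches).
    for _ in range(warmup):
        run_batch()

    start = time.perf_counter()
    for _ in range(num_batches):
        run_batch()
    elapsed = time.perf_counter() - start

    return batch_size * num_batches / elapsed


if __name__ == "__main__":
    # Dummy workload standing in for a forward pass over one batch.
    def fake_forward():
        time.sleep(0.001)

    print("%.2f images/sec" % images_per_second(fake_forward, batch_size=32))
```

Throughput measured this way normally rises with batch size until the hardware is saturated, which is the pattern the tables below show.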

AWS EC2 C4.8xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 119.57 | 34.23 | 111.36 | 54.42 | 42.83 | 19.51 |
| 2 | 210.58 | 51.63 | 137.10 | 67.30 | 57.54 | 23.56 |
| 4 | 318.54 | 70.00 | 187.21 | 76.53 | 63.64 | 25.80 |
| 8 | 389.34 | 77.39 | 211.90 | 84.26 | 63.89 | 28.11 |
| 16 | 489.12 | 85.26 | 220.52 | 82.00 | 63.93 | 27.08 |
| 32 | 564.04 | 87.15 | 208.21 | 83.05 | 62.19 | 25.76 |

AWS EC2 C4.4xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 109.96 | 23.00 | 71.82 | 28.10 | 30.66 | 11.81 |
| 2 | 124.56 | 24.86 | 81.61 | 31.32 | 32.73 | 12.82 |
| 4 | 157.01 | 26.60 | 86.77 | 32.94 | 33.32 | 13.16 |
| 8 | 178.40 | 30.67 | 88.58 | 33.52 | 33.32 | 13.32 |
| 16 | 189.52 | 35.61 | 90.36 | 33.63 | 32.94 | 13.18 |
| 32 | 196.61 | 38.98 | 105.27 | 33.77 | 32.65 | 13.00 |

AWS EC2 C4.2xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 70.75 | 12.87 | 42.86 | 16.53 | 18.14 | 7.01 |
| 2 | 71.53 | 13.08 | 45.66 | 17.38 | 18.53 | 7.18 |
| 4 | 84.72 | 15.38 | 47.50 | 17.80 | 18.96 | 7.35 |
| 8 | 93.44 | 18.33 | 48.08 | 17.93 | 18.99 | 7.40 |
| 16 | 97.03 | 20.12 | 55.73 | 18.00 | 18.91 | 7.36 |
| 32 | 113.90 | 21.10 | 62.54 | 17.98 | 18.80 | 7.33 |

AWS EC2 C4.xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 37.92 | 6.57 | 23.09 | 8.79 | 9.65 | 3.73 |
| 2 | 36.77 | 7.31 | 24.00 | 9.00 | 9.84 | 3.78 |
| 4 | 43.18 | 8.94 | 24.42 | 9.12 | 9.91 | 3.83 |
| 8 | 47.05 | 10.01 | 28.32 | 9.13 | 9.88 | 3.83 |
| 16 | 55.74 | 10.61 | 31.96 | 9.14 | 9.86 | 3.80 |
| 32 | 65.05 | 10.91 | 33.86 | 9.34 | 10.31 | 3.86 |

AWS EC2 C4.large:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 19.86 | 3.67 | 12.20 | 4.59 | 5.11 | 1.97 |
| 2 | 19.37 | 4.24 | 12.41 | 4.64 | 5.15 | 1.98 |
| 4 | 22.64 | 4.89 | 14.34 | 4.66 | 5.16 | 2.00 |
| 8 | 27.19 | 5.25 | 16.17 | 4.66 | 5.16 | 1.99 |
| 16 | 31.82 | 5.46 | 17.24 | 4.76 | 5.35 | OOM |
| 32 | 34.67 | 5.55 | 17.64 | 4.88 | OOM | OOM |
AWS EC2 C5.18xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|--------|--------------|--------------|-----------|------------|
| 1 | 390.53 | 81.57 | 124.13 | 62.26 | 76.22 | 32.92 |
| 2 | 596.45 | 100.84 | 206.58 | 93.36 | 119.55 | 46.80 |
| 4 | 710.77 | 119.04 | 275.55 | 127.86 | 148.62 | 59.36 |
| 8 | 921.40 | 120.38 | 380.82 | 157.11 | 167.95 | 70.78 |
| 16 | 1018.43 | 115.30 | 411.67 | 168.71 | 178.54 | 75.13 |
| 32 | 1290.31 | 107.19 | 483.34 | 179.38 | 193.47 | 85.86 |


AWS EC2 C5.9xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|-------|--------------|--------------|-----------|------------|
| 1 | 257.77 | 50.61 | 130.99 | 66.95 | 75.38 | 32.33 |
| 2 | 410.60 | 63.02 | 195.14 | 87.84 | 102.67 | 41.57 |
| 4 | 462.59 | 62.64 | 263.15 | 109.87 | 127.15 | 50.69 |
| 8 | 573.79 | 63.95 | 309.99 | 121.36 | 140.84 | 59.01 |
| 16 | 709.47 | 67.79 | 350.19 | 128.26 | 147.41 | 64.15 |
| 32 | 831.46 | 69.58 | 354.91 | 129.92 | 149.18 | 64.25 |


AWS EC2 C5.4xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|-------|--------------|--------------|-----------|------------|
| 1 | 214.15 | 29.32 | 114.97 | 47.96 | 61.01 | 23.92 |
| 2 | 310.04 | 34.81 | 150.09 | 60.89 | 71.16 | 27.92 |
| 4 | 330.69 | 34.56 | 186.63 | 74.15 | 86.86 | 34.37 |
| 8 | 378.88 | 35.46 | 204.89 | 77.05 | 91.10 | 36.93 |
| 16 | 424.00 | 36.49 | 211.55 | 78.39 | 91.23 | 37.34 |
| 32 | 481.95 | 37.23 | 213.71 | 78.23 | 91.68 | 37.26 |


AWS EC2 C5.2xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|-------|--------------|--------------|-----------|------------|
| 1 | 131.01 | 15.67 | 78.75 | 31.12 | 37.30 | 14.75 |
| 2 | 182.29 | 18.01 | 98.59 | 39.13 | 45.98 | 17.84 |
| 4 | 189.31 | 18.25 | 110.26 | 41.35 | 49.21 | 19.32 |
| 8 | 211.75 | 18.57 | 115.46 | 42.53 | 49.98 | 19.81 |
| 16 | 236.06 | 19.11 | 117.18 | 42.59 | 50.20 | 19.92 |
| 32 | 261.13 | 19.46 | 116.20 | 42.72 | 49.95 | 19.80 |


AWS EC2 C5.xlarge:

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|------|--------------|--------------|-----------|------------|
| 1 | 36.64 | 3.93 | 27.06 | 10.09 | 12.98 | 5.06 |
| 2 | 49.21 | 4.49 | 29.67 | 10.80 | 12.94 | 5.14 |
| 4 | 50.12 | 4.50 | 30.31 | 10.83 | 13.17 | 5.19 |
| 8 | 54.71 | 4.58 | 30.22 | 10.89 | 13.19 | 5.20 |
| 16 | 60.23 | 4.70 | 30.20 | 10.91 | 13.23 | 5.19 |
| 32 | 66.37 | 4.76 | 30.10 | 10.90 | 13.22 | 5.15 |


## Other CPU

@@ -101,88 +106,90 @@ We suggest always checking to make sure that a recent cuDNN version is used.

Setting the environment `export MXNET_CUDNN_AUTOTUNE_DEFAULT=1` sometimes also helps.
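
The same variable can also be set from inside a Python script, provided this happens before the library is imported. A minimal sketch, assuming the library reads the process environment; the `import mxnet` line is commented out so the snippet runs on a machine without MXNet.

```python
import os

# Equivalent of `export MXNET_CUDNN_AUTOTUNE_DEFAULT=1`, for this process only.
# Set it before importing mxnet so the value is in place when the library starts.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "1"

# import mxnet as mx  # uncomment where MXNet is installed

print(os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"])
```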

We show results when using various GPUs including K80 (EC2 p2.2xlarge), M40,
and P100 (DGX-1).
We show results when using various GPUs including K80 (EC2 p2.2xlarge), M60 (EC2 g3.4xlarge),
and V100 (EC2 p3.2xlarge).

### Scoring results

Based on
[example/image-classification/benchmark_score.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/benchmark_score.py)
and MXNet commit `0a03417`, with cuDNN 5.1
and [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz), with cuDNN 7.0.5

- K80 (single GPU)

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 202.66 | 70.76 | 74.91 | 42.61 | 70.94 | 24.87 |
| 2 | 233.76 | 63.53 | 119.60 | 60.09 | 92.28 | 34.23 |
| 4 | 367.91 | 78.16 | 164.41 | 72.30 | 116.68 | 44.76 |
| 8 | 624.14 | 119.06 | 195.24 | 79.62 | 129.37 | 50.96 |
| 16 | 1071.19 | 195.83 | 256.06 | 99.38 | 160.40 | 66.51 |
| 32 | 1443.90 | 228.96 | 287.93 | 106.43 | 167.12 | 69.73 |

- M40

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 412.09 | 142.10 | 115.89 | 64.40 | 126.90 | 46.15 |
| 2 | 743.49 | 212.21 | 205.31 | 108.06 | 202.17 | 75.05 |
| 4 | 1155.43 | 280.92 | 335.69 | 161.59 | 266.53 | 106.83 |
| 8 | 1606.87 | 332.76 | 491.12 | 224.22 | 317.20 | 128.67 |
| 16 | 2070.97 | 400.10 | 618.25 | 251.87 | 335.62 | 134.60 |
| 32 | 2694.91 | 466.95 | 624.27 | 258.59 | 373.35 | 152.71 |

- P100

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 624.84 | 294.6 | 139.82 | 80.17 | 162.27 | 58.99 |
| 2 | 1226.85 | 282.3 | 267.41 | 142.63 | 278.02 | 102.95 |
| 4 | 1934.97 | 399.3 | 463.38 | 225.56 | 423.63 | 168.91 |
| 8 | 2900.54 | 522.9 | 709.30 | 319.52 | 529.34 | 210.10 |
| 16 | 4063.70 | 755.3 | 949.22 | 444.65 | 647.43 | 270.07 |
| 32 | 4883.77 | 854.4 | 1197.74 | 493.72 | 713.17 | 294.17 |
| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|--------|--------------|--------------|-----------|------------|
| 1 | 243.93 | 43.59 | 68.62 | 35.52 | 67.41 | 23.65 |
| 2 | 338.16 | 49.14 | 113.41 | 56.29 | 93.35 | 33.88 |
| 4 | 478.92 | 53.44 | 159.61 | 74.43 | 119.18 | 45.23 |
| 8 | 683.52 | 70.50 | 190.49 | 86.23 | 131.32 | 50.54 |
| 16 | 1004.66 | 109.01 | 254.20 | 105.70 | 155.40 | 62.55 |
| 32 | 1238.55 | 114.98 | 285.49 | 116.79 | 159.42 | 64.99 |

- M60

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|--------|--------------|--------------|-----------|------------|
| 1 | 243.49 | 59.95 | 101.97 | 48.30 | 95.46 | 39.29 |
| 2 | 491.04 | 69.14 | 170.35 | 80.27 | 142.61 | 60.17 |
| 4 | 711.54 | 78.94 | 257.89 | 123.09 | 182.36 | 76.51 |
| 8 | 1077.73 | 109.34 | 343.42 | 152.82 | 208.74 | 87.27 |
| 16 | 1447.21 | 144.93 | 390.25 | 166.32 | 220.73 | 92.41 |
| 32 | 1797.66 | 151.86 | 416.69 | 176.56 | 230.19 | 97.03 |


- V100

| Batch | Alexnet | VGG | Inception-BN | Inception-v3 | Resnet 50 | Resnet 152 |
|-------|---------|--------|--------------|--------------|-----------|------------|
| 1 | 659.51 | 205.16 | 136.91 | 76.54 | 162.15 | 61.38 |
| 2 | 1248.21 | 265.40 | 261.85 | 144.23 | 293.74 | 116.30 |
| 4 | 2122.41 | 333.97 | 477.22 | 270.03 | 479.14 | 195.17 |
| 8 | 3894.30 | 420.26 | 831.09 | 450.68 | 699.39 | 294.19 |
| 16 | 5815.58 | 654.16 | 1332.26 | 658.97 | 947.45 | 398.79 |
| 32 | 7906.09 | 708.43 | 1784.23 | 817.33 | 1076.81 | 451.82 |


### Training results

Based on
[example/image-classification/train_imagenet.py](https://github.com/dmlc/mxnet/blob/master/example/image-classification/train_imagenet.py)
and MXNet commit `0a03417`, with CUDNN 5.1. The benchmark script is available at
and [MXNet-1.2.0.rc1](https://github.com/apache/incubator-mxnet/releases/download/1.2.0.rc1/apache-mxnet-src-1.2.0.rc1-incubating.tar.gz), with CUDNN 7.0.5. The benchmark script is available at
[here](https://github.com/mli/mxnet-benchmark/blob/master/run_vary_batch.sh),
where the batch size for Alexnet is increased by 8x.
where the batch size for Alexnet is increased by 16x.
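
To read the Alexnet column, multiply the value in the Batch column by the multiplier. A small sketch of that arithmetic, using the 16x factor from the updated text; the constant and function names are made up for illustration.

```python
ALEXNET_MULTIPLIER = 16  # factor stated in the updated text


def actual_batch_size(table_batch, network):
    """Batch size actually used for a given row of the training tables."""
    if network.lower() == "alexnet":
        return table_batch * ALEXNET_MULTIPLIER
    return table_batch


if __name__ == "__main__":
    for batch in (1, 2, 4, 8, 16, 32):
        print(batch,
              actual_batch_size(batch, "alexnet"),
              actual_batch_size(batch, "resnet-50"))
```

Under this reading, the Batch 32 row corresponds to 512 images per batch for Alexnet and 32 for the other networks.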

- K80 (single GPU)

| Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
| --- | --- | --- | --- |
| 1 | 230.69 | 9.81 | 13.83 |
| 2 | 348.10 | 15.31 | 21.85 |
| 4 | 457.28 | 20.48 | 29.58 |
| 8 | 533.51 | 24.47 | 36.83 |
| 16 | 582.36 | 28.46 | 43.60 |
| 32 | 483.37 | 29.62 | 45.52 |
| 1 | 300.30 | 10.48 | 15.61 |
| 2 | 406.08 | 16.00 | 23.88 |
| 4 | 461.01 | 22.10 | 32.26 |
| 8 | 484.00 | 26.80 | 39.42 |
| 16 | 490.45 | 31.62 | 46.69 |
| 32 | 414.72 | 33.78 | 49.48 |

- M40
- M60

| Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
| Batch | Alexnet(\*16) | Inception-v3 | Resnet 50 |
| --- | --- | --- | --- |
| 1 | 405.17 | 14.35 | 21.56 |
| 2 | 606.32 | 23.96 | 36.48 |
| 4 | 792.66 | 37.38 | 52.96 |
| 8 | 1016.51 | 52.69 | 70.21 |
| 16 | 1105.18 | 62.35 | 83.13 |
| 32 | 1046.23 | 68.87 | 90.74 |
| 1 | 380.96 | 14.06 | 20.55 |
| 2 | 530.53 | 21.90 | 32.65 |
| 4 | 600.17 | 31.96 | 45.57 |
| 8 | 633.60 | 40.58 | 54.92 |
| 16 | 639.37 | 46.88 | 64.44 |
| 32 | 576.54 | 50.05 | 68.34 |

- P100
- V100

| Batch | Alexnet(\*8) | Inception-v3 | Resnet 50 |
| Batch | Alexnet(\*16) | Inception-v3 | Resnet 50 |
| --- | --- | --- | --- |
| 1 | 809.94 | 15.14 | 27.20 |
| 2 | 1202.93 | 30.34 | 49.55 |
| 4 | 1631.37 | 50.59 | 78.31 |
| 8 | 1882.74 | 77.75 | 122.45 |
| 16 | 2012.04 | 111.11 | 156.79 |
| 32 | 1869.69 | 129.98 | 181.53 |
| 1 | 1629.52 | 21.83 | 34.54 |
| 2 | 2359.73 | 40.11 | 65.01 |
| 4 | 2687.89 | 72.79 | 113.49 |
| 8 | 2919.02 | 118.43 | 174.81 |
| 16 | 2994.32 | 173.15 | 251.22 |
| 32 | 2585.61 | 214.48 | 298.51 |

## Multiple Devices
