The Multi-Device plugin automatically assigns inference requests to the available compute devices so that the requests execute in parallel.
In this lab, we run the Benchmark App with the MULTI plugin to spread the inference load across different hardware types.
First, let's run the Hello Query Device sample application, which queries available Inference Engine devices on your platform and prints out their metrics and default configuration values.
cd $HOME/inference_engine_samples_build/intel64/Release
./hello_query_device
With an NCS2 plugged into the development system used to create this tutorial, three Inference Engine devices are available: CPU, GPU, and MYRIAD.
Available devices:
Device: CPU
...
Device: GPU
...
Device: MYRIAD
...
The Benchmark C++ Tool estimates deep learning inference performance on supported devices. Performance can be measured in two inference modes: synchronous (latency-oriented) and asynchronous (throughput-oriented).
Note: The performance numbers reported in this tutorial were obtained on one particular machine (Intel® Xeon® CPU E3-1268L v5 @ 2.4GHz x 8 + Intel® HD Graphics P530 (Skylake GT2) + 32 GB Memory).
Note: Make sure you completed the "Run the Verification Scripts to Verify Installation" step when you installed the OpenVINO Toolkit.
cd $HOME/inference_engine_samples_build/intel64/Release
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d CPU \
-niter 1000
Performance of CPU on the development system:
Duration: 362.653 ms
Latency: 1.43271 ms
Throughput: 2757.46 FPS
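The remaining benchmark runs in this lab differ from the CPU run only in the value passed to -d. As a minimal sketch, the command lines can be generated in a loop. The helper name benchmark_cmd and the IMAGE and MODEL variables are illustrative assumptions, not part of OpenVINO; the script only prints the commands and does not run them.

```shell
#!/bin/sh
# Paths copied from the commands in this lab; adjust them to your installation.
IMAGE=/opt/intel/openvino_2021/deployment_tools/demo/car.png
MODEL=/home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml

# benchmark_cmd is a hypothetical helper: it prints the benchmark_app
# command line for one device target instead of running it.
# $1 = device target, $2 = iteration count (defaults to 1000).
benchmark_cmd() {
    printf './benchmark_app -i %s -m %s -d %s -niter %s\n' \
        "$IMAGE" "$MODEL" "$1" "${2:-1000}"
}

# Dry run: print the command for every device target used in this lab.
for target in CPU GPU MYRIAD MULTI:CPU,GPU MULTI:CPU,MYRIAD \
              MULTI:GPU,MYRIAD MULTI:CPU,GPU,MYRIAD; do
    benchmark_cmd "$target"
done
```

Printing first makes it easy to check the device strings before spending time on the runs. If the script is saved to a file, piping its output to sh from the Release directory should execute the runs one after another.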
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d GPU \
-niter 1000
Performance of GPU on the development system:
Duration: 666.11 ms
Latency: 2.60 ms
Throughput: 1501.24 FPS
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MYRIAD \
-niter 1000
Performance of single NCS2 on the development system:
Duration: 1830.12 ms
Latency: 7.23 ms
Throughput: 546.41 FPS
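The single-device results above are consistent with throughput being roughly the number of iterations divided by the total duration. The sketch below checks that arithmetic for two of the runs. It assumes one frame per iteration (batch size 1); the helper name fps_from_duration is illustrative and not part of the Benchmark App.

```shell
# fps_from_duration is a hypothetical helper: frames per second computed
# from an iteration count ($1) and a total duration in milliseconds ($2).
fps_from_duration() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", n * 1000 / ms }'
}

fps_from_duration 1000 362.653   # CPU run, reported as 2757.46 FPS
fps_from_duration 1000 1830.12   # single NCS2 run, reported as 546.41 FPS
```

Small differences in other runs can come from rounding in the printed duration, or possibly from the tool executing slightly more iterations than requested; the latter is an assumption, not something the output shown here confirms.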
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MULTI:CPU,GPU \
-niter 1000
Performance using both CPU and GPU on the development system, which is better than using the CPU or GPU alone:
Duration: 285.83 ms
Throughput: 3498.62 FPS
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MULTI:CPU,MYRIAD \
-niter 1000
Performance using both CPU and VPU on the development system, which is better than using the CPU or VPU alone:
Duration: 340.964 ms
Throughput: 2932.86 FPS
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MULTI:GPU,MYRIAD \
-niter 1000
Performance using both GPU and VPU on the development system, which is better than using the GPU or VPU alone:
Duration: 377.77 ms
Throughput: 2647.13 FPS
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MULTI:CPU,GPU,MYRIAD \
-niter 1000
Performance using CPU, GPU, and VPU together on the development system, which is better than using the CPU, GPU, or VPU alone:
Duration: 283.01 ms
Throughput: 3561.71 FPS
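To put the MULTI results in perspective, the sketch below divides each MULTI throughput by the throughput of the fastest single device in that combination. All numbers are copied from the results reported in this lab; the helper name ratio is an illustrative assumption.

```shell
# ratio is a hypothetical helper: first value divided by the second,
# printed with two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Single-device throughput (FPS) copied from the results in this lab.
CPU=2757.46
GPU=1501.24

# Gain of each MULTI run over its fastest member device.
echo "MULTI:CPU,GPU        vs CPU: $(ratio 3498.62 "$CPU")x"
echo "MULTI:CPU,MYRIAD     vs CPU: $(ratio 2932.86 "$CPU")x"
echo "MULTI:GPU,MYRIAD     vs GPU: $(ratio 2647.13 "$GPU")x"
echo "MULTI:CPU,GPU,MYRIAD vs CPU: $(ratio 3561.71 "$CPU")x"
```

On these numbers, the combinations that include the CPU gain about 6% to 29% over the CPU alone, so how much a second device helps depends strongly on which devices are combined.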
In this example, we plugged three NCS2 devices into the laptop.
./hello_query_device
The output now lists multiple NCS2 devices (the MYRIAD names on your system may differ from those in the list below):
Available devices:
Device: CPU
...
Device: GPU
...
Device: MYRIAD.1.4.1-ma2480
...
Device: MYRIAD.1.4.2-ma2480
...
Device: MYRIAD.1.4.4-ma2480
...
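Because the MYRIAD names vary between systems, the -d value can be assembled from the query output rather than typed by hand. The following is a minimal sketch that assumes each device appears on a line of the form "Device: NAME", as in the listing above; the helper name multi_from_query is hypothetical, and the real output format may differ between OpenVINO versions.

```shell
# multi_from_query is a hypothetical helper: it reads hello_query_device
# output on stdin and prints a MULTI target built from all MYRIAD devices.
multi_from_query() {
    awk '$1 == "Device:" && $2 ~ /^MYRIAD/ {
             sep = (list == "" ? "" : ",")
             list = list sep $2
         }
         END { if (list != "") print "MULTI:" list }'
}

# Sample input copied from the listing above. On a real system, use:
#   ./hello_query_device | multi_from_query
multi_from_query <<'EOF'
Available devices:
Device: CPU
Device: GPU
Device: MYRIAD.1.4.1-ma2480
Device: MYRIAD.1.4.2-ma2480
Device: MYRIAD.1.4.4-ma2480
EOF
```

The printed string can be passed directly as the -d argument of benchmark_app. If no MYRIAD device is found, the helper prints nothing.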
./benchmark_app \
-i /opt/intel/openvino_2021/deployment_tools/demo/car.png \
-m /home/intel/openvino_models/ir/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml \
-d MULTI:MYRIAD.1.4.1-ma2480,MYRIAD.1.4.2-ma2480,MYRIAD.1.4.4-ma2480 \
-niter 1000
Performance using three NCS2 devices on the development system, roughly triple the throughput of a single NCS2:
Duration: 605.12 ms
Throughput: 1665.80 FPS
To learn more about the Multi plugin, please refer to the Multi-Device Plugin section of the OpenVINO documentation.