[Bug]: OpenVINO will only use half of my available system threads on 2 socket configuration #27581

Open
3 tasks done
BlohoJo opened this issue Nov 16, 2024 · 16 comments
Labels
bug Something isn't working support_request

Comments

@BlohoJo

BlohoJo commented Nov 16, 2024

OpenVINO Version

2024.0.0 - Current

Operating System

Windows 10 Professional 2004 [Version 10.0.19041.1415]

Device used for inference

CPU (Intel Xeon E-2288G CPU [Coffee Lake / 9th Generation Core])

Framework

Any

Model used

Any

Issue description

This is related to my previous issue here:

#22678 (comment)

I'm an OpenVINO user of Topaz Video AI, and the current version uses only half of my total CPU. It uses OpenVINO 2024.3.0, but the problem I'm describing is specific to OpenVINO and started happening with 2024.0.0, and has history going back to at least 2023.0.1.

The system is a VM (VPS) running on a Hypervisor configured with two CPU sockets, each with 8 cores and 8 threads. The actual hardware and VM CPU is the Intel Xeon E-2288G CPU (Coffee Lake / 9th Generation Core).

The Intel Xeon E-2288G is listed as a supported processor for Windows 10 2004.

Intel Processor Diagnostic Tool v4.1.9.41 passes. TESTRESULTS.TXT contents: https://pastebin.com/zeWjbVur

CPU-Z Report: https://pastebin.com/uG071g8r

I used benchmark_app to track down when the problem started happening in OpenVINO. I believe the problem is that OpenVINO "thinks" the total number of available threads on my system is only 8, when it is actually 16. I suspect it currently can't interpret and handle more than one CPU socket.

Here is what I discovered:

In OpenVINO 2023.0.1, the system uses all available 16 threads (output shows INFERENCE_NUM_THREADS: 16) when the following command is used: benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml". It hangs and does nothing if -api async is used.

In OpenVINO 2023.1.0 & 2023.2.0, OpenVINO doesn't work at all; it crashes (as noted in GitHub issue #22678 above).

In OpenVINO 2023.3.0, both -api sync and -api async work, but -api sync runs using only 8 threads (output shows INFERENCE_NUM_THREADS: 8; using -hint none -nthreads 16 doesn't help, and the output still shows INFERENCE_NUM_THREADS: 8). -api async works correctly and uses all 16 threads.

Starting with OpenVINO 2024.0.0, both -api sync and -api async are only able to use 8 cores, and there doesn't seem to be any way to get OpenVINO to use all 16 cores.

Step-by-step reproduction

To reproduce, someone will need a Windows 10 system that has the Intel Xeon E-2288G CPU configured as two sockets with 8 cores and 8 threads per socket. Alternatively, the problem may manifest on any system configured with two CPU sockets and a similar Intel Xeon CPU model with equal numbers of cores and threads in each socket.

Install Python 3.9

Create a directory for virtual environment, i.e. C:\OpenVINO, and open a command prompt in this directory.

python -m venv openvino_env
openvino_env\Scripts\activate
python -m pip install --upgrade pip
python -m pip install openvino-dev==2023.3.0

omz_downloader --all
omz_converter --all

benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"

benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"

Relevant log output

(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.3.0-13775-ceeafaf64f3-releases/2023/3
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.3.0-13775-ceeafaf64f3-releases/2023/3
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 109.38 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1203.11 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 2
[ INFO ]   NUM_STREAMS: 2
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 16
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 2 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 619.90 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            26 iterations
[ INFO ] Duration:         11093.38 ms
[ INFO ] Latency:
[ INFO ]    Median:        844.79 ms
[ INFO ]    Average:       853.04 ms
[ INFO ]    Min:           827.79 ms
[ INFO ]    Max:           945.26 ms
[ INFO ] Throughput:   2.34 FPS

(openvino_env) C:\OpenVINO\test>
(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 109.35 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 937.45 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'float16'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 619.75 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10680.20 ms
[ INFO ] Latency:
[ INFO ]    Median:        629.98 ms
[ INFO ]    Average:       628.07 ms
[ INFO ]    Min:           570.90 ms
[ INFO ]    Max:           702.74 ms
[ INFO ] Throughput:   1.59 FPS

(openvino_env) C:\OpenVINO\test>

Relevant log sections:

2023.3.0:

[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 2
[ INFO ]   NUM_STREAMS: 2
[ INFO ]   INFERENCE_NUM_THREADS: 16

2024.4.0:

[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8

(Additional lines):

[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'float16'>
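
The comparison above can be extracted from a saved log automatically. The following is a minimal Python sketch, assuming the `[ INFO ]   KEY: value` line format shown in this issue; `parse_runtime_properties` and `thread_summary` are hypothetical helper names, not part of OpenVINO.

```python
import re

# Matches property lines such as "[ INFO ]   NUM_STREAMS: 2".
# The key must be all upper-case, so report lines like "Count:" are skipped.
PROPERTY_LINE = re.compile(r"^\[ INFO \]\s+([A-Z][A-Z0-9_]*):\s*(.*)$")


def parse_runtime_properties(log_text):
    """Return {property name: raw string value} from benchmark_app output."""
    properties = {}
    for line in log_text.splitlines():
        match = PROPERTY_LINE.match(line.strip())
        if match:
            properties[match.group(1)] = match.group(2).strip()
    return properties


def thread_summary(log_text):
    """Return (NUM_STREAMS, INFERENCE_NUM_THREADS) as integers."""
    props = parse_runtime_properties(log_text)
    return int(props["NUM_STREAMS"]), int(props["INFERENCE_NUM_THREADS"])
```

Running `thread_summary` on the output of two OpenVINO versions gives a pair of numbers per version that can be compared directly.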

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@BlohoJo BlohoJo added bug Something isn't working support_request labels Nov 16, 2024
@wangleis
Contributor

Hi @BlohoJo, the default latency behavior has changed, as the logs show:

  1. 2023.3 enabled 2 streams with 16 threads in total: 1 stream with 8 threads on each socket.
  2. 2024.4 enabled only 1 stream with 8 threads on 1 socket.

If you want to use 1 stream with 16 threads in 2024.4, please try -hint none -nstreams 1 -nthreads 16.
If you want to use 2 streams with 16 threads in 2024.4, as in 2023.3, please try -hint none -nstreams 2 -nthreads 16.

@BlohoJo
Author

BlohoJo commented Nov 18, 2024

Is there any way to automate the generation of those variables on different systems, so that applications that use OpenVINO (like Topaz Video AI) can automatically use all available cores & threads on whatever system they happen to be running on? 😕

@peterchen-intel
Contributor

peterchen-intel commented Nov 20, 2024

@BlohoJo -hint throughput will use all the available CPU cores on the system.
Alternatively, use -hint none -nstreams 1 -nthreads <NUM of CPU cores>.
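
One way to fill in the thread count automatically is to ask the operating system at launch time. The following is a minimal Python sketch, assuming the benchmark_app flags quoted in this thread; `benchmark_args` is a hypothetical helper, and whether the runtime then honors the requested count is a separate question.

```python
import os


def benchmark_args(model_path, num_threads=None, num_streams=1):
    """Build a benchmark_app argument list from the flags quoted in this thread."""
    if num_threads is None:
        # os.cpu_count() reports the logical processors the OS exposes and
        # may return None; fall back to 1 in that case.
        num_threads = os.cpu_count() or 1
    if num_streams < 1 or num_threads < num_streams:
        raise ValueError("need at least one thread per stream")
    return [
        "benchmark_app",
        "-hint", "none",
        "-nstreams", str(num_streams),
        "-nthreads", str(num_threads),
        "-m", model_path,
    ]
```

The resulting list can be passed to `subprocess.run`. Note that `os.cpu_count()` counts logical processors, which differs from the physical core count on machines with hyper-threading enabled.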

@BlohoJo
Author

BlohoJo commented Nov 20, 2024

-hint throughput crashes Python on step 7 (model load) on my system (exception 0xc0000005, access violation).

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.4.0.16579, time stamp: 0x66d9c0bf
Exception code: 0xc0000005
Fault offset: 0x0000000000038d24
Faulting process id: 0x3130
Faulting application start time: 0x01db3b5cde0d4883
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\Program Files\Python39\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 62be2ad1-47d2-4936-b694-5818b0d0a0b1
Faulting package full name: 
Faulting package-relative application ID: 

-hint none -nstreams 1 -nthreads 16 --> Only uses 8 cores.

-hint none -nstreams 2 -nthreads 16 --> Crashes Python on step 7 model load (exception 0xc0000094, divide by zero).

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.4.0.16579, time stamp: 0x66d9c0bf
Exception code: 0xc0000094
Fault offset: 0x0000000000037d03
Faulting process id: 0x28f8
Faulting application start time: 0x01db3b5db71a138d
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\Program Files\Python39\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 37ae84e2-eec8-4795-8466-8f1ddd65ff39
Faulting package full name: 
Faulting package-relative application ID: 

-api sync and -api async don't make a difference for any of the above.

Dead in the water, it seems, for using all cores in OpenVINO 2024.0.0 and above. 😞
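
The two crash reports above can be reduced to a few comparable fields. The following is a minimal Python sketch, assuming the Windows event-log text format pasted in this comment; `summarize_crash` is a hypothetical helper, and the lookup table covers only the two NTSTATUS codes that appear in this thread.

```python
import re

# NTSTATUS values quoted in this thread, with their Windows SDK names.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
}

FIELD = re.compile(r"^(Faulting module name|Exception code|Fault offset):\s*(.+)$")


def summarize_crash(event_text):
    """Pull module, exception code, and fault offset out of an event-log entry."""
    fields = {}
    for line in event_text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    code = int(fields["Exception code"], 16)
    return {
        # The module field also carries version and time stamp; keep the name only.
        "module": fields["Faulting module name"].split(",")[0],
        "code": KNOWN_CODES.get(code, hex(code)),
        "offset": int(fields["Fault offset"], 16),
    }
```

Both reports name the same module, so the two fault offsets can be compared directly as long as the DLL build is the same.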

@BlohoJo
Author

BlohoJo commented Nov 20, 2024

I tried to get some additional info by running x64dbg (it sometimes shows something useful), but it won't work. The process terminates before benchmark_app can do anything.

Batch file:

@echo off
call C:\OpenVINO\Test\openvino_env\Scripts\activate.bat
start "" "C:\Program Files\x64dbg\x64\x64dbg.exe" -run -exe "C:\OpenVINO\Test\benchmark_app.exe" -arg "-api async -hint latency -t 10 -m C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
xAnalyzer 2.5.6 Plugin by ThunderCls 2021
Extended analysis for static code
-> For latest release, issues, etc....
-> For help type command "xanal help"
-> code: http://github.com/ThunderCls/xAnalyzer
-> blog: http://reversec0de.wordpress.com

Initializing wait objects...
Initializing debugger...
Initializing debugger functions...
Setting JSON memory management functions...
Getting directory information...
Start file read thread...
Retrieving syscall indices...
Symbol Path: C:\Program Files\x64dbg\x64\symbols
Allocating message stack...
Initializing global script variables...
Registering debugger commands...
Registering GUI command handler...
Registering expression functions...
Registering format functions...
Registering Script DLL command handler...
Starting command loop...
Initialization successful!
Loading plugins...
[pluginload] xAnalyzer
Syscall indices loaded!
Error codes database loaded!
Exception codes database loaded!
NTSTATUS codes database loaded!
Windows constant database loaded!
Reading notes file...
File read thread finished!
[PLUGIN, xAnalyzer] Command "xanal" registered!
[PLUGIN, xAnalyzer] Command "xanalremove" registered!
[PLUGIN] xAnalyzer v2 Loaded!
Handling command line...
  "C:\Program Files\x64dbg\x64\x64dbg.exe"  -run -exe "C:\OpenVINO\Test\benchmark_app.exe" -arg "-api async -hint latency -t 10 -m C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Debugging: C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
Database file: C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64
Loading commandline...
Loading database from C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64 31ms
Process Started: 00007FF61A950000 C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
  "C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe" -api async -hint latency -t 10 -m "C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
  argv[0]: C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
  argv[1]: -api
  argv[2]: async
  argv[3]: -hint
  argv[4]: latency
  argv[5]: -t
  argv[6]: 10
  argv[7]: -m
  argv[8]: C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml
Breakpoint at 00007FF61A95427C (entry breakpoint) set!
DLL Loaded: 00007FFFBC850000 C:\Windows\System32\ntdll.dll
DLL Loaded: 00007FFFBBE30000 C:\Windows\System32\kernel32.dll
DLL Loaded: 00007FFFB9F60000 C:\Windows\System32\KernelBase.dll
DLL Loaded: 00007FFFB7020000 C:\Windows\System32\apphelp.dll
DLL Loaded: 00007FFFBC250000 C:\Windows\System32\shlwapi.dll
DLL Loaded: 00007FFFBBC20000 C:\Windows\System32\msvcrt.dll
Thread 8608 created, Entry: ntdll.00007FFFBC8A2AD0, Parameter: 0000000001138920
Thread 8420 created, Entry: ntdll.00007FFFBC8A2AD0, Parameter: 0000000001138920
System breakpoint reached!
[xAnalyzer]: Analysis retrieved from data base 
INT3 breakpoint "entry breakpoint" at <benchmark_app.OptionalHeader.AddressOfEntryPoint> (00007FF61A95427C)!
Thread 8420 exit
Thread 8608 exit
Process stopped with exit code 0x1 (1)
Saving database to C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64 16ms
Debugging stopped!

@BlohoJo
Author

BlohoJo commented Nov 20, 2024

The oldest OpenVINO version that will work with any of the model_zoo files is 2023.0.1. As mentioned above, that version works in Topaz Video AI with all 16 cores.

With benchmark_app, it crashes as above with -hint throughput.

With -api async, it hangs on step 10.

(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 120.90 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1093.97 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 0
[ INFO ]   NUM_STREAMS: 0
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 0
[ INFO ]   PERF_COUNT: False
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: True
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 0 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
^C
(openvino_env) C:\OpenVINO\test>

With -api sync, it runs using all 16 cores:

(openvino_env) C:\OpenVINO\test>benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 115.96 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1102.89 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 0
[ INFO ]   NUM_STREAMS: 0
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 0
[ INFO ]   PERF_COUNT: False
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: True
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 452.58 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            22 iterations
[ INFO ] Duration:         10150.12 ms
[ INFO ] Latency:
[ INFO ]    Median:        450.61 ms
[ INFO ]    Average:       461.57 ms
[ INFO ]    Min:           445.08 ms
[ INFO ]    Max:           647.56 ms
[ INFO ] Throughput:   2.22 FPS

(openvino_env) C:\OpenVINO\test>

What's interesting is that in 2023.0.1, it shows 0 for INFERENCE_NUM_THREADS, NUM_STREAMS, and other parameters, both for -api sync and -api async.

I'm not sure if any of that info is helpful or not. With OpenVINO versions 2024.0.0 and above (including the new 2024.5.0), it's as if it only sees the CPU in socket #1.

@wangleis
Contributor

@BlohoJo Could you run the attached test_info.zip on your Windows platform and share the log with us?

@BlohoJo
Author

BlohoJo commented Nov 28, 2024

Thanks very much for the reply and the help! 😄

*********test data*******************

"0300000030000000000000000000000000000000000000000000000000000100ff0000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00100000000000000000000000000000002000000380000000108400000800000020000000000000
000000000000000000000000000000000ff000000000000000000000000000000020000003800000
00108400000800000010000000000000000000000000000000000000000000000ff0000000000000
00000000000000000020000003800000002044000000004000000000000000000000000000000000
00000000000000000ff0000000000000000000000000000000200000038000000031040000000000
1000000000000000000000000000000000000000000000000ff00000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100020000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00400000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100080000000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010010000000000000000000000000000000000000003000000
00000000000000000000000000000000000000000000001002000000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100400000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
08000000000000000000000000000000003000000300000000000000000000000000000000000000
0000000000000010000ff00000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000010000000000000000000000000000020000003800000
0010840000080000002000000000000000000000000000000000000000000000000ff00000000000
00000000000000000020000003800000001084000008000000100000000000000000000000000000
0000000000000000000ff00000000000000000000000000000200000038000000020440000000040
000000000000000000000000000000000000000000000000000ff000000000000000000000000000
00200000038000000031040000000000100000000000000000000000000000000000000000000000
000ff000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100000200000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000040000000000000000000000000000000000003000000
00000000000000000000000000000000000000000000001000008000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100001000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00020000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100004000000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000800000000000000000000000000000010000003000000
0000000000000000000000000000000000000000000000000ffff000000000000000000000000000
00400000050000000010001000000000000000000000000000000000000000000101000000000000
00000000000000000000000000000000000000000000000000000000000000000ffff00000000000
0"

*********test data*******************

(without line breaks)

*********test data*******************

"0300000030000000000000000000000000000000000000000000000000000100ff00000000000000000000000000000000000000300000000000000000000000000000000000000000000000000001000100000000000000000000000000000002000000380000000108400000800000020000000000000000000000000000000000000000000000ff00000000000000000000000000000002000000380000000108400000800000010000000000000000000000000000000000000000000000ff00000000000000000000000000000002000000380000000204400000000400000000000000000000000000000000000000000000000000ff00000000000000000000000000000002000000380000000310400000000001000000000000000000000000000000000000000000000000ff000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010002000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010004000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010008000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010010000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010020000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010040000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000010080000000000000000000000000000000030000003000000000000000000000000000000000000000000000000000010000ff00000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100000100000000000000000000000000000200000038000000010840000080000002000000000000000000000000000000000000000000000000ff00000000000000000000000000000200000038000000010840000080000001000000000000000000000000000000000000000000000000ff00000000000000000000000000000200000038000000020440000000040000000000000000000000000000000000000000000000000000ff00000000000000000000000000000200000038000000031040000000000100000000000000000000000000000000000000000000000000ff000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100000200000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100000400000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100000800000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100001000000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100002000000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100004000000000000000000000000000000000000030000000000000000000000000000000000000000000000000000100008000000000000000000000000000000100000030000000000000000000000000000000000000000000000000000000ffff0000000000000000000000000000040000005000000001000100000000000000000000000000000000000000000010100000000000000000000000000000000000000000000000000000000000000000000000000000ffff000000000000"

*********test data*******************
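
The thread does not say what this dump contains. It begins with the bytes 03 00 00 00 30 00 00 00, which would be a relationship of 3 (processor package) and a record size of 48 if the data is the raw buffer returned by the Windows `GetLogicalProcessorInformationEx` call. Under that assumption only, the following minimal Python sketch walks the records and lists the affinity mask of each core and package; `decode_processor_records` is a hypothetical helper, and the layout is the x64 `SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX` structure.

```python
import struct

# LOGICAL_PROCESSOR_RELATIONSHIP values from the Windows SDK.
RELATIONSHIP_NAMES = {0: "core", 1: "numa_node", 2: "cache", 3: "package", 4: "group"}


def decode_processor_records(hex_dump):
    """Yield (relationship, [(group, affinity mask), ...]) for core and package records.

    Assumption: hex_dump is a SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX buffer (x64).
    """
    data = bytes.fromhex("".join(hex_dump.split()))
    offset = 0
    while offset + 8 <= len(data):
        relationship, size = struct.unpack_from("<II", data, offset)
        if size < 8:
            break  # malformed record; stop instead of looping forever
        if relationship in (0, 3):
            # PROCESSOR_RELATIONSHIP: Flags, EfficiencyClass, Reserved[20], GroupCount
            (group_count,) = struct.unpack_from("<H", data, offset + 30)
            masks = []
            for i in range(group_count):
                # GROUP_AFFINITY: 8-byte mask, 2-byte group, 6 reserved bytes
                mask, group = struct.unpack_from("<QH", data, offset + 32 + 16 * i)
                masks.append((group, mask))
            yield RELATIONSHIP_NAMES[relationship], masks
        offset += size
```

Strip the surrounding quotes before passing the dump in. Counting the set bits in each package mask then gives the number of logical processors the OS reports for that socket.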

@BlohoJo
Author

BlohoJo commented Dec 3, 2024

Is the "Merging is blocked" status likely to change? 😟

(I'm not entirely familiar with how these things go on the OpenVINO repo so I apologize if the answer to this question is obvious.)

github-merge-queue bot pushed a commit that referenced this issue Dec 16, 2024

### Details:
- *support new windows platform which is a VM (VPS) running on a
Hypervisor*
 - *using one stream on two sockets*

### Tickets:
-
*[issues-27581](#27581
@wangleis
Contributor

@BlohoJo PR has been merged. Could you please try master branch?

@BlohoJo
Author

BlohoJo commented Dec 21, 2024

Sorry, things have gotten extremely busy and stressful for me lately! 🥴

I tried the new OpenVINO 2024.6.0. (Is that what I should be trying at this point? 😕)

Unfortunately, it didn't work.

It no longer crashes when using -api async -hint latency, so that's a good change! 🙂

But, apart from that, the rest is still the same. It still only uses half of my CPU cores (one 8 core socket instead of both 8 core sockets) using either -api async -hint latency or -api sync -hint latency.

And, it still crashes using -hint throughput or -hint none -nstreams 2 -nthreads 16. It also still uses only half of my available 16 cores (again, 8 cores x 2 CPU sockets) using -hint none -nstreams 1 -nthreads 16.

I know it is possible for it to use all 16 cores because in OpenVINO 2023.0.1, it does use all 16 cores using -api sync -hint latency.

Commands tried (below, output is in order listed):
benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Only uses 8 cores.

benchmark_app -api sync -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Crashes Python on step 7 (model load) on my system (exception 0xc0000005, access violation).

benchmark_app -api sync -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Only uses 8 cores

benchmark_app -api sync -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Crashes Python on step 7 model load (exception 0xc0000094, divide by zero).
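
For anyone wanting to repeat this test matrix, the eight command lines above can be generated rather than typed by hand. This is only a sketch: `build_commands`, `CONFIGS`, and `APIS` are made-up names, and the flags are simply the ones already listed above.

```python
from itertools import product

MODEL = r"C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"

# Hint-related flags for each of the four configurations listed above.
CONFIGS = [
    ["-hint", "latency"],
    ["-hint", "throughput"],
    ["-hint", "none", "-nstreams", "1", "-nthreads", "16"],
    ["-hint", "none", "-nstreams", "2", "-nthreads", "16"],
]
APIS = ["sync", "async"]


def build_commands(model=MODEL, seconds=10):
    """Return one argument list per (configuration, API mode) pair."""
    commands = []
    for config, api in product(CONFIGS, APIS):
        commands.append(
            ["benchmark_app", "-api", api, *config, "-t", str(seconds), "-m", model]
        )
    return commands


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run() to execute it.
    for cmd in build_commands():
        print(" ".join(cmd))
```

The commands come out in the same order as the list above (sync first, then async, for each configuration).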

(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 114.06 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 952.27 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 718.90 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            16 iterations
[ INFO ] Duration:         10021.07 ms
[ INFO ] Latency:
[ INFO ]    Median:        616.14 ms
[ INFO ]    Average:       626.29 ms
[ INFO ]    Min:           576.80 ms
[ INFO ]    Max:           718.93 ms
[ INFO ] Throughput:   1.60 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 100.07 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 923.88 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 617.26 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            18 iterations
[ INFO ] Duration:         11165.38 ms
[ INFO ] Latency:
[ INFO ]    Median:        620.57 ms
[ INFO ]    Average:       620.23 ms
[ INFO ]    Min:           535.83 ms
[ INFO ]    Max:           703.90 ms
[ INFO ] Throughput:   1.61 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 111.56 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000005
Fault offset: 0x0000000000039411
Faulting process id: 0xb5c
Faulting application start time: 0x01db53b038bce5ec
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: ad3f0c36-5b10-4367-b7aa-610f69f7cef3
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 108.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000005
Fault offset: 0x0000000000039411
Faulting process id: 0x2948
Faulting application start time: 0x01db53b07d00fa60
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: ddec23e7-3b47-4aae-80f7-1a38f39041fa
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 112.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 917.12 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 630.01 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10519.89 ms
[ INFO ] Latency:
[ INFO ]    Median:        607.83 ms
[ INFO ]    Average:       618.73 ms
[ INFO ]    Min:           559.84 ms
[ INFO ]    Max:           705.97 ms
[ INFO ] Throughput:   1.62 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 104.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 947.48 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests using 1 streams for CPU, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 648.89 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10670.66 ms
[ INFO ] Latency:
[ INFO ]    Median:        612.50 ms
[ INFO ]    Average:       627.72 ms
[ INFO ]    Min:           591.16 ms
[ INFO ]    Max:           722.21 ms
[ INFO ] Throughput:   1.59 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 105.03 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000094
Fault offset: 0x0000000000038490
Faulting process id: 0x2d68
Faulting application start time: 0x01db53b0b4cbec91
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 1f3132bf-76c1-44c0-ae47-49626090c1aa
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 104.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000094
Fault offset: 0x0000000000038490
Faulting process id: 0x2930
Faulting application start time: 0x01db53b0ec2ee46f
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: b617c895-36e7-4b88-bbd6-1a667179f044
Faulting package full name: 
Faulting package-relative application ID: 
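
To compare the runs above at a glance, a small script can pull NUM_STREAMS, INFERENCE_NUM_THREADS, and any Windows exception code out of saved output. This is a rough sketch, not part of OpenVINO: `summarize_run` and `expected_threads` are made-up names, and it assumes the text looks like the output pasted above. The two exception codes are the standard NTSTATUS values for an access violation and an integer divide by zero.

```python
import re

# NTSTATUS values that appear in the crash reports above.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
}

THREADS_RE = re.compile(r"\bINFERENCE_NUM_THREADS:\s*(\d+)")
STREAMS_RE = re.compile(r"\bNUM_STREAMS:\s*(\d+)")
EXCEPTION_RE = re.compile(r"Exception code:\s*(0x[0-9a-fA-F]+)")


def summarize_run(text, expected_threads=16):
    """Extract thread/stream counts and crash code from one run's output.

    expected_threads is an assumption supplied by the caller (16 here,
    for 2 sockets x 8 cores); it is not read from the log.
    """
    threads = THREADS_RE.search(text)
    streams = STREAMS_RE.search(text)
    exc = EXCEPTION_RE.search(text)
    summary = {
        "threads": int(threads.group(1)) if threads else None,
        "streams": int(streams.group(1)) if streams else None,
        "exception": None,
        "uses_all_threads": None,
    }
    if exc:
        code = int(exc.group(1), 16)
        # Fall back to the raw hex value for codes not in the table.
        summary["exception"] = NTSTATUS_NAMES.get(code, hex(code))
    if summary["threads"] is not None:
        summary["uses_all_threads"] = summary["threads"] >= expected_threads
    return summary
```

Feeding it the text of one run at a time gives a dictionary per run, which makes it easy to see which configurations stop at 8 threads and which ones crash before reporting anything.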

11happy pushed a commit to 11happy/openvino that referenced this issue Dec 23, 2024

### Details:
- *support new windows platform which is a VM (VPS) running on a Hypervisor*
- *using one stream on two sockets*

### Tickets:
- *[issues-27581](openvinotoolkit#27581
@wangleis
Contributor

@BlohoJo Please try the master branch. The fix is not part of OpenVINO 2024.6.0.

@BlohoJo
Author

BlohoJo commented Dec 24, 2024

I greatly apologize, but compiling OpenVINO is beyond my skill set and capability. I can get as far as opening Git Bash and running git clone https://github.com/openvinotoolkit/openvino.git in a directory, which just clones the master branch from GitHub. But compiling it means installing and configuring cmake, Microsoft Visual Studio 2019, Intel Graphics Drivers, etc.

If someone can build the master branch for me and link me to an archive (which has the contents of Lib\site-packages\openvino), I can definitely try it.

Unless I'm missing a much simpler or automated command that will compile the master branch for me.

Again I apologize for my lack of knowledge and skill. 🙁

@wangleis
Contributor

@BlohoJo
Author

BlohoJo commented Dec 26, 2024

I'm really really sorry. I spent a couple of hours trying to get it to work, but to no avail. As you can tell, I'm totally lacking the knowledge and expertise to troubleshoot.

I can't install the nightly build you linked via pip.

(openvino_env) C:\OpenVINO>python -m pip install https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2025.0.0-17699-b0ff7090a30/w_openvino_toolkit_windows_2025.0.0.dev20241223_x86_64.zip
Collecting https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2025.0.0-17699-b0ff7090a30/w_openvino_toolkit_windows_2025.0.0.dev20241223_x86_64.zip
  Downloading https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2025.0.0-17699-b0ff7090a30/w_openvino_toolkit_windows_2025.0.0.dev20241223_x86_64.zip (117.7 MB)
     ------------------------------------- 117.7/117.7 MB 25.2 MB/s eta 0:00:00
ERROR: https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2025.0.0-17699-b0ff7090a30/w_openvino_toolkit_windows_2025.0.0.dev20241223_x86_64.zip does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
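
The pip error above says the archive contains neither `setup.py` nor `pyproject.toml`, which is what pip looks for when it is handed a source archive. A quick way to check a zip before trying to install it is sketched below. The function name `looks_like_python_project` and the helper `make_zip` are made up for illustration, the file names in the demo are hypothetical, and the single-top-level-directory handling is an assumption about how such archives are usually laid out, not a copy of pip's logic.

```python
import io
import zipfile

PROJECT_MARKERS = ("setup.py", "pyproject.toml")


def looks_like_python_project(zip_source):
    """Return True if the zip has setup.py or pyproject.toml at its root.

    If every file sits under one wrapping directory, that directory is
    treated as the root (an assumption about typical archive layout).
    """
    with zipfile.ZipFile(zip_source) as archive:
        names = [n for n in archive.namelist() if not n.endswith("/")]
    tops = {n.split("/", 1)[0] for n in names}
    if len(tops) == 1 and all("/" in n for n in names):
        # Single wrapping directory: look one level down.
        names = [n.split("/", 1)[1] for n in names]
    return any(n in PROJECT_MARKERS for n in names)


def make_zip(paths):
    """Build an in-memory zip with empty files, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path in paths:
            archive.writestr(path, "")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Hypothetical layouts: a source package versus a binary toolkit archive.
    print(looks_like_python_project(make_zip(["pkg-1.0/setup.py"])))
    print(looks_like_python_project(make_zip(["toolkit/runtime/bin/example.dll"])))
```

For a real file, pass its path instead of an in-memory buffer. A result of False suggests the archive is meant to be unpacked and set up by other means rather than installed with pip.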

I tried downloading it manually, but it has a completely different directory structure from the packages I've been installing into the virtual environment using pip. I can build the libs folder by copying and pasting, but nothing else, and benchmark_app just throws errors at everything I try, like this:

(openvino_env) C:\OpenVINO>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\OpenVINO\openvino_env\Scripts\benchmark_app.exe\__main__.py", line 4, in <module>
  File "C:\OpenVINO\openvino_env\lib\site-packages\openvino\tools\benchmark\main.py", line 8, in <module>
    from openvino.runtime import Dimension,properties
  File "C:\OpenVINO\openvino_env\lib\site-packages\openvino\runtime\__init__.py", line 8, in <module>
    from openvino._pyopenvino import get_version
ImportError: DLL load failed while importing _pyopenvino: The specified procedure could not be found.
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\OpenVINO\openvino_env\Scripts\benchmark_app.exe\__main__.py", line 4, in <module>
  File "C:\OpenVINO\openvino_env\lib\site-packages\openvino\tools\benchmark\main.py", line 8, in <module>
    from openvino.runtime import Dimension,properties
  File "C:\OpenVINO\openvino_env\lib\site-packages\openvino\runtime\__init__.py", line 32, in <module>
    from openvino.runtime.ie_api import Core
  File "C:\OpenVINO\openvino_env\lib\site-packages\openvino\runtime\ie_api\__init__.py", line 5, in <module>
    from openvino._ov_api import Core
ModuleNotFoundError: No module named 'openvino._ov_api'

So, to summarize, my problem is:

  • I lack the expertise and knowledge to compile and build the master branch, then install and test it.
  • I lack the expertise and knowledge to install and test the nightly build contained in w_openvino_toolkit_windows_2025.0.0.dev20241223_x86_64.zip, as I can't install it into my virtual environment using pip, or manually (by copying and pasting).

Again, I am very embarrassed and very sorry for my lack of education here.

Is there some way I can get a package to test that I can install using pip?

@wangleis
Contributor

@BlohoJo Please refer to https://docs.openvino.ai/2024/get-started/install-openvino/install-openvino-archive-windows.html to install the archive file of the nightly build.
