
Nano: trace to openvino inference engine need a thread(core) num control option #5608

Closed
TheaperDeng opened this issue Aug 31, 2022 · 4 comments · Fixed by #5705

@TheaperDeng (Contributor)

For onnxruntime, users can control how many cores their accelerated model uses; we don't currently have such an option for OpenVINO.
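For context, onnxruntime exposes this through `SessionOptions.intra_op_num_threads`, and the OpenVINO CPU plugin has an analogous `INFERENCE_NUM_THREADS` property. A minimal sketch of the config-building side only; the helper name and its exact behavior are my assumptions for illustration, not Nano's actual API:

```python
# Sketch: build an OpenVINO CPU-plugin config that caps inference
# threads, mirroring onnxruntime's intra_op_num_threads knob.
# The helper name and None-handling are assumptions for illustration.

def make_ov_cpu_config(thread_num=None):
    """Return a config dict limiting OpenVINO CPU inference threads.

    thread_num=None leaves the decision to the plugin (empty config).
    """
    if thread_num is None:
        return {}
    if thread_num < 1:
        raise ValueError("thread_num must be a positive integer")
    return {"INFERENCE_NUM_THREADS": str(thread_num)}

# In real use, the dict would be passed when compiling the model, e.g.:
#   compiled = ov.Core().compile_model(model, "CPU", make_ov_cpu_config(4))
```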

@rnwang04 (Contributor) commented Sep 1, 2022

Currently, no matter what intra_op_num_threads and inter_op_num_threads are set to, only one core is used. And with more threads, inference is actually slower.

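For reference, the two knobs being swept above are onnxruntime `SessionOptions` fields. A minimal sketch of such a sweep; the session-creation step is left in comments (the model path is not part of this issue), and the sweep values are illustrative, not the exact ones benchmarked:

```python
# Sketch: enumerate (intra_op, inter_op) thread settings to benchmark.
# Applying a pair with the real onnxruntime API would look like:
#   so = onnxruntime.SessionOptions()
#   so.intra_op_num_threads = intra
#   so.inter_op_num_threads = inter
#   sess = onnxruntime.InferenceSession(model_path, sess_options=so)

def thread_sweep(max_threads):
    """Yield (intra_op, inter_op) pairs up to max_threads intra threads."""
    for intra in (1, 2, 4, 8):
        if intra > max_threads:
            break
        for inter in (1, 2):
            yield (intra, inter)

pairs = list(thread_sweep(4))
# -> [(1, 1), (1, 2), (2, 1), (2, 2), (4, 1), (4, 2)]
```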

@jason-dai (Contributor)

> Currently, no matter what intra_op_num_threads and inter_op_num_threads are set to, only one core is used. And with more threads, inference is actually slower.

@qiyuangong Is this behavior expected?

@TheaperDeng (Contributor, Author)

> Currently, no matter what intra_op_num_threads and inter_op_num_threads are set to, only one core is used. And with more threads, inference is actually slower.
>
> @qiyuangong Is this behavior expected?

After a large number of experiments, plus feedback from team members and customers, I found:

ONNXRuntime's core num control is fragile and, on some hardware, effectively useless; it may randomly ignore the user's setting.

For fp32:

  • on some hardware (e.g. i7-9700), the control is stable and follows the setting
  • on some hardware (e.g. CPX), the control is ineffective, and the core usage may change randomly even on the same SKU

For int8:

  • on some hardware (e.g. i7-9700), no matter how many cores you set, it will only use 1 core
  • on some hardware (e.g. ICX), it will use 16 cores if you set fewer than 16, and respects your setting if you set more than 16

We are contacting the onnxruntime support teams (MS and Intel) to find a solution.

@rnwang04 (Contributor) commented Sep 9, 2022

After unsetting KMP_AFFINITY, onnxruntime works fine.
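A sketch of that workaround, assuming the variable must be dropped before onnxruntime spins up its thread pool; the helper name is mine:

```python
# Sketch: remove KMP_AFFINITY so Intel OpenMP affinity pinning no
# longer interferes with onnxruntime's thread placement. This must run
# before the first InferenceSession is created (the pool inherits the
# environment at startup).
import os

def unset_kmp_affinity(env=None):
    """Drop KMP_AFFINITY from env (default os.environ); return the old value."""
    env = os.environ if env is None else env
    return env.pop("KMP_AFFINITY", None)

# Demonstrated on a plain dict standing in for the process environment:
fake_env = {"KMP_AFFINITY": "granularity=fine,compact,1,0", "PATH": "/usr/bin"}
old = unset_kmp_affinity(fake_env)
```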
