Crash at clGetDeviceIDs in OpenVINO 2022.3 #17854
Comments
@venki-thiyag please send detailed information about the driver installed on the system and the output of hello_query_device.exe |
DxDiag (1).txt |
hello_query_device.exe.2232.zip
The crash location is the same as the one I reported. |
open_cl_crash.zip |
@vladimir-paramuzov @p-durandin |
This looks strange. Have you tested OpenVINO without modifications, or on another machine? We have not seen similar problems on our TGL machines. |
Question: The OpenCL version varies from one system to another. Is OpenVINO built with the default OpenCL expected to work on all these machines, or do the OpenCL binaries need to be distributed along with the built OpenVINO binaries? |
I am unable to find clinfo in the binaries built from the OpenVINO source. Where can I get this tool? |
OpenVINO links against the OpenCL ICD loader, and its ABI is stable:
The OpenCL 2.0 standard is used, which is available everywhere.
So, you can use |
Where can I get it for the Windows platform? |
I built OpenVINO 2022.3.0 from the branch releases/2022/3; the crash is observed when running hello_query_device.exe. I am trying to get the crash dump from the test user. |
The backtrace is given above; pasting it here again:
|
Please note that this doesn't happen on all systems; the issue is seen on only a few of them. |
@venki-thiyag have you tried the 2023.0 release? We have improved the device detection logic there. |
@ilya-lavrenov Not tried yet; I will try it ASAP. |
@ilya-lavrenov The issue is still seen with the 2023.0 version. The call stack is the following: `OpenCL.dll!00007ff9e7f34658() Unknown`
The crash dump and symbol files are attached at https://drive.google.com/file/d/1a1-a1XNcpF7NqCTg5bHb0SYESvmhiLUZ/view?usp=sharing |
clinfo can be built from source (https://github.com/Oblomov/clinfo) for Windows. Could you check whether it works in your setup? |
@vladimir-paramuzov I was able to compile clinfo from source using the following steps:
After this, the clinfo executable was generated. On my system clinfo worked fine, but on the user's machine where the crash was observed, only the following output was seen:
This looks like a crash, most likely with the same call stack as the previous one (I have asked the customer for the call stack). Any next steps? |
This means the issue is not with OpenVINO, but rather with the installation / setup on your machine. |
@ilya-lavrenov if we use only the CPU, the crash is not seen. Ideally OpenVINO GPU should report as not supported device and fallback to CPU mode, but should not crash. |
List of machine models where the problem is seen: Dell Vostro 5890 (i5-10400). I will keep updating this as we get more results from customers. |
Use the AUTO device for this; it serves such needs. |
OK, I will try the AUTO device, but my suspicion is that this too will fail, as the AUTO logic will also try to get the available devices and most likely cause the crash. |
@songbell do we have code inside AUTO that can catch a SEGFAULT and fall back to CPU? |
A segfault is out of scope. We can catch inference exceptions from the GPU, but this crash appears to happen in the early plugin engine creation stage, so I suppose AUTO will not help in this case. |
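To make the distinction above concrete, here is a minimal sketch of the kind of failure that can be handled in-process: an exception raised while compiling for a device. The helper name `compile_with_fallback` and the device order are assumptions for illustration, not code from OpenVINO or its AUTO plugin. It only assumes the `core` object exposes `compile_model(model, device_name)`, as OpenVINO's `Core` does.

```python
from typing import Any, Sequence, Tuple


def compile_with_fallback(core: Any, model: Any,
                          devices: Sequence[str] = ("GPU", "CPU")) -> Tuple[Any, str]:
    """Try each device in order; return (compiled_model, device_name).

    Hypothetical helper: only ordinary exceptions are handled here.
    A native crash inside a driver (for example an access violation in
    OpenCL.dll) terminates the process and never reaches the except clause.
    """
    errors = []
    for device in devices:
        try:
            return core.compile_model(model, device), device
        except Exception as exc:
            # Remember why this device was skipped, then try the next one.
            errors.append(f"{device}: {exc}")
    raise RuntimeError("no usable device; " + "; ".join(errors))
```

With the OpenVINO Python API this would presumably be called with a `Core` instance from `openvino.runtime`. It does not help with the crash discussed in this issue, because that crash is not an exception the caller can catch.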
@songbell any other suggestions here? Are we missing any additional check related to OpenCL GPU capability on system? |
@songbell @vladimir-paramuzov gentle ping. |
@vladimir-paramuzov Can GPU support "OpenVINO GPU should report as not supported device and fallback to CPU mode, but should not crash." in this scenario as @ilya-lavrenov mentioned? |
@peterchen-intel, @songbell, @avitial It looks like we can technically add a custom signal handler, but that may not be a good idea, at least on Linux:
On Windows, it seems that the Structured Exception Handling (SEH) API may allow the process to continue safely after catching certain exception types, but there are also noncontinuable exceptions, and I am not sure which one is raised in this particular case. I have not studied the SEH API in depth, so I am not sure whether it has side effects similar to POSIX signals. From my point of view, it is better to update the drivers or manually disable the GPU plugin (for example, by removing its DLL) on such systems than to risk unwanted side effects from a recovery attempt. But if someone knows a good, safe (and ideally portable) way to recover from a segmentation fault, feel free to create a PR; contributions are welcome. |
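One application-side workaround that avoids signal handlers and SEH entirely is to run the device query in a separate process and inspect its exit status, so that a driver crash only kills the child. The sketch below is an assumption about how an application could do this, not something proposed in this thread or implemented in OpenVINO. The names `probe_survives` and `pick_device` are hypothetical, and the probe command would be something like the `hello_query_device.exe` sample mentioned above.

```python
import subprocess
import sys
from typing import Sequence


def probe_survives(cmd: Sequence[str], timeout_s: float = 30.0) -> bool:
    """Run cmd in a child process; True only if it exits with code 0.

    A crash in the child shows up as a non-zero (or negative) return
    code instead of taking down the calling process.
    """
    try:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        # Probe binary missing, not executable, or hung.
        return False
    return result.returncode == 0


def pick_device(probe_cmd: Sequence[str]) -> str:
    """Hypothetical policy: use GPU only if the probe ran cleanly."""
    return "GPU" if probe_survives(probe_cmd) else "CPU"


if __name__ == "__main__":
    # Stand-in probe: a child Python that aborts, simulating a crashing driver.
    crashing_probe = [sys.executable, "-c", "import os; os.abort()"]
    print(pick_device(crashing_probe))
```

The cost is one extra process launch at startup, and a clean probe run does not guarantee that later GPU calls are safe. It only filters out machines where device enumeration itself crashes.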
System information (version)
Detailed description
Crash callstack:
Crashing function inside OpenVINO, filename: opencl.hpp, line no: 2580
Attaching the dxdiag of one of the systems where the issue was observed. It looks like the issue is observed on all systems with OpenCL 2.2.8.
DxDiag.txt
OpenVINO 2022.3 was custom compiled so that it could be consumed from an Electron application; there were some path issues that required fixing.
Most likely the issue is due to an incompatible OpenCL version; on my system the OpenCL version is 3.0.3.