SYCL backend support Multi-card #5282

NeoZhangJianyu · 2024-02-02T12:01:33Z

Discussed in #5277

Originally posted by airMeng February 2, 2024
Feel free to drop a note and let us know if you have any feature requests or bugs (even unconfirmed).

Multi-card Support
Multi-batch Support #5272
CI test error when more than one GPU is detected and used.
The current code returns all SYCL devices, including CPU, GPU (level-zero, opencl), and FPGA. The SYCL backend only supports GPU, so the CI test fails when it runs on other devices.
Support the no-mmap parameter in other applications.
There is a known SYCL issue: memcpy() from host (mmap) to device hangs in some cases. It is not resolved yet. A workaround is to not use mmap. I have handled it in llama-bench (added the --mmap parameter). We need to add it to more applications in examples.
Clean up code for warnings and unused macros and variables.
Suggest handling this after multi-card support is finished. Much of the currently unused code will be useful for the multi-card feature.
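
To illustrate the CI item above, here is a minimal sketch of filtering a device list down to GPUs. It deliberately avoids the real SYCL API so it compiles anywhere: `DeviceKind`, `DeviceInfo`, and `filter_gpus` are hypothetical names made up for this example, not types or functions from the backend.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a SYCL device record (not the backend's real type).
enum class DeviceKind { CPU, GPU, FPGA };

struct DeviceInfo {
    int id;            // position in the full SYCL device list
    std::string name;  // human-readable device name
    DeviceKind kind;   // device class reported by the runtime
};

// Keep only GPU devices, preserving their original IDs and order.
std::vector<DeviceInfo> filter_gpus(const std::vector<DeviceInfo>& all) {
    std::vector<DeviceInfo> gpus;
    for (const DeviceInfo& d : all) {
        if (d.kind == DeviceKind::GPU) {
            gpus.push_back(d);
        }
    }
    return gpus;
}
```

With a filter like this in front of device enumeration, a CI machine that also exposes a CPU or an FPGA emulator would only hand GPU entries to the backend.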

Also let's know if you have taken any tasks here.

cc @NeoZhangJianyu @luoyu-intel @abhilash1910

NeoZhangJianyu · 2024-03-05T15:43:14Z

It's fixed by PR: #5806

Support multiple GPUs (split mode) on SYCL backend.
Split mode: [none, layer] are supported; [row] is not supported yet and is under development.
Unify the GPU settings with the cuBLAS backend:

Support setting the main GPU with --main-gpu.
Support detecting all GPUs that use level-zero and share the top Max compute units value.
Remove the use of GGML_SYCL_DEVICE to set the main GPU.

Format the device list output, for example:
found 6 SYCL devices:
|ID| Name |compute capability|Max compute units|Max work group|Max sub group|Global mem size|
|--|---------------------------------------------|------------------|-----------------|--------------|-------------|---------------|
| 0| Intel(R) Data Center GPU Flex 170| 1.3| 512| 1024| 32| 16225243136|
| 1| Intel(R) FPGA Emulation Device| 1.2| 64| 67108864| 64| 540713414656|
| 2| Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz| 3.0| 64| 8192| 64| 540713414656|
| 3| Intel(R) Data Center GPU Flex 170| 3.0| 512| 1024| 32| 16225243136|
| 4| Intel(R) Data Center GPU Flex 170| 3.0| 512| 1024| 32| 16225243136|
| 5| Intel(R) Data Center GPU Flex 170| 1.3| 512| 1024| 32| 16225243136|
detect 2 SYCL GPUs: [0,5] with Max compute units:512
Support OPs:
hardsigmoid
hardswish
pool2d
Use the device index to set/get internal GPU data,
same as the cuBLAS backend.
Use the device ID to set/get GPU device info.
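
The detection rule described above (level-zero GPUs that share the top Max compute units value) and the device ID vs. device index distinction can be sketched roughly as follows. This is not the code from the PR: `SyclDevice`, `select_gpu_ids`, and `device_index` are hypothetical names, and the backend labels "level_zero" and "opencl" are assumptions for illustration, since the device table does not print the backend.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical device record; the field names are assumptions for this sketch.
struct SyclDevice {
    int id;                 // device ID: position in the full SYCL device list
    std::string backend;    // assumed labels: "level_zero" or "opencl"
    bool is_gpu;            // true for GPU devices only
    int max_compute_units;  // "Max compute units" column of the device table
};

// Return the IDs of level-zero GPUs whose max compute units equal the
// highest value found among level-zero GPUs.
std::vector<int> select_gpu_ids(const std::vector<SyclDevice>& devices) {
    int top = 0;
    for (const SyclDevice& d : devices) {
        if (d.is_gpu && d.backend == "level_zero") {
            top = std::max(top, d.max_compute_units);
        }
    }
    std::vector<int> ids;
    for (const SyclDevice& d : devices) {
        if (d.is_gpu && d.backend == "level_zero" && d.max_compute_units == top) {
            ids.push_back(d.id);
        }
    }
    return ids;
}

// Map a device ID to its index in the selected GPU list, or -1 if the
// device was not selected. The index would address per-GPU internal data.
int device_index(const std::vector<int>& selected_ids, int device_id) {
    auto it = std::find(selected_ids.begin(), selected_ids.end(), device_id);
    return it == selected_ids.end() ? -1 : static_cast<int>(it - selected_ids.begin());
}
```

If devices 0 and 5 in the example list are assumed to be the level-zero entries, `select_gpu_ids` yields the IDs 0 and 5, and `device_index` maps ID 5 to index 1, which matches the "detect 2 SYCL GPUs: [0,5]" line in the log.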

NeoZhangJianyu added the Intel GPU label Feb 2, 2024

NeoZhangJianyu self-assigned this Feb 2, 2024

NeoZhangJianyu closed this as completed Mar 5, 2024
