SYCL backend support Multi-card #5282

Closed
5 tasks
NeoZhangJianyu opened this issue Feb 2, 2024 Discussed in #5277 · 1 comment
@NeoZhangJianyu
Collaborator

Discussed in #5277

Originally posted by airMeng February 2, 2024
Feel free to drop a note and let us know if you have any feature requests or bugs (even unconfirmed ones).

  • Multi-card Support
  • Multi-batch Support #5272
  • CI test error when more than one GPU is detected and used.
    The current code returns all SYCL devices, including CPU, GPU (level-zero, opencl), and FPGA. The SYCL backend only supports GPU, so the CI test fails when it runs on the other devices.
  • Support a no-mmap parameter in other applications.
    There is a known SYCL issue: memcpy() from host (mmap) memory to the device hangs in some cases. It is not resolved yet. The workaround is to not use mmap. I have handled it in llama-bench (added the --mmap parameter). We need to add it to more applications in examples.
  • Clean up code for warnings and unused macros and variables.
    I suggest handling this after multi-card support is finished, since much of the currently unused code will be useful for that feature.
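
To illustrate the CI item above, here is a minimal sketch of filtering a mixed device list down to GPUs before any test runs. It is plain C++ with no SYCL dependency; `DeviceType`, `Device`, and `keep_gpus` are hypothetical names made up for this example and are not taken from the llama.cpp code.

```cpp
#include <string>
#include <vector>

// Hypothetical device categories for illustration only; a real SYCL
// runtime reports the device type through its own query.
enum class DeviceType { CPU, GPU, FPGA };

// Hypothetical record describing one enumerated device.
struct Device {
    int id;
    std::string name;
    DeviceType type;
};

// Return only the GPU entries, preserving their original order.
std::vector<Device> keep_gpus(const std::vector<Device>& all) {
    std::vector<Device> gpus;
    for (const Device& d : all) {
        if (d.type == DeviceType::GPU) {
            gpus.push_back(d);
        }
    }
    return gpus;
}
```

Running the tests only against the result of such a filter would keep CPU and FPGA devices out of the CI run.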

Also let us know if you have taken any of the tasks here.

cc @NeoZhangJianyu @luoyu-intel @abhilash1910

@NeoZhangJianyu
Collaborator Author

It's fixed by PR: #5806

  1. Support multiple GPUs (split mode) on the SYCL backend.
    Split modes [none, layer] are supported; [row] is not supported yet and is under development.

  2. Unify the GPU settings with the cuBLAS backend:

Support setting the main GPU with --main-gpu.
Support detecting all GPUs that use level-zero and share the top Max compute units value.
Remove the use of GGML_SYCL_DEVICE to set the main GPU.

  3. Format the device list output, for example:
    found 6 SYCL devices:
    |ID|Name                                    |compute capability|Max compute units|Max work group|Max sub group|Global mem size|
    |--|----------------------------------------|------------------|-----------------|--------------|-------------|---------------|
    | 0|Intel(R) Data Center GPU Flex 170       |               1.3|              512|          1024|           32|    16225243136|
    | 1|Intel(R) FPGA Emulation Device          |               1.2|               64|      67108864|           64|   540713414656|
    | 2|Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz|               3.0|               64|          8192|           64|   540713414656|
    | 3|Intel(R) Data Center GPU Flex 170       |               3.0|              512|          1024|           32|    16225243136|
    | 4|Intel(R) Data Center GPU Flex 170       |               3.0|              512|          1024|           32|    16225243136|
    | 5|Intel(R) Data Center GPU Flex 170       |               1.3|              512|          1024|           32|    16225243136|
    detect 2 SYCL GPUs: [0,5] with Max compute units:512

  4. Support OPs:
    hardsigmoid
    hardswish
    pool2d

  5. Use the device index to set/get internal GPU data,
    the same as the cuBLAS backend.

  6. Use the device ID to set/get GPU device info.
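
The selection rule described above (level-zero GPUs that share the top Max compute units value) and the split between device index and device ID can be sketched as follows. This is a standalone illustration, not the code from the PR: `SyclDeviceInfo`, `select_gpus`, `device_id_from_index`, and the backend labels `"level_zero"` / `"opencl"` are all assumptions made for the example. The device list in this thread does not say which backend each row uses, so a caller has to supply that.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one row of the device list.
struct SyclDeviceInfo {
    int id;                   // device ID: position in the full device list
    bool is_gpu;
    std::string backend;      // assumed labels: "level_zero", "opencl"
    int max_compute_units;
};

// Return the IDs of level-zero GPUs that share the highest
// max-compute-units value among level-zero GPUs.
std::vector<int> select_gpus(const std::vector<SyclDeviceInfo>& devices) {
    int top = 0;
    for (const auto& d : devices) {
        if (d.is_gpu && d.backend == "level_zero") {
            top = std::max(top, d.max_compute_units);
        }
    }
    std::vector<int> ids;
    for (const auto& d : devices) {
        if (d.is_gpu && d.backend == "level_zero" &&
            d.max_compute_units == top) {
            ids.push_back(d.id);
        }
    }
    return ids;
}

// Map a device index (position in the selected list) to its device ID.
// Returns -1 when the index is out of range.
int device_id_from_index(const std::vector<int>& ids, int index) {
    if (index < 0 || index >= static_cast<int>(ids.size())) {
        return -1;
    }
    return ids[index];
}
```

If devices 0 and 5 in a six-device list are the only level-zero GPUs and both report 512 compute units, `select_gpus` returns the IDs 0 and 5; device index 1 then maps to device ID 5.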
