sycl : try to fix SYCL after IQ1_S changes #5995

ggerganov · 2024-03-11T11:32:24Z

No description provided.

ggerganov · 2024-03-11T13:25:46Z

Could anyone with SYCL hardware run test-backend-ops and check if this implementation works correctly?

abhilash1910 · 2024-03-11T13:54:08Z

Thanks @ggerganov for adding the new changes. Some further changes are still needed on top of them; I will apply them once I am near my laptop.

ggml-sycl.cpp

ggerganov · 2024-03-12T08:10:06Z

ggml-sycl.cpp

+    const float delta = x[i].qh[ib] & 0x8000 ? -1 - IQ1S_DELTA : -1 + IQ1S_DELTA;
+    const float d = (float)x[i].d * (2*((x[i].qh[ib] >> 12) & 7) + 1);
+    uint32_t grid32[2]; const int8_t * q = (const int8_t *)grid32;
+    grid32[0] = iq1s_grid_gpu[x[i].qs[4*ib+il] | (((x[i].qh[ib] >> 3*il) & 7) << 8)];
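For readers following the bit manipulation in this hunk, here is a minimal host-side sketch (plain C++, not SYCL kernel code) of how the added lines unpack one 16-bit qh word. The struct and function names are hypothetical and do not appear in ggml-sycl.cpp; only the shifts and masks are taken from the diff above.

```cpp
#include <cstdint>

// Hypothetical container for the fields the diff reads from x[i].qs / x[i].qh.
struct Iq1sFields {
    int  grid_index;  // 8 low bits from qs, 3 high bits from qh
    int  scale_mult;  // odd multiplier 2*s + 1 applied to the block scale d
    bool delta_neg;   // bit 15 of qh: picks -1 - IQ1S_DELTA over -1 + IQ1S_DELTA
};

// qs_byte stands for x[i].qs[4*ib + il], qh for x[i].qh[ib], il is 0..3.
inline Iq1sFields decode_iq1s(uint8_t qs_byte, uint16_t qh, int il) {
    Iq1sFields f;
    // Bits 0..11 of qh hold four 3-bit groups, one per il.
    f.grid_index = qs_byte | (((qh >> (3 * il)) & 7) << 8);
    // Bits 12..14 hold a 3-bit scale s, used as 2*s + 1.
    f.scale_mult = 2 * ((qh >> 12) & 7) + 1;
    // Bit 15 selects the sign of the delta offset.
    f.delta_neg = (qh & 0x8000) != 0;
    return f;
}
```

For example, with a qs byte of 0x12, qh = 0xB028 and il = 1, this gives grid index 0x512, a scale multiplier of 7, and the negative-delta branch.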


AFAICT, we can't use the iq1s_grid_gpu constant table directly; that is why we pass the iq1s_grid pointer to this kernel.

Btw, there have been more changes to IQ1_S (#5999) since I opened this PR that should also be reflected. But I don't have the means to run tests, so implementing this blindly is difficult.

My suggestion is to push empty IQ1_S kernels on master so that the CI becomes green, and your team can then work on implementing the kernels and verifying that test-backend-ops runs correctly.

abhilash1910 · Mar 12, 2024 (edited)

Agreed, let me try to build a solution. Thanks for the pointer. In the meantime, we can revert the initial solution using grid1 & 2.

* sycl : try to fix after IQ1_S changes * sycl : iq1s_grid -> iq1s_grid_gpu * sycl : fix grid type

ikawrakow

Please remove this incorrect implementation of IQ1_S

ikawrakow · 2024-03-28T11:22:02Z

ggml-sycl.cpp

-    const int8_t * grid = (const int8_t *)(iq1s_grid + (x[i].qs[i8] | ((h & 8) << 5)));
-    const float d = (float)x[i].d * (2*(h & 7) + 1);
-    for (int j = 0; j < 8; ++j) y[j] = d * grid[j];
+    const uint8_t  * qs = x[i].qs + 8*ib;
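For comparison, the removed lines derive the grid index and the scale from a small value h. How h is extracted is not shown in this hunk, so the sketch below takes it as an input; the struct and function names are hypothetical, and this is plain host-side C++ rather than kernel code.

```cpp
#include <cstdint>

// Hypothetical container for the two quantities the removed lines compute.
struct OldIq1sFields {
    int grid_index;  // offset added to the iq1s_grid pointer
    int scale_mult;  // odd multiplier 2*(h & 7) + 1 applied to the block scale d
};

// qs_byte stands for x[i].qs[i8]; h is assumed to be computed by code not shown here.
inline OldIq1sFields decode_old_iq1s(uint8_t qs_byte, int h) {
    OldIq1sFields f;
    // Bit 3 of h is moved up to bit 8, extending the 8-bit qs value to 9 bits.
    f.grid_index = qs_byte | ((h & 8) << 5);
    // The low three bits of h give the scale multiplier.
    f.scale_mult = 2 * (h & 7) + 1;
    return f;
}
```

With a qs byte of 0x34 and h = 0xD, this gives grid index 0x134 and a scale multiplier of 11.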


This is wrong. There are no signs in IQ1_S, and there never have been. This bogus implementation has been sitting on the master branch for 2 weeks now. PR #6014, which actually fixes it, has been sitting unreviewed for 2 weeks.

abhilash1910 · Mar 28, 2024

The current implementation does not have any effect on IQ1_S on our backend, and we are looking at the proposed solutions to identify a proper fix. I would not be so confident in saying that it fixes the problem: with the changes mentioned here, I was not able to get correct runs when compared with NVIDIA.
I would also ask that strong language be avoided on public PRs if the intention is to collaborate. As for the code, it was an older version of your implementation, which did not give any results on our backend, and we are investigating a proper solution.

sycl : try to fix after IQ1_S changes

77d586f

ggerganov force-pushed the gg/try-fix-sycl-iq1_s branch from 8c9f7d9 to 77d586f Compare March 11, 2024 11:49

ggerganov added 2 commits March 11, 2024 14:50

sycl : iq1s_grid -> iq1s_grid_gpu

cb5a702

sycl : fix grid type

76be02a

ggerganov marked this pull request as ready for review March 11, 2024 13:25

ggerganov requested a review from abhilash1910 March 11, 2024 13:25

ggerganov marked this pull request as draft March 11, 2024 16:17

abhilash1910 reviewed Mar 12, 2024

ggerganov commented Mar 12, 2024

ggerganov force-pushed the gg/try-fix-sycl-iq1_s branch from af93a92 to 76be02a Compare March 12, 2024 09:13

ggerganov marked this pull request as ready for review March 12, 2024 09:14

ggerganov merged commit 48358b2 into master Mar 12, 2024
102 of 107 checks passed

NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024

sycl : update IQ1_S kernels (WIP - not working!) (ggerganov#5995)

662211b

* sycl : try to fix after IQ1_S changes * sycl : iq1s_grid -> iq1s_grid_gpu * sycl : fix grid type

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

sycl : update IQ1_S kernels (WIP - not working!) (ggerganov#5995)

64dc622

* sycl : try to fix after IQ1_S changes * sycl : iq1s_grid -> iq1s_grid_gpu * sycl : fix grid type

ikawrakow reviewed Mar 28, 2024

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

sycl : update IQ1_S kernels (WIP - not working!) (ggerganov#5995)

7ea6c12

* sycl : try to fix after IQ1_S changes * sycl : iq1s_grid -> iq1s_grid_gpu * sycl : fix grid type
