Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sycl : try to fix SYCL after IQ1_S changes #5995

Merged
merged 3 commits into from
Mar 12, 2024
Merged

Conversation

ggerganov
Copy link
Owner

No description provided.

@ggerganov ggerganov marked this pull request as ready for review March 11, 2024 13:25
@ggerganov
Copy link
Owner Author

Could anyone with SYCL hardware run test-backend-ops and check if this implementation works correctly?

@abhilash1910
Copy link
Collaborator

Thanks @ggerganov for adding the new changes , there are some changes needed on it. I will apply once i am near my laptop.

@ggerganov ggerganov marked this pull request as draft March 11, 2024 16:17
ggml-sycl.cpp Outdated
const float delta = x[i].qh[ib] & 0x8000 ? -1 - IQ1S_DELTA : -1 + IQ1S_DELTA;
const float d = (float)x[i].d * (2*((x[i].qh[ib] >> 12) & 7) + 1);
uint32_t grid32[2]; const int8_t * q = (const int8_t *)grid32;
grid32[0] = iq1s_grid_gpu[x[i].qs[4*ib+il] | (((x[i].qh[ib] >> 3*il) & 7) << 8)];
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, we can't use iq1s_grid_gpu constant table directly - that is why we pass the iq1s_grid pointer to this kernel.

Btw, there have been more changes to IQ1_S (#5999) since I opened this PR that should also be reflected. But I don't have means to do tests, so implementing this blindly is difficult

My suggestion is to push empty IQ1_S kernels on master so that the CI becomes green and your team can work on implementing the kernels and verifying that test-backend-ops runs correctly

Copy link
Collaborator

@abhilash1910 abhilash1910 Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed , let me try to build a solution. Thanks for the pointer. In the meantime, we can revert the initial solution using grid1 & 2

@ggerganov ggerganov marked this pull request as ready for review March 12, 2024 09:14
@ggerganov ggerganov merged commit 48358b2 into master Mar 12, 2024
102 of 107 checks passed
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
* sycl : try to fix after IQ1_S changes

* sycl : iq1s_grid -> iq1s_grid_gpu

* sycl : fix grid type
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* sycl : try to fix after IQ1_S changes

* sycl : iq1s_grid -> iq1s_grid_gpu

* sycl : fix grid type
Copy link
Contributor

@ikawrakow ikawrakow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this incorrect implementation of IQ1_S

const int8_t * grid = (const int8_t *)(iq1s_grid + (x[i].qs[i8] | ((h & 8) << 5)));
const float d = (float)x[i].d * (2*(h & 7) + 1);
for (int j = 0; j < 8; ++j) y[j] = d * grid[j];
const uint8_t * qs = x[i].qs + 8*ib;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. There are no signs in IQ1_S and have never been. This bogus implementation has been sitting on the master branch for 2 weeks now. PR #6014 that actually fixes it, has been sitting unreviewed for 2 weeks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This current implementation does not have any effect on 1qs on our backend, and we are taking a look at the proposed solutions to identify a proper fix. I would not be so confident to say that it fixes it as with the changes mentioned here , I was not able to get the correct runs when compared with nv .
I would also request to avoid using strong language on public PRs if the intention is to collaborate, and in terms of the code it was an older version of your implementation which actually did not give any results on our backend and we are investigating a proper solution.

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* sycl : try to fix after IQ1_S changes

* sycl : iq1s_grid -> iq1s_grid_gpu

* sycl : fix grid type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants