Implementations for Q4_0_8_8 quantization based functions in AVX2 SIMD architecture #8713

Merged

ggerganov merged 4 commits into ggerganov:master from Srihari-mcw:block_interleaving_q4_0_8_8_avx2_implementation

Sep 4, 2024

Commits on Sep 4, 2024

Add AVX2 based implementations for quantize_q8_0_4x8, ggml_gemv_q4_0_8x8_q8_0 and ggml_gemm_q4_0_8x8_q8_0 functions

Srihari-mcw authored and Srihari-mcw committed Sep 4, 2024
0c81b7b

Update code to fix issues occurring due to non alignment of elements to be processed as multiple of 16 in MSVC

Srihari-mcw authored and Srihari-mcw committed Sep 4, 2024
49af3f5

Update comments and indentation

Srihari-mcw authored and Srihari-mcw committed Sep 4, 2024
364dc96

Make updates to reduce number of load instructions

Srihari-mcw authored and Srihari-mcw committed Sep 4, 2024
c950fc3