Releases · ggerganov/llama.cpp
b2948
[SYCL] Update SYCL upscale operation (#7321)
* Update SYCL upscale operation
* Formatting
* Remove messages
b2946
ggml-opencl, llama: using reserve() if count already known (#7272)
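For context, calling reserve() on a std::vector when the final element count is known up front avoids the repeated grow-and-copy cycles that amortized push_back growth would otherwise incur. A minimal sketch of the pattern (the build_ids function and token_count parameter are illustrative, not from the llama.cpp source):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative sketch: when the element count is known in advance,
// a single reserve() call performs one allocation up front, so the
// push_back loop never triggers a reallocation.
std::vector<int32_t> build_ids(std::size_t token_count) {
    std::vector<int32_t> ids;
    ids.reserve(token_count);                    // one allocation up front
    for (std::size_t i = 0; i < token_count; ++i) {
        ids.push_back(static_cast<int32_t>(i));  // no reallocation here
    }
    return ids;
}
```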
b2945
ggml : add loongarch lsx and lasx support (#6454)
* add loongarch lsx and lasx optimize code
* Add loongarch compilation support to makefile
* revert stb_image.h
* opt bytes_from_nibbles_32 and sum_i16_pairs_float
* fix undeclared
* format code
* update
* update 2
Co-authored-by: Jinyang He <[email protected]>
b2943
server : return error on too large embedding input (#7389)
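The general pattern behind this fix is validating input before inference: reject an embedding request whose token count exceeds the context window instead of silently truncating or failing mid-computation. A hedged sketch of such a check (the names validate_embedding_input, tokens, and n_ctx are hypothetical stand-ins, not the server's actual internals):

```cpp
#include <cstdio>
#include <vector>

// Illustrative sketch only: check the prompt length against the
// context size before running an embedding request, and signal an
// error to the caller rather than proceeding.
bool validate_embedding_input(const std::vector<int> & tokens, int n_ctx) {
    if ((int) tokens.size() > n_ctx) {
        std::fprintf(stderr,
            "error: input has %zu tokens, exceeds context size %d\n",
            tokens.size(), n_ctx);
        return false;  // caller should send an error response
    }
    return true;
}
```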
b2941
Add provisions for Windows support for BF16 code including CMake prov…
b2940
llama : remove MPI backend (#7395)
b2939
quantize : fix --keep-split check (#7374)
b2938
Vulkan Embedding Fix (#7360)
* Fix empty Vulkan host buffers
* Add fp32 fp16 matmul shader
* Fix matmul shader alignment
* Remove deprecated tensor->backend uses
* Fix Vulkan validation errors on embedding models with no offloaded layers
* Fix Vulkan llava segfault when not offloading layers
b2937
ggml : fix another case of quants NaNs (#7387)
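A common source of NaNs in block quantization is dividing by a zero scale when a block is all zeros, and guarding the divisor is the usual remedy. A minimal sketch of that failure mode and guard (illustrative only, not the actual #7387 patch):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Illustrative sketch: if every value in a block is zero, amax == 0,
// the scale d == 0, and x / d would produce NaN (0/0). Using an
// inverse scale of 0 in that case keeps the quantized values defined.
void quantize_block(const float * x, int n, int8_t * q, float & d) {
    float amax = 0.0f;
    for (int i = 0; i < n; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }
    d = amax / 127.0f;
    const float id = (d != 0.0f) ? 1.0f / d : 0.0f;  // guard against 0/0 -> NaN
    for (int i = 0; i < n; ++i) {
        q[i] = (int8_t) std::lround(x[i] * id);
    }
}
```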
b2936
ggml: implement quantized KV cache for FA (#7372)
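Storing the KV cache in a block-quantized format for the FA (flash attention) path trades a little precision for a large memory saving: 8-bit values with a shared per-block scale use roughly half the memory of fp16. A conceptual round-trip in the style of ggml's Q8_0 blocks (32 values per scale); this is a toy sketch, not the actual kernel code:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Conceptual sketch of Q8_0-style block quantization: 32 floats share
// one scale, each value stored as a signed 8-bit integer. Applied to
// K/V tensors this roughly halves cache memory versus fp16.
struct block_q8 {
    float  d;        // per-block scale (ggml stores this as fp16)
    int8_t qs[32];   // quantized values
};

block_q8 quantize(const float * x) {
    float amax = 0.0f;
    for (int i = 0; i < 32; ++i) amax = std::max(amax, std::fabs(x[i]));
    block_q8 b;
    b.d = amax / 127.0f;
    const float id = (b.d != 0.0f) ? 1.0f / b.d : 0.0f;
    for (int i = 0; i < 32; ++i) b.qs[i] = (int8_t) std::lround(x[i] * id);
    return b;
}

// Dequantize back to float when the attention kernel consumes the cache.
void dequantize(const block_q8 & b, float * y) {
    for (int i = 0; i < 32; ++i) y[i] = b.d * (float) b.qs[i];
}
```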