Releases: ggerganov/llama.cpp

b2948

20 May 13:44
6bf9b66
[SYCL] Update SYCL upscale operation (#7321)

* Update SYCL upscale operation

* Formatting

* Remove messages

b2946

20 May 11:08
213e90e
ggml-opencl, llama : use reserve() when the count is already known (#7272)
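
A minimal C++ sketch of the pattern this change applies: when the final element count is known before the loop, a single reserve() call avoids the repeated reallocations that push_back would otherwise trigger. The container and count here are illustrative, not the actual llama.cpp buffers.

```cpp
#include <cstdio>
#include <vector>

int main() {
    const size_t n_items = 1024;   // count known before the loop starts
    std::vector<float> scales;
    scales.reserve(n_items);       // one allocation up front, no regrowth below
    for (size_t i = 0; i < n_items; ++i) {
        scales.push_back(1.0f / float(i + 1));   // never reallocates here
    }
    std::printf("size=%zu capacity=%zu\n", scales.size(), scales.capacity());
    return 0;
}
```

reserve() never shrinks capacity and is a no-op when the capacity already suffices, which is why the change is safe to apply wherever the count is known.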

b2945

20 May 10:34
65c5820
ggml : add LoongArch LSX and LASX support (#6454)

* Add LoongArch LSX and LASX optimized code (sketched after this entry)

* Add LoongArch compilation support to the Makefile

* Revert stb_image.h

* Optimize bytes_from_nibbles_32 and sum_i16_pairs_float

* Fix undeclared identifiers

* Format code

* Update

* Update 2

---------

Co-authored-by: Jinyang He <[email protected]>
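
A hedged sketch of the compile-time gating such SIMD ports typically use: GCC defines __loongarch_sx when compiling with -mlsx and ships the 128-bit LSX intrinsics in lsxintrin.h. The vec_add_f32 helper below is illustrative only; ggml's actual LoongArch kernels cover the quantization and dot-product paths, not a plain vector add.

```cpp
#include <cstddef>

#if defined(__loongarch_sx)
#include <lsxintrin.h>   // GCC's 128-bit LSX intrinsics
#endif

// Add two float arrays; uses LSX when compiled with -mlsx, scalar otherwise.
static void vec_add_f32(float * dst, const float * a, const float * b, size_t n) {
    size_t i = 0;
#if defined(__loongarch_sx)
    // 4 floats per iteration in a 128-bit LSX register.
    for (; i + 4 <= n; i += 4) {
        __m128 va = (__m128)__lsx_vld(a + i, 0);
        __m128 vb = (__m128)__lsx_vld(b + i, 0);
        __lsx_vst((__m128i)__lsx_vfadd_s(va, vb), dst + i, 0);
    }
#endif
    for (; i < n; ++i) {   // scalar tail, and the fallback on other targets
        dst[i] = a[i] + b[i];
    }
}
```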

b2943

20 May 07:08
e932094
server : return an error when the embedding input is too large (#7389)
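
A hedged sketch of the kind of guard this change adds: rather than crashing or silently truncating, the server can reject an embedding request whose token count exceeds the batch capacity and report why. All names below (check_embedding_input, n_batch, Result) are illustrative, not the server's actual API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct Result {
    bool ok;
    std::string error;   // filled when ok == false
};

// Reject inputs that cannot fit into a single physical batch.
static Result check_embedding_input(const std::vector<int32_t> & tokens, uint32_t n_batch) {
    if (tokens.size() > n_batch) {
        return { false, "input is too large to process: increase the physical batch size" };
    }
    return { true, "" };
}
```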

b2941

20 May 04:33
33c8d50
Add provisions for Windows support for BF16 code including CMake prov…

b2940

20 May 00:29
d359f30
llama : remove MPI backend (#7395)

b2939

19 May 23:34
1ea2a00
quantize : fix --keep-split check (#7374)

b2938

19 May 23:15
f030ec1
Vulkan Embedding Fix (#7360)

* Fix empty Vulkan host buffers

* Add fp32/fp16 matmul shader

* Fix matmul shader alignment

* Remove deprecated tensor->backend uses

* Fix Vulkan validation errors on embedding models with no offloaded layers

* Fix Vulkan llava segfault when not offloading layers

b2937

19 May 22:19
e4e6f67
ggml : fix another case of NaNs in quants (#7387)
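
For context, a hedged sketch of how block quantization can emit NaNs and how that is typically guarded: if a block's maximum magnitude is zero (or non-finite), the reciprocal scale becomes inf, and 0 * inf yields NaN, so the reciprocal must be forced to a safe value. The block layout below is illustrative, not a real ggml quant format.

```cpp
#include <cmath>
#include <cstdint>

// Quantize one block of n floats to int8 with a shared scale, without NaNs.
static void quantize_block_q8(const float * x, int8_t * q, float * scale, int n) {
    float amax = 0.0f;   // max |x[i]| over the block
    for (int i = 0; i < n; ++i) {
        amax = std::fmax(amax, std::fabs(x[i]));
    }
    const float d  = amax / 127.0f;
    // Guard: if d is 0 (all-zero block) or non-finite, 1/d would be inf/NaN
    // and x[i] * id would poison the output with NaNs.
    const float id = (d != 0.0f && std::isfinite(d)) ? 1.0f / d : 0.0f;
    *scale = d;
    for (int i = 0; i < n; ++i) {
        q[i] = (int8_t) std::round(x[i] * id);
    }
}
```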

b2936

19 May 22:19
5ca49cb
ggml : implement quantized KV cache for FlashAttention (#7372)
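
A hedged sketch of the idea: instead of holding K/V entries in fp16/fp32, the cache can store each block as int8 values plus one scale, and dequantize when the attention kernel reads them, shrinking KV memory roughly 3.5x versus fp32 for this layout. The block size and structs below are illustrative; the actual change wires ggml's existing quant types into the FlashAttention path.

```cpp
#include <cmath>
#include <cstdint>

// One 32-value block: 36 bytes here vs 128 bytes as fp32.
struct BlockQ8 {
    float  d;        // per-block scale
    int8_t qs[32];   // quantized values
};

// Quantize 32 floats into one block when they enter the cache.
static void kv_store(const float * v, BlockQ8 * blk) {
    float amax = 0.0f;
    for (int i = 0; i < 32; ++i) amax = std::fmax(amax, std::fabs(v[i]));
    blk->d = amax / 127.0f;
    const float id = blk->d != 0.0f ? 1.0f / blk->d : 0.0f;
    for (int i = 0; i < 32; ++i) blk->qs[i] = (int8_t) std::round(v[i] * id);
}

// Dequantize one block when the attention kernel reads it back.
static void kv_load(const BlockQ8 * blk, float * v) {
    for (int i = 0; i < 32; ++i) v[i] = blk->qs[i] * blk->d;
}
```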