
Releases: MarcusDunn/llama.cpp

b2582 (31 Mar 22:09, c50a82c)
readme : update hot topics

b2293 (28 Feb 18:15, 08c5ee8)
llama : remove deprecated API (#5770)

ggml-ci
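
For readers tracking this change, a minimal migration sketch, assuming the removed API included the long-deprecated single-call evaluation path (`llama_eval` here is a stand-in; which exact functions #5770 removed is not shown above):

```c
#include "llama.h"

// Hypothetical migration off a deprecated entry point such as
//   llama_eval(ctx, tokens, n_tokens, n_past);
// onto the batch-based API the library standardized on.
static int eval_tokens(struct llama_context * ctx,
                       llama_token * tokens, int32_t n_tokens, int32_t n_past) {
    // Single-sequence batch starting at position n_past (sequence id 0).
    struct llama_batch batch = llama_batch_get_one(tokens, n_tokens, n_past, 0);

    // llama_decode returns 0 on success.
    return llama_decode(ctx, batch);
}
```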

b1978 (26 Jan 17:16, 5f1925a)
scripts : move run-with-preset.py from root to scripts folder

b1680 (21 Dec 21:44, afefa31)
ggml : change ggml_scale to take a float instead of tensor (#4573)

* ggml : change ggml_scale to take a float instead of tensor

* ggml : fix CPU implementation

* tests : fix test-grad0

ggml-ci
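
At call sites this trades a one-element scale tensor for a plain float. A minimal before/after sketch of the change named in the title (context setup assumed):

```c
#include "ggml.h"

// Scale a tensor by 0.5, before and after #4573.
struct ggml_tensor * scale_half(struct ggml_context * ctx, struct ggml_tensor * a) {
    // Before: the factor had to be materialized as a scalar tensor:
    //   struct ggml_tensor * s   = ggml_new_f32(ctx, 0.5f);
    //   struct ggml_tensor * out = ggml_scale(ctx, a, s);

    // After: the factor is passed directly as a float.
    return ggml_scale(ctx, a, 0.5f);
}
```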

b1663 (21 Dec 00:13, 799fc22)
CUDA: Faster Mixtral prompt processing (#4538)

* CUDA: make MoE tensors contiguous for batch size>1

* Update ggml-cuda.cu

Co-authored-by: slaren <[email protected]>

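The underlying idea is that expert weights gathered for Mixtral's MoE layers can be non-contiguous views, while the batched (batch size > 1) CUDA path wants contiguous buffers. The actual change lives inside ggml-cuda.cu; as a rough graph-level analogy only:

```c
#include "ggml.h"

// Analogy, not the patch: force a non-contiguous view into contiguous
// memory before a matmul, mirroring what the CUDA path does for MoE
// tensors at batch size > 1.
struct ggml_tensor * mul_mat_contig(struct ggml_context * ctx,
                                    struct ggml_tensor * w_view,
                                    struct ggml_tensor * x) {
    struct ggml_tensor * w = ggml_cont(ctx, w_view);  // contiguous copy
    return ggml_mul_mat(ctx, w, x);
}
```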

b1662 (19 Dec 23:14, 328b83d)
ggml : fixed check for _MSC_VER (#4535)

Co-authored-by: Eric Sommerlade <[email protected]>
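
The exact lines touched by #4535 are not quoted above, so the following is only an illustration of the kind of `_MSC_VER` guard such a fix concerns, using a hypothetical `GGML_INLINE` macro:

```c
// Gate MSVC-specific constructs on defined(_MSC_VER) so that GCC/Clang
// take the attribute-based path instead.
#if defined(_MSC_VER)
    #define GGML_INLINE __forceinline
#else
    #define GGML_INLINE __attribute__((always_inline)) inline
#endif
```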