Issues: ggerganov/llama.cpp
Misc. bug: Serving of custom static files is broken when an API key is set
bug-unconfirmed
#10475
opened Nov 24, 2024 by
shibe2
Misc. bug: poor concurrent request performance with llama-server on macOS
bug-unconfirmed
#10473
opened Nov 24, 2024 by
pengjiang80
Feature Request: better cross entropy loss CUDA kernel
enhancement
New feature or request
#10467
opened Nov 23, 2024 by
JohannesGaessler
Misc. bug: Model provisioning doc link broken
bug-unconfirmed
#10464
opened Nov 23, 2024 by
paoletto
Compile bug: how to modify the CMakeLists.txt
bug-unconfirmed
#10462
opened Nov 23, 2024 by
wangzd0209
Support for Marco-o1 by Alibaba
enhancement
#10461
opened Nov 23, 2024 by
Meshwa428
ggml : add ANE backend
help wanted
Extra attention is needed
research 🔬
#10453
opened Nov 22, 2024 by
ggerganov
Bug: [CANN] ggml-cann/aclnn_ops.cpp:3007: GGML_ASSERT(n_dims == src0->ne[0]) failed
bug-unconfirmed
critical severity
Used to report critical severity bugs in llama.cpp (e.g. crashes, corruption, data loss)
#10451
opened Nov 22, 2024 by
zyp2
Bug: Heavy throttling during token generation on Apple Silicon
bug-unconfirmed
medium severity
Used to report medium severity bugs in llama.cpp (e.g. malfunctioning features that are still usable)
#10444
opened Nov 21, 2024 by
Azirine
Bug: Flash Attention performs worse under ROCM
bug-unconfirmed
medium severity
#10439
opened Nov 20, 2024 by
Mushoz
Bug: Severe Performance Degradation on Q4_0 CPU-only with macOS / Apple Silicon M2, after PR #9921 / Version 4081
bug
Something isn't working
#10435
opened Nov 20, 2024 by
AndreasKunar
Why is the server slot's cache_prompt false by default?
bug-unconfirmed
medium severity
#10427
opened Nov 20, 2024 by
Nekotekina
Bug: SYCL builds >= b4069 fail to allocate SYCL0 buffer
bug-unconfirmed
critical severity
#10421
opened Nov 20, 2024 by
0xDEADFED5
Bug: Vulkan vk::DeviceLostError with multithreaded environment
bug-unconfirmed
low severity
Used to report low severity bugs in llama.cpp (e.g. cosmetic issues, non-critical UI glitches)
#10420
opened Nov 20, 2024 by
ddwkim
Bug: running llama.cpp fails with a Vulkan build and a quantized model in Android Termux
bug-unconfirmed
medium severity
#10406
opened Nov 19, 2024 by
linxhome
Feature Request: Code Explanation Tutorial
enhancement
#10399
opened Nov 19, 2024 by
Tangzhongyi834
Bug: Server hangs when number of threads used for decoding > number of CPUs it runs on
bug-unconfirmed
medium severity
#10397
opened Nov 19, 2024 by
KevinRSX
Feature Request: [CANN] Use the RoPE operator provided by aclnn
enhancement
#10396
opened Nov 19, 2024 by
noemotiovon
Qwen 32B: server breaks stream abruptly when above 9K context
bug-unconfirmed
low severity
#10393
opened Nov 18, 2024 by
JeroenAdam
Refactor: Allow adding both tokens and embeddings to llama_batch
#10381
opened Nov 18, 2024 by
ngxson
Bug: flash-attn can't be used
bug-unconfirmed
low severity
#10378
opened Nov 18, 2024 by
Tangzhongyi834
Feature Request: Apply LoRA adapters per-request
enhancement
#10377
opened Nov 18, 2024 by
ngxson