[auto] Sync version 2312020046.0.0+llamacpp-release.b1601
== Relevant log messages from source repo:

commit 5a7d3125e7c24f223659b7f0b7aa7736986e92c0
Author: Georgi Gerganov <[email protected]>
Date:   Fri Dec 1 20:39:12 2023 +0200

    llama : avoid using "optional" keyword (#4283)

commit d5a1cbde60531d02ac74da27ea355182e3a4d516
Author: Georgi Gerganov <[email protected]>
Date:   Fri Dec 1 20:35:03 2023 +0200

    llama : support optional tensors (#4283)

commit 511f52c334e37033f9c9de07b98fca4abc9470bd
Author: Jared Van Bortel <[email protected]>
Date:   Fri Dec 1 13:18:35 2023 -0500

    build : enable libstdc++ assertions for debug builds (#4275)

commit 03562f3a86d6706eea9f4fc09b532946c191b34e
Author: CausalLM <[email protected]>
Date:   Sat Dec 2 02:17:06 2023 +0800

    llama : support attention bias on LLaMA architecture (#4283)

    * Support attention_bias on LLaMA architecture QKVO bias; should fix
      InternLM (ggerganov/llama.cpp#3133) and works for LLaMAfied Qwen
      models (ggerganov/llama.cpp#3743 (comment)).

    * Check existence of QKVO bias while loading llama models.
      Tested on LLaMA2, CUDA and CPU.

    * Update llama.cpp

commit 37c746d687d877bc11803e96b4dc5f378b83c0a0
Author: Shijie <[email protected]>
Date:   Sat Dec 2 02:16:31 2023 +0800

    llama : add Qwen support (#4281)

    * Enable Qwen in llama.cpp

    * llama : do not GPU-split bias tensors

    ---------

    Co-authored-by: Georgi Gerganov <[email protected]>