I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
GGML v3 models should load without errors. According to this post, this type of error should already have been resolved.
In particular, models such as LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin or llama-13b.ggmlv3.q6_K.bin should load.
Current Behavior
Instead, loading fails with this error:
ggml_metal_add_buffer: buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592
llama_init_from_file: failed to add buffer
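For anyone triaging this, here is a minimal sketch that puts the two numbers from the log line into perspective. It only parses the message quoted above and does not call llama-cpp-python at all; the function name `parse_buffer_error` and the regular expression are my own illustration, not part of any library.

```python
import re

# The exact log line from this report.
LOG_LINE = (
    "ggml_metal_add_buffer: buffer 'data' size 9763717120 "
    "is larger than buffer maximum of 8589934592"
)

# Hypothetical pattern written for this one message format.
PATTERN = re.compile(
    r"buffer '(?P<name>[^']+)' size (?P<size>\d+) "
    r"is larger than buffer maximum of (?P<limit>\d+)"
)

GIB = 1024 ** 3  # bytes per GiB


def parse_buffer_error(line):
    """Return name, size, limit, and excess (in bytes), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    size = int(match["size"])
    limit = int(match["limit"])
    return {
        "name": match["name"],
        "size": size,
        "limit": limit,
        "excess": size - limit,
    }


info = parse_buffer_error(LOG_LINE)
print(
    f"{info['size'] / GIB:.2f} GiB requested, "
    f"{info['limit'] / GIB:.2f} GiB allowed, "
    f"{info['excess'] / GIB:.2f} GiB over"
)
# prints: 9.09 GiB requested, 8.00 GiB allowed, 1.09 GiB over
```

So the reported maximum is exactly 8 GiB, and the 'data' buffer for this model exceeds it by roughly 1.09 GiB.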
Environment and Context
llama-cpp-python 0.1.67
M2 Pro
16 GB RAM
macOS
Darwin UAVALOS-M-NR30 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun 8 22:22:23 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6020 arm64
Python 3.10.10
GNU Make 3.81
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Failure Information (for bugs)
See above
Steps to Reproduce
Failure Logs