
buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592 #438

Open · frankandrobot opened this issue Jun 30, 2023 · 0 comments
Labels: llama.cpp (Problem with llama.cpp shared lib)
Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

  • GGML v3 models should load just fine. As per this post, this type of error should have been resolved.
  • In particular, models like LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin or llama-13b.ggmlv3.q6_K.bin should load.

Current Behavior

  • Instead, I'm getting this error:

```
ggml_metal_add_buffer: buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592
llama_init_from_file: failed to add buffer
```
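For context (my arithmetic, not from the original report): 9763717120 bytes ≈ 9.09 GiB, while 8589934592 bytes is exactly 8 GiB, so the model's 'data' buffer overshoots the per-allocation Metal limit by roughly 1.09 GiB. A minimal sketch of that check, reusing the model path from the repro below; the constant comes straight from the log, and nothing here is llama.cpp API:

```python
import os

# Value taken from the failure log above: the per-buffer maximum
# reported by ggml_metal_add_buffer on this device (8 GiB).
METAL_BUFFER_MAX = 8589934592

# Sketch of a sanity check: the 'data' buffer is roughly the size of
# the weights on disk, so the file size is a reasonable proxy.
model_path = "/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin"
size = os.path.getsize(model_path)

print(f"model file: {size} bytes ({size / 2**30:.2f} GiB)")
print(f"buffer max: {METAL_BUFFER_MAX} bytes ({METAL_BUFFER_MAX / 2**30:.2f} GiB)")
if size > METAL_BUFFER_MAX:
    print(f"exceeds the limit by {(size - METAL_BUFFER_MAX) / 2**30:.2f} GiB")
```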

Environment and Context

  • llama-cpp-python 0.1.67
  • Apple M2 Pro, 16 GB RAM
  • macOS

```
Darwin UAVALOS-M-NR30 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:23 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6020 arm64

Python 3.10.10
GNU Make 3.81
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
```

Failure Information (for bugs)

See above

Steps to Reproduce

```shell
!pip uninstall llama-cpp-python -y
!CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
!pip install 'llama-cpp-python[server]'
```

```python
from langchain.llms import LlamaCpp
from langchain import PromptTemplate, LLMChain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

model_path = "/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin"
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = 1  # layers to offload to the Metal GPU
n_batch = 512     # tokens per eval batch

llm = LlamaCpp(
    model_path=model_path,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
    n_ctx=1100,
)
```
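As a diagnostic (my suggestion, not part of the original report), the same load can be attempted through llama-cpp-python directly, cutting langchain out of the picture; if the llama.cpp shared lib is at fault, this minimal sketch should fail with the same ggml_metal_add_buffer error:

```python
from llama_cpp import Llama

# Same parameters as the langchain repro above.
llm = Llama(
    model_path="/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin",
    n_gpu_layers=1,  # offload one layer to Metal
    n_batch=512,
    n_ctx=1100,
    verbose=True,
)
```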

Failure Logs

```
ggml_metal_add_buffer: buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592
llama_init_from_file: failed to add buffer
```
gjmulder added the llama.cpp (Problem with llama.cpp shared lib) label on Jul 2, 2023