
buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592 #438

Open · frankandrobot opened this issue Jun 30, 2023 · 0 comments
Labels: llama.cpp (Problem with llama.cpp shared lib)
Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

  • GGML v3 models should load just fine. As per this post, this type of error should have been resolved.
  • In particular, models like LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin or llama-13b.ggmlv3.q6_K.bin should load.

Current Behavior

  • Instead, I'm getting this error:

```
ggml_metal_add_buffer: buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592
llama_init_from_file: failed to add buffer
```
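For context (my arithmetic, not from the original report): 9763717120 bytes ≈ 9.09 GiB, while 8589934592 bytes is exactly 8 GiB, so the model's 'data' buffer overshoots the per-allocation Metal limit by roughly 1.09 GiB. A minimal sketch of that check, reusing the model path from the repro below; the constant comes straight from the log, and nothing here is llama.cpp API:

```python
import os

# Value taken from the failure log above: the per-buffer maximum
# reported by ggml_metal_add_buffer on this device (8 GiB).
METAL_BUFFER_MAX = 8589934592

# Sketch of a sanity check: the 'data' buffer is roughly the size of
# the weights on disk, so the file size is a reasonable proxy.
model_path = "/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin"
size = os.path.getsize(model_path)

print(f"model file: {size} bytes ({size / 2**30:.2f} GiB)")
print(f"buffer max: {METAL_BUFFER_MAX} bytes ({METAL_BUFFER_MAX / 2**30:.2f} GiB)")
if size > METAL_BUFFER_MAX:
    print(f"exceeds the limit by {(size - METAL_BUFFER_MAX) / 2**30:.2f} GiB")
```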

Environment and Context

  • llama-cpp-python 0.1.67
  • Apple M2 Pro, 16 GB RAM
  • macOS

```
Darwin UAVALOS-M-NR30 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:23 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6020 arm64

Python 3.10.10
GNU Make 3.81
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
```

Failure Information (for bugs)

See above

Steps to Reproduce

```shell
!pip uninstall llama-cpp-python -y
!CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
!pip install 'llama-cpp-python[server]'
```

```python
from langchain.llms import LlamaCpp
from langchain import PromptTemplate, LLMChain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

model_path = "/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin"
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

n_gpu_layers = 1  # layers to offload to the Metal GPU
n_batch = 512     # tokens per eval batch

llm = LlamaCpp(
    model_path=model_path,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
    n_ctx=1100,
)
```
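As a diagnostic (my suggestion, not part of the original report), the same load can be attempted through llama-cpp-python directly, cutting langchain out of the picture; if the llama.cpp shared lib is at fault, this minimal sketch should fail with the same ggml_metal_add_buffer error:

```python
from llama_cpp import Llama

# Same parameters as the langchain repro above.
llm = Llama(
    model_path="/Users/uavalos/Documents/LLaMa-13B-GGML/llama-13b.ggmlv3.q5_1.bin",
    n_gpu_layers=1,  # offload one layer to Metal
    n_batch=512,
    n_ctx=1100,
    verbose=True,
)
```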

Failure Logs

```
ggml_metal_add_buffer: buffer 'data' size 9763717120 is larger than buffer maximum of 8589934592
llama_init_from_file: failed to add buffer
```
gjmulder added the llama.cpp (Problem with llama.cpp shared lib) label on Jul 2, 2023