I couldn't find anything in the docs about catching this error I'm getting, which I'm guessing means that my request is using too many tokens:
```
llama_new_context_with_model: n_ctx = 4096
llama_new_context_with_model: freq_base = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size = 512.00 MB
llama_new_context_with_model: compute buffer total size = 294.13 MB
GGML_ASSERT: /my-project/node_modules/node-llama-cpp/llama/llama.cpp/llama.cpp:5867: n_tokens <= n_batch
zsh: abort npm run dev
(base) me@computer my-project %
```

I tried wrapping the instantiation of …

Edit: this is running on macOS with an x86 CPU.
Answered by giladgd on Nov 12, 2023
There's currently an issue with prompts that are longer than the `batchSize`; it'll be fixed as part of #85. For a workaround for now, see #76.
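Until that fix lands, one way to avoid the assert is to create the context with a larger `batchSize`, so that evaluating your longest prompt never exceeds `n_batch`. The sketch below is only a rough illustration of that idea, assuming the node-llama-cpp v2 API (`LlamaModel`, `LlamaContext`, `LlamaChatSession`) and a placeholder model path; see #76 for the workaround discussed there.

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Placeholder model file; point this at your own GGUF model.
const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "my-model.gguf")
});

// Raising batchSize (assumed to map to llama.cpp's n_batch) so it matches
// contextSize means a prompt that fits in the context can also be evaluated
// in one batch, at the cost of a larger compute buffer.
const context = new LlamaContext({
    model,
    contextSize: 4096,
    batchSize: 4096
});

const session = new LlamaChatSession({context});

const answer = await session.prompt("A prompt longer than the default batch size...");
console.log(answer);
```

If memory is tight, the alternative is to keep `batchSize` at its default and trim or chunk the prompt so a single evaluation stays under it.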