
Process crash due to white spaces in grammar file. #4376

Closed · 4 tasks done
darxkies opened this issue Dec 8, 2023 · 9 comments

darxkies commented Dec 8, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Grammar files containing whitespace can be used without crashing.

Current Behavior

When using a grammar file whose rules contain ' ' or \n, main and server crash with the following error:

GGML_ASSERT: llama.cpp:7776: !grammar->stacks.empty()
ptrace: Operation not permitted.
No stack.
The program is not being run.

Environment and Context

CPU: i9-12900H
OS: Arch

Steps to Reproduce

make clean; make LLAMA_CUBLAS=1 LLAMA_SERVER_VERBOSE=1 -j 20 main; ./main -b 4096 -c 4096 -m models/dolphin-2.2.1-mistral-7b.Q8_0.gguf -f prompts/react.txt --grammar-file grammars/json_arr.gbnf

b1613 is the latest tag that works properly; every build after it crashes.

The issue is not related to the model; it fails with several 7B/13B models.
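
A minimal repro does not need the bundled files: a hypothetical grammar as small as the following, with a literal space or newline in a rule, should be enough to trigger the assert on affected builds:

root ::= "foo" ws "bar"
ws ::= [ \t\n]+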

AlienKevin (Contributor) commented Dec 8, 2023

I'm also experiencing this issue on the latest commit (fe680e3).
Grammar sampling was working fine in the Oct 28 commit (ff3bad8), but after I switched to the latest commit the generated outputs became gibberish. My grammar also contains whitespace, which I suspect is the cause.

My Grammar
root ::= [ \t\n]* exp

ws ::= [ \t\n]+
w ::= [ \t]*

comment ::= "#" [^#]* "#" [ \t]+ [\n]? [ \t]*

### Expressions

exp ::= comment* sequence-exp

sequence-exp ::= tuple-exp (w ";" ws tuple-exp)*

tuple-exp ::= cons-exp (w "," ws cons-exp)*

cons-exp ::= binary-exp (w "::" w binary-exp)*

binary-exp ::= unary-exp (ws binary-op ws unary-exp)*

unary-exp ::= unary-op* function-app-exp

function-app-exp ::= primary-exp (w "(" w exp w ")" w)*

primary-exp ::= bool |
    integer |
    float |
    string |
    variable |
    "()" |
    "[]" |
    constructor |
    constructor-app |
    parenthesized-exp |
    list-exp |
    let-exp |
    if-exp |
    case-exp |
    test-exp |
    type-alias |
    fun

constructor-app ::= constructor "(" w exp w ")"
parenthesized-exp ::= "(" w exp w ")"
list-exp ::= "[" exp ("," ws exp)* "]"
let-exp ::= "let" ws pat ws "=" ws exp ws "in" ws exp
if-exp ::= "if" ws exp ws "then" ws exp ws "else" ws exp
case-exp ::= "case" ws exp (ws "|" ws pat ws "=>" ws exp)+ ws "end"
test-exp ::= "test" ws exp ws "end"
type-alias ::= "type" ws constructor ws "=" ws typ ws "in" ws exp
fun ::= "fun" ws pat ws "->" ws exp

type-variable ::= [a-z][A-Za-z0-9_]*
constructor ::= [A-Z][A-Za-z0-9_]*
variable ::= ([_a-bdg-hj-kn-qu-z][A-Za-z0-9_.]*)|(("s" ([.0-9A-Z_a-su-z][A-Za-z0-9_.]*)?)|("st" ([.0-9A-Z_a-qs-z][A-Za-z0-9_.]*)?)|("str" ([.0-9A-Z_a-tv-z][A-Za-z0-9_.]*)?)|("stru" ([.0-9A-Z_a-bd-z][A-Za-z0-9_.]*)?)|("struc" ([.0-9A-Z_a-su-z][A-Za-z0-9_.]*)?)|("struct" [A-Za-z0-9_.]+)|("c" ([.0-9A-Z_b-z][A-Za-z0-9_.]*)?)|("ca" ([.0-9A-Z_a-rt-z][A-Za-z0-9_.]*)?)|("cas" ([.0-9A-Z_a-df-z][A-Za-z0-9_.]*)?)|("case" [A-Za-z0-9_.]+)|("i" ([.0-9A-Z_a-mo-z][A-Za-z0-9_.]*)?)|("in" [A-Za-z0-9_.]+)|("r" ([.0-9A-Z_a-df-z][A-Za-z0-9_.]*)?)|("re" ([.0-9A-Z_a-bd-z][A-Za-z0-9_.]*)?)|("rec" [A-Za-z0-9_.]+)|("t" ([.0-9A-Z_a-df-z][A-Za-z0-9_.]*)?)|("te" ([.0-9A-Z_a-rt-z][A-Za-z0-9_.]*)?)|("tes" ([.0-9A-Z_a-su-z][A-Za-z0-9_.]*)?)|("test" [A-Za-z0-9_.]+)|("l" ([.0-9A-Z_a-df-z][A-Za-z0-9_.]*)?)|("le" ([.0-9A-Z_a-su-z][A-Za-z0-9_.]*)?)|("let" [A-Za-z0-9_.]+)|("m" ([.0-9A-Z_b-z][A-Za-z0-9_.]*)?)|("ma" ([.0-9A-Z_a-su-z][A-Za-z0-9_.]*)?)|("mat" ([.0-9A-Z_a-bd-z][A-Za-z0-9_.]*)?)|("matc" ([.0-9A-Z_a-gi-z][A-Za-z0-9_.]*)?)|("match" [A-Za-z0-9_.]+)|("f" ([.0-9A-Z_a-tv-z][A-Za-z0-9_.]*)?)|("fu" ([.0-9A-Z_a-mo-z][A-Za-z0-9_.]*)?)|("fun" [A-Za-z0-9_.]+)|("e" ([.0-9A-Z_a-mo-z][A-Za-z0-9_.]*)?)|("en" ([.0-9A-Z_a-ce-z][A-Za-z0-9_.]*)?)|("end" [A-Za-z0-9_.]+))
bool ::= "true" | "false"
integer ::= [0-9]+
float ::= [0-9]* "." [0-9]+
string ::= "\"" [^"]* "\""

unary-op ::= "-" | "!"
binary-op-int ::= "+" | "-" | "*" | "/" | "<" | ">" | "<=" | ">=" | "==" | "!="
binary-op-float ::= "+." | "-." | "*." | "/." | "<." | ">." | "<=." | ">=." | "==." | "!=."
binary-op-string ::= "$==" | "@"
binary-op-logic ::= "&&"
binary-op ::= binary-op-int | binary-op-float | binary-op-string | binary-op-logic

### Patterns

pat ::= type-ascription-pat

type-ascription-pat ::= tuple-pat (w ":" ws typ)*

tuple-pat ::= cons-pat (w "," ws cons-pat)*

cons-pat ::= primary-pat (w "::" w primary-pat)*

primary-pat ::=
    bool |
    integer |
    float |
    string |
    variable |
    "()" |
    "[]" |
    "_" |
    constructor |
    constructor-app-pat |
    parenthesized-pat |
    list-pat

constructor-app-pat ::= constructor "(" w pat w ")"
parenthesized-pat ::= "(" w pat w ")"
list-pat ::= "[" pat (w "," ws pat)* "]"

### Types

typ ::= arrow-typ

arrow-typ ::= tuple-typ (ws "->" ws tuple-typ)*

tuple-typ ::= primary-typ (w "," ws primary-typ)*

primary-typ ::=
    "Unit" |
    "Int" |
    "Float" |
    "Bool" |
    "String" |
    type-variable |
    constructor |
    constructor-def (ws "+" ws constructor-def)+ |
    parenthesized-typ |
    list-typ

parenthesized-typ ::= "(" w typ w ")"
list-typ ::= "[" w typ w "]"
constructor-def ::= constructor | constructor "(" w typ w ")"

AlienKevin (Contributor) commented Dec 8, 2023

I tracked the source of the issue down to commit 5f6e0c0. The issue originates from an optimization that replaced llama_token_to_piece with model.vocab.id_to_token[id].text. This causes problems with models that use the SentencePiece tokenizer, like CodeLlama, because their token text requires llama_unescape_whitespace to unescape the whitespace.

llama.cpp/llama.cpp

Lines 9997 to 10008 in 5f6e0c0

int llama_token_to_piece(const struct llama_model * model, llama_token token, char * buf, int length) {
    if (0 <= token && token < llama_n_vocab(model)) {
        switch (llama_vocab_get_type(model->vocab)) {
            case LLAMA_VOCAB_TYPE_SPM: {
                if (llama_is_normal_token(model->vocab, token)) {
                    std::string result = model->vocab.id_to_token[token].text;
                    llama_unescape_whitespace(result);
                    if (length < (int) result.length()) {
                        return -result.length();
                    }
                    memcpy(buf, result.c_str(), result.length());
                    return result.length();

Reverting the changes in llama_sample_grammar and llama_grammar_accept_token fixed the issue for me:

// const std::string & piece = ctx->model.vocab.id_to_token[id].text;
const std::string piece = llama_token_to_piece(ctx, id);

@darxkies Can you check if reverting back to llama_token_to_piece fixes the issue you experienced?
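
For context, here is a small self-contained sketch (assumed token bytes; not llama.cpp code) of why the raw vocab text breaks grammar matching: SentencePiece stores a leading space as U+2581 ("▁"), so a grammar rule like ws ::= [ \t\n]+ can match the unescaped piece but never the raw text, every grammar stack gets pruned, and the next accept hits GGML_ASSERT(!grammar->stacks.empty()).

#include <cstdio>
#include <string>

int main() {
    // Raw SPM token text, as stored in model.vocab.id_to_token[id].text;
    // "\xe2\x96\x81" is the UTF-8 encoding of U+2581 ("▁").
    std::string raw = "\xe2\x96\x81the";

    // What llama_token_to_piece yields instead: the escape mapped back
    // to a plain ' ' (0x20), which grammar whitespace rules can match.
    std::string piece = raw;
    size_t pos = piece.find("\xe2\x96\x81");
    if (pos != std::string::npos) {
        piece.replace(pos, 3, " ");
    }

    printf("raw   starts with byte 0x%02x\n", (unsigned char) raw[0]);   // 0xe2
    printf("piece starts with byte 0x%02x\n", (unsigned char) piece[0]); // 0x20
    return 0;
}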

@mastermindbaguvix

@AlienKevin, I can confirm that this fixes the issue. I experienced the same problem with Emerhyst 20B and Llama 2 Chat 13B.

darxkies (Author) commented Dec 9, 2023

@@ -7554,11 +7546,13 @@ void llama_sample_grammar(struct llama_context * ctx, llama_token_data_array * c
     const llama_token eos = llama_token_eos(&ctx->model);
 
     std::vector<std::pair<std::vector<uint32_t>, llama_partial_utf8>> candidates_decoded;
+    candidates_decoded.reserve(candidates->size);
     std::vector<llama_grammar_candidate>                              candidates_grammar;
+    candidates_grammar.reserve(candidates->size);
 
     for (size_t i = 0; i < candidates->size; ++i) {
         const llama_token id    = candidates->data[i].id;
-        const std::string piece = llama_token_to_piece(ctx, id);
+        const std::string & piece = ctx->model.vocab.id_to_token[id].text;
         if (id == eos) {
             if (!allow_eos) {
                 candidates->data[i].logit = -INFINITY;
@@ -7770,7 +7764,7 @@ void llama_grammar_accept_token(struct llama_context * ctx, struct llama_grammar
         GGML_ASSERT(false);
     }
 
-    const std::string piece = llama_token_to_piece(ctx, token);
+    const std::string & piece = ctx->model.vocab.id_to_token[token].text;
 
     // Note terminating 0 in decoded string
     const auto   decoded     = decode_utf8(piece, grammar->partial_utf8);

@AlienKevin By reverting the last two changes in the patch above, which were introduced in b1614, it works again.

darxkies (Author) commented Dec 9, 2023

Is anyone interested in a pull request to revert the two lines?

@AlienKevin (Contributor)

I think that besides the PR, we might benefit from several regression tests for grammar sampling. The current grammar-sampling test seems to be manually written and is quite hard to read. Maybe we can add some simpler tests based on larger grammars that just ensure the sampling process never fails? (A rough sketch of what that could look like follows below.) I'll be happy to contribute my grammar as a test case.
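
Rough shape of such a test — every helper name here is hypothetical, a stand-in for whatever the real grammar parser/sampler API exposes, so this is a sketch of the structure rather than working test code:

#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for parsed grammar state.
struct GrammarState { bool stacks_empty = false; };

// Hypothetical stand-ins for parsing a GBNF string and for advancing
// the grammar by one sampled piece.
GrammarState parse_grammar(const std::string & gbnf);
GrammarState accept_piece(GrammarState g, const std::string & piece);

// Feed known-valid pieces through the grammar and assert the invariant
// this crash violated: the stack set must never become empty.
void check_grammar_survives(const std::string & gbnf,
                            const std::vector<std::string> & pieces) {
    GrammarState g = parse_grammar(gbnf);
    for (const auto & p : pieces) {
        g = accept_piece(g, p);
        assert(!g.stacks_empty);
    }
}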

@AlienKevin (Contributor)

Submitted a PR with the two-line fix: #4396

darxkies (Author) commented Dec 9, 2023

I think it would suffice to test against the grammar files included in the repository; that is what I used to reproduce the bug.

There is another bug I found recently that causes the OS to kill the process; I still have to report it.

@darxkies (Author)

With the latest commit it works again as intended.
