Newlines in generation when using grammar #637

Open
rlancemartin opened this issue Aug 24, 2023 · 4 comments

Labels: model (Model specific issue) · quality (Quality of model output)

rlancemartin commented Aug 24, 2023

Using llama-cpp-python with the LangChain integration and this PR to support grammars.

Test w/o grammar_path:

llm = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
)
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm(question)

The result is as expected.


Test w/ grammar_path:

llm = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    f16_kv=True,  # MUST set to True, otherwise you will run into problems after a couple of calls
    callback_manager=callback_manager,
    verbose=True,
    grammar_path="/Users/rlm/Desktop/json.gbnf",
)
question = "Request: schedule a call at 8pm; Command:"
llm(question)

The result has a large number of newlines:

'{"schedule": {"date": "2018-09-14T20:00:00.000Z", "duration": 60}}\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'

Has anyone seen / resolved similar behavior?

abetlen (Owner) commented Aug 24, 2023

@rlancemartin The grammar only specifies the syntax of the output, not necessarily the stopping condition. If the model doesn't generate an EOS token and no other stopping criterion is met, "\n" is the only valid character at the end of the generation. You either need to pass something to the stop list or use a StoppingCriteria that checks whether the output is parseable with json.loads.
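
A minimal sketch of that kind of check (the json_complete helper is hypothetical, not part of llama-cpp-python; how you wire it into the stopping criteria depends on your version, since the low-level callback sees token ids rather than decoded text):

import json

def json_complete(text: str) -> bool:
    # Hypothetical helper: return True once the text generated so far
    # parses as a complete JSON document, i.e. a sensible point to stop.
    try:
        json.loads(text.strip())
        return True
    except json.JSONDecodeError:
        return False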

rlancemartin (Author) commented Aug 24, 2023

> @rlancemartin The grammar only specifies the syntax of the output, not necessarily the stopping condition. If the model doesn't generate an EOS token and no other stopping criterion is met, "\n" is the only valid character at the end of the generation. You either need to pass something to the stop list or use a StoppingCriteria that checks whether the output is parseable with json.loads.

Thanks. Yes, this works.

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    f16_kv=True,  # MUST set to True, otherwise you will run into problems after a couple of calls
    callback_manager=callback_manager,
    verbose=True,
    grammar_path="/Users/rlm/Desktop/json.gbnf",
    stop=["STOP"]
)

Prompt w/ STOP token specified:

template = """Print 'STOP' when you are finished answering the question. Question: {question}"""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "Request: schedule a call at 8pm; Command:"
llm_chain.run(question)

Result, as expected:

'{"type": "request", "message": "Hello! I would like to request your availability for a call tonight at 8pm. Would you be available?", "tones": [{"id": "polite", "name": "Polite"}]}'

Is there a best practice for this? (I'm just using STOP as a test case.)
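
One alternative I may also try (my own assumption, not something confirmed here): since the grammar only allows whitespace after the closing brace, stopping on the first newline should end generation right after the object, as long as the grammar isn't emitting newlines inside the JSON itself.

# Sketch only: same setup as above, but stopping on the first newline
# instead of a custom STOP marker. This truncates output if the grammar
# ever places a newline inside the object, so it assumes single-line JSON.
llm = LlamaCpp(
    model_path="/Users/rlm/Desktop/Code/llama.cpp/llama-2-13b-chat.ggmlv3.q4_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    f16_kv=True,
    callback_manager=callback_manager,
    verbose=True,
    grammar_path="/Users/rlm/Desktop/json.gbnf",
    stop=["\n"],
)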

rlancemartin (Author) commented

OK, I think I've gotten a bit further:

The problem seems to be with the json.gbnf specifically.

I'm working on modifying that file.
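
To illustrate the kind of change I'm experimenting with (a sketch, assuming the stock llama.cpp json.gbnf, whose trailing whitespace rule is roughly ws ::= ([ \t\n] ws)?): tightening ws so it can neither recurse nor match a newline leaves nothing valid to append after the closing brace.

# Hypothetical edit to json.gbnf: allow at most one space or tab between
# tokens, and no newlines, so the grammar cannot pad the end of the output.
ws ::= [ \t]?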

AndreaRiboni commented

Any update?
