
Support LLama-2 prompt style. #5

Closed
tcaminel-pro opened this issue Aug 25, 2023 · 3 comments


tcaminel-pro commented Aug 25, 2023

Llama-2-Chat is becoming a serious alternative to ChatGPT-3.5. However, the prompts must be structured in a specific way to use it effectively: see https://www.pinecone.io/learn/llama-2/
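For reference, a single-turn Llama-2 chat prompt roughly looks like this (a sketch based on that article; the placeholders in braces are mine):

<s>[INST] <<SYS>>
{system prompt}
<</SYS>>

{user message} [/INST]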

I think it could be quite easy for langchain-decorators to generate prompts that follow that style while remaining compatible with ChatGPT. That would open up a lot of opportunities...

I did a quick experiment with the tags defined as fields in the model, but it could be made more generic. It runs on https://deepinfra.com/.
The JSON parsing fails sometimes, but I guess that should be easy to fix.

from pydantic import BaseModel, Field
from langchain.llms import DeepInfra
from langchain_decorators import llm_prompt, GlobalSettings

llama70 = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")
# llama70.model_kwargs = {"temperature": 0.001}

# Make the DeepInfra-hosted Llama-2 model the default LLM for all prompts
GlobalSettings.define_settings(default_llm=llama70)

class TheOutputStructureWeExpect(BaseModel):
    name: str = Field(description="The name of the company")
    headline: str = Field(
        description="The description of the company (for landing page)"
    )
    employees: list[str] = Field(
        description="5-8 fake employee names with their positions"
    )

PROMPT_STYLE = "Llama"

class FakeCompanyGenerator(BaseModel):
    # Llama-2 special tokens when targeting Llama, empty strings otherwise
    B_SYS, E_SYS, B_INST, E_INST = (
        ("<s> <<SYS>>\n", "\n<</SYS>>\n", "[INST]", "[/INST]")
        if PROMPT_STYLE == "Llama"
        else ("", "", "", "")
    )

    @llm_prompt()
    def generate(self, company_business: str) -> TheOutputStructureWeExpect:
        """{B_SYS} You are a friendly consultant that only communicates using JSON files. {E_SYS}
        {B_INST}
        Generate a fake company that {company_business}
        Strictly follow the following JSON format instruction. Generate only one example output.
        {FORMAT_INSTRUCTIONS}
        {E_INST}"""
        return

company = FakeCompanyGenerator().generate(company_business="sells cookies")

ju-bezdek (Owner) commented Aug 28, 2023

Hey, thanks for the suggestion.

I was thinking about it over the weekend. The thing is, I would assume this would be handled at a lower level, in langchain.

For chat messages, the langchain chat model handles the "translation" from AIMessage and HumanMessage objects into the OpenAI format.
The OpenAI API then consumes these as separate messages and adds these (or similar) special tokens around each message, much as you describe here.
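Roughly speaking, the mapping looks like this (a simplified sketch, not langchain's actual code):

from langchain.schema import HumanMessage, SystemMessage

# What you write with langchain message objects...
messages = [
    SystemMessage(content="You are a helpful assistant"),
    HumanMessage(content="Generate a fake company that sells cookies"),
]

# ...is translated by the chat model wrapper into the OpenAI chat payload,
# roughly equivalent to:
openai_messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Generate a fake company that sells cookies"},
]
# (an AIMessage would map to {"role": "assistant", ...} in the same way)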

On the other hand, I agree that it would look great from the developer's perspective to just define

'''<prompt:system>
system part of the prompt
'''

'''<prompt:instructions>
instructions....
'''

and let some other layer handle it...
I still think that at some point this will be handled more natively in langchain, but adding an option for custom handling of this here might be interesting...

BTW... what is the reason for not using the LLAMA2 tags (<<SYS>>) directly? Is the goal to have the prompt compatible with ChatGPT and LLAMA2 at the same time?

tcaminel-pro (Author) commented Aug 28, 2023

Yes, I want prompts compatible with ChatGPT and LLAMA2 at the same time. ChatGPT is the market leader, but LLAMA2 is a game changer with all its derived models coming out, so interoperability is nice to have.

Moreover, they behave differently. I would like to send the same prompts to multiple LLMs to reduce hallucinations, or to generate more variants that are still reproducible with temperature=0, and/or to have each LLM check the answer of the other one.
I would love to do that with langchain-decorators, leveraging how easily it handles structured results.
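For example, a rough sketch of what I have in mind (assuming the default LLM can simply be redefined between calls, and using ChatOpenAI as a stand-in for ChatGPT):

from langchain.chat_models import ChatOpenAI
from langchain.llms import DeepInfra
from langchain_decorators import GlobalSettings

# Two backends to compare, reusing FakeCompanyGenerator from the example above
backends = {
    "gpt-3.5-turbo": ChatOpenAI(temperature=0),
    "llama-2-70b": DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf"),
}

results = {}
for name, llm in backends.items():
    # Assumption: redefining the default LLM is enough to switch backends
    GlobalSettings.define_settings(default_llm=llm)
    results[name] = FakeCompanyGenerator().generate(company_business="sells cookies")

# Compare the structured outputs, e.g. to cross-check for hallucinations
for name, company in results.items():
    print(name, company.name, company.headline)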

Another nice and complementary addition to your library would be the ability to get the chain and kwargs without actually running the request against the LLM. Something like:

chain, kwargs = FakeCompanyGenerator().generate(company_business="sells cookies", _mode=RETURN_CHAIN)
chain.run(**kwargs)

That way we could benefit from Chains directly, for example to use recent langchain additions like the LangChain Expression Language.

These two improvements would add strong value to langchain-decorators, IMO.

ju-bezdek (Owner) commented:

Hey, I was kind of busy, so I couldn't take a proper look at this until now.

From the beginning, I was thinking that this should be handled similarly to the ChatMessage type... Actually, I expected this to be handled by langchain, but I checked and it is not, and after reviewing their code, I can see why.

Anyway...
I just released v0.2.0, and this is now supported in the following way:

You can define custom prompt blocks, similarly to how you can define ChatMessage roles for the prompt:

"""
```<prompt:system>
You are a helpful assistant
```
```<prompt:user>
{question}
```
"""

You can now define your own prompt_block_type and provide your own interpretation, for example:

```<prompt:instruction>
Don't forget to be helpful
```

You can control the implementation via prompt_type.
Here is a full working example of how to do that: custom_template_block_bulder_llama2.py

You could also build a dynamic builder that would build the template based on an input kwarg parameter.
You'd need to add this arg to the function arguments, as well as add it to the llm_prompt args like this:

@llm_prompt(control_kwargs=[*SPECIAL_KWARGS, "llm_type"])

and then access llm_type via the kwargs argument in the TemplateBuilder.
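A rough, untested sketch of what that could look like (the SPECIAL_KWARGS import location and the llm_type default are assumptions on my side; the linked example shows the real usage):

from pydantic import BaseModel
from langchain_decorators import llm_prompt, SPECIAL_KWARGS  # import location assumed

class DynamicStyleGenerator(BaseModel):

    # llm_type is declared as a control kwarg, so the template builder can read
    # it from its kwargs argument and pick the matching block interpretation
    @llm_prompt(control_kwargs=[*SPECIAL_KWARGS, "llm_type"])
    def generate(self, company_business: str, llm_type: str = "openai") -> str:
        """
        ```<prompt:system>
        You are a friendly consultant that only communicates using JSON files.
        ```
        ```<prompt:user>
        Generate a fake company that {company_business}
        ```
        """
        return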

Regarding the possibility of getting the chain in order to use the LangChain Expression Language, I'm not sure this makes sense, since LCEL is meant to combine the prompt, template and LLM... this is already handled by the decorators in a different manner, and the two approaches aren't quite conceptually compatible.

However, I admit that there are other benefits to getting the Chain itself. As a matter of fact, I've tried to implement it since the early days, but I couldn't figure out how to do it in a meaningful way (so that it would be of any advantage)...
I don't really like the syntax where you'd get the chain and the kwargs out as a tuple, but I had an idea.
Now you can get a chain that already has everything preconfigured, including all the input args. You can still override them as you would with LLMChain, but you don't need to, and you might just call:

chain = ask.build_chain(question="Where was Schrödinger's cat locked in?")

# call without arguments, as they are preconfigured
chain()

# override any of the standard LLMChain args for the call
chain(inputs={"question": "What is the meaning of life?"}, return_only_outputs=True)

Just be aware that calling the chain returns something different from just executing the prompt function.
You can emulate the native behavior by calling

chain.execute()

or

await chain.aexecute()
