
Support LLama-2 prompt style. #5

Closed
tcaminel-pro opened this issue Aug 25, 2023 · 3 comments


tcaminel-pro commented Aug 25, 2023

Llama-2-Chat is becoming a serious alternative to ChatGPT-3.5. However, the prompts must be structured in a specific way to use it effectively: see https://www.pinecone.io/learn/llama-2/
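For reference, a single-turn Llama-2 chat prompt roughly looks like this (a sketch based on that article; the placeholders in braces are mine):

<s>[INST] <<SYS>>
{system prompt}
<</SYS>>

{user message} [/INST]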

I think it could be quite easy for langchain-decorators to generate prompts that follow that style while remaining compatible with ChatGPT. That would open up a lot of opportunities...

I did a quick experiment with the tags defined as fields in the model, but it could be made more generic. It runs on https://deepinfra.com/.
The JSON parsing fails sometimes, but I guess that should be easy to fix.

from pydantic import BaseModel, Field
from langchain.llms import DeepInfra
from langchain_decorators import llm_prompt, GlobalSettings

llama70 = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")
# llama70.model_kwargs = {"temperature": 0.001}

# Make the DeepInfra-hosted Llama-2 model the default LLM for all prompts
GlobalSettings.define_settings(default_llm=llama70)

class TheOutputStructureWeExpect(BaseModel):
    name: str = Field(description="The name of the company")
    headline: str = Field(
        description="The description of the company (for landing page)"
    )
    employees: list[str] = Field(
        description="5-8 fake employee names with their positions"
    )

PROMPT_STYLE = "Llama"

class FakeCompanyGenerator(BaseModel):
    # Llama-2 special tokens when targeting Llama, empty strings otherwise
    B_SYS, E_SYS, B_INST, E_INST = (
        ("<s> <<SYS>>\n", "\n<</SYS>>\n", "[INST]", "[/INST]")
        if PROMPT_STYLE == "Llama"
        else ("", "", "", "")
    )

    @llm_prompt()
    def generate(self, company_business: str) -> TheOutputStructureWeExpect:
        """{B_SYS} You are a friendly consultant that only communicates using JSON files. {E_SYS}
        {B_INST}
        Generate a fake company that {company_business}
        Strictly follow the following JSON format instruction. Generate only one example output.
        {FORMAT_INSTRUCTIONS}
        {E_INST}"""
        return

company = FakeCompanyGenerator().generate(company_business="sells cookies")

ju-bezdek (Owner) commented Aug 28, 2023

Hey, thanks for the suggestion.

I was thinking about it over the weekend. The thing is, I would assume this would be handled at a lower level, in langchain.

For chat messages, the langchain chat model handles the "translation" from AIMessage and HumanMessage objects into the OpenAI format.
The OpenAI API then consumes these as separate messages and adds these (or similar) special tokens around each message, much as you describe here.
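Roughly speaking, the mapping looks like this (a simplified sketch, not langchain's actual code):

from langchain.schema import HumanMessage, SystemMessage

# What you write with langchain message objects...
messages = [
    SystemMessage(content="You are a helpful assistant"),
    HumanMessage(content="Generate a fake company that sells cookies"),
]

# ...is translated by the chat model wrapper into the OpenAI chat payload,
# roughly equivalent to:
openai_messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Generate a fake company that sells cookies"},
]
# (an AIMessage would map to {"role": "assistant", ...} in the same way)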

On the other hand, I agree that it would look great from the developer's perspective to just define

'''<prompt:system>
system part of the prompt
'''

'''<prompt:instructions>
instructions....
'''

and let some other layer handle it...
I still think that at some point this will be handled more natively in langchain, but adding an option for custom handling of this here might be interesting...

BTW... what is the reason for not using the LLAMA2 tags (<<SYS>>) directly? Is the goal to have the prompt compatible with ChatGPT and LLAMA2 at the same time?

tcaminel-pro (Author) commented Aug 28, 2023

Yes, I want prompts compatible with ChatGPT and LLAMA2 at the same time. ChatGPT is the market leader, but LLAMA2 is a game changer with all its derived models coming out, so interoperability is nice to have.

Moreover, they behave differently. I would like to send the same prompts to multiple LLMs to reduce hallucinations, or to generate more variants that are still reproducible with temperature=0, and/or to have each LLM check the answer of the other one.
I would love to do that with langchain-decorators, leveraging how easily it handles structured results.
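For example, a rough sketch of what I have in mind (assuming the default LLM can simply be redefined between calls, and using ChatOpenAI as a stand-in for ChatGPT):

from langchain.chat_models import ChatOpenAI
from langchain.llms import DeepInfra
from langchain_decorators import GlobalSettings

# Two backends to compare, reusing FakeCompanyGenerator from the example above
backends = {
    "gpt-3.5-turbo": ChatOpenAI(temperature=0),
    "llama-2-70b": DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf"),
}

results = {}
for name, llm in backends.items():
    # Assumption: redefining the default LLM is enough to switch backends
    GlobalSettings.define_settings(default_llm=llm)
    results[name] = FakeCompanyGenerator().generate(company_business="sells cookies")

# Compare the structured outputs, e.g. to cross-check for hallucinations
for name, company in results.items():
    print(name, company.name, company.headline)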

Another nice and complementary addition to your library would be the ability to get the chain and kwargs without actually running the request against the LLM. Something like:

chain, kwargs = FakeCompanyGenerator().generate(company_business="sells cookies", _mode=RETURN_CHAIN)
chain.run(**kwargs)

That way we could benefit from Chains directly, for example to use recent langchain additions like the LangChain Expression Language.

These two improvements would add strong value to langchain-decorators, IMO.

ju-bezdek (Owner) commented:

Hey, I was kind of busy, so I couldn't take a proper look at this until now.

From the beginning, I was thinking that this should be handled similarly to the ChatMessage type... Actually, I expected this to be handled by langchain, but I checked and it is not, and after reviewing their code, I can see why.

Anyway...
I just released v0.2.0, and this is now supported in the following way:

You can define custom prompt blocks, similarly to how you can define ChatMessage roles for the prompt:

"""
```<prompt:system>
You are a helpful assistant
```
```<prompt:user>
{question}
```
"""

You can now define your own prompt_block_type and provide your own interpretation, for example:

```<prompt:instruction>
Don't forget to be helpful
```

You can control the implementation via prompt_type.
Here is a full working example of how to do that: custom_template_block_bulder_llama2.py

You could also build a dynamic builder that would build the template based on an input kwarg parameter.
You'd need to add this arg to the function arguments, as well as add it to the llm_prompt args like this:

@llm_prompt(control_kwargs=[*SPECIAL_KWARGS, "llm_type"])

and then access llm_type via the kwargs argument in the TemplateBuilder.
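A rough, untested sketch of what that could look like (the SPECIAL_KWARGS import location and the llm_type default are assumptions on my side; the linked example shows the real usage):

from pydantic import BaseModel
from langchain_decorators import llm_prompt, SPECIAL_KWARGS  # import location assumed

class DynamicStyleGenerator(BaseModel):

    # llm_type is declared as a control kwarg, so the template builder can read
    # it from its kwargs argument and pick the matching block interpretation
    @llm_prompt(control_kwargs=[*SPECIAL_KWARGS, "llm_type"])
    def generate(self, company_business: str, llm_type: str = "openai") -> str:
        """
        ```<prompt:system>
        You are a friendly consultant that only communicates using JSON files.
        ```
        ```<prompt:user>
        Generate a fake company that {company_business}
        ```
        """
        return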

Regarding the possibility of getting the chain in order to use the LangChain Expression Language, I'm not sure this makes sense, since LCEL is meant to combine the prompt, template and LLM... this is already handled by the decorators in a different manner, and the two approaches aren't quite conceptually compatible.

However, I admit that there are other benefits to getting the Chain itself. As a matter of fact, I've tried to implement it since the early days, but I couldn't figure out how to do it in a meaningful way (so that it would be of any advantage)...
I don't really like the syntax where you'd get the chain and the kwargs out as a tuple, but I had an idea.
Now you can get a chain that already has everything preconfigured, including all the input args. You can still override them as you would with LLMChain, but you don't need to, and you might just call:

chain = ask.build_chain(question="Where was Schrödinger's cat locked in?")

# call without arguments, as they are preconfigured
chain()

# override any of the standard LLMChain args for the call
chain(inputs={"question": "What is the meaning of life?"}, return_only_outputs=True)

Just be aware that calling the chain returns something different from just executing the prompt function.
You can emulate the native behavior by calling

chain.execute()

or

await chain.aexecute()
