-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support for Llama3 models in BedrockChat #32
Conversation
@fedor-intercom |
@3coins Should be it? just added a llama specific integration test. |
@@ -47,6 +47,42 @@ def convert_messages_to_prompt_llama(messages: List[BaseMessage]) -> str: | |||
) | |||
|
|||
|
|||
def _convert_one_message_to_text_llama3(message: BaseMessage) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose one small change here, plus extra logic to support word in mouthing the assistant turn, for things like agent scratchpads etc living in the assistant turn,
def _convert_one_message_to_text_llama3(message: BaseMessage, lastMessage: bool) -> str:
if isinstance(message, ChatMessage):
message_text = f"<|start_header_id|>{message.role.capitalize()}<|end_header_id|>{message}<|eot_id|>"
elif isinstance(message, HumanMessage):
message_text = f"<|start_header_id|>user<|end_header_id|>{message.content}<|eot_id|>"
elif isinstance(message, AIMessage):
message_text = f"<|start_header_id|>assistant<|end_header_id|>{message.content}"
if not lastMessage:
message_text += "<|eot_id|>"
elif isinstance(message, SystemMessage):
message_text = f"<|start_header_id|>system<|end_header_id|>{message.content}"
else:
raise ValueError(f"Got unknown type {message}")
return message_text
def convert_messages_to_prompt_llama3(messages: List[BaseMessage]) -> str:
"""Convert a list of messages to a prompt for llama3."""
return "<|begin_of_text|>" + "".join(
[_convert_one_message_to_text_llama3(message, i == len(messages) - 1)
for i, message in enumerate(messages)]
) + "<|start_header_id|>assistant<|end_header_id|>"
This would allow stuff along these lines.
messages = [HumanMessage(content="list 5 colors"),AIMessage(content="No, I don't want to!")]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not totally sure I understand the use case.
Shall we do this in a separate PR with a more illustrative example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say we want to write part of our assistant turn to align it to do something. This allows us to preemptively fill the assistant turn, without having added an EOD token such that the generation will start and be disjoint from the partially filled assistant turn.
<|start_header_id|>user<|end_header_id|>Can you generate me some xml data<|eot_id|>
<|start_header_id|>user<|end_header_id|>Can you generate me some xml data<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>Yes here's some xml content, <xml>
Right now this implementation would do this for this use case,
<|start_header_id|>user<|end_header_id|>Can you generate me some xml data<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>Yes here's some xml content, <xml> <|eot_id|>
Causing it to kick to another assistant turn where it's not already aligned to doing what we've prefilled it to do.
Yes, this can go in another PR, this is an edge use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonathancaevans
Thanks for explaining your use case, would you mind opening another issue for this, or even better if you can open a PR to make this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 🚀
Issue described here:
#31
This is a proposal that will allow to query Llama3 models reliably. The main issue was related to the new tokens required Llama3. If not using them, it will return an empty string for long prompts.