Support system prompt natively for Gemini 1.5 #22

Merged
MichaelDoyle merged 2 commits into main from gemini-system-instructions
May 6, 2024

MichaelDoyle
Member

Gemini 1.5 supports system instructions natively, but we do not currently use them. This proposal would fix that.

The main gotcha here is that Gemini implemented systemInstruction as a new input parameter rather than using messages with role=system. Since our GenerateRequest abstraction is currently predicated on that paradigm (role=system), I had to do a bit of legwork. This proposal will:

  • If system instructions are supported, check whether the first message has role=system
  • Transform it into a message with role=user and pass it to the model as systemInstruction
  • Splice the system message off of the history
  • Throw an Error whenever a message with role=system appears anywhere else in the request

Alternatively, we could change GenerateRequest to accept systemInstructions directly.
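
For readers following along, the steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Message` and `GeminiContent` shapes, the `splitSystemInstruction` helper, and the `supportsSystemInstruction` flag are all simplified assumptions, and only the name `toGeminiSystemInstruction` comes from the diff under review.

```typescript
// Hypothetical, simplified shapes; the real request and Gemini types are richer.
type Role = 'system' | 'user' | 'model' | 'tool';

interface Message {
  role: Role;
  content: { text: string }[];
}

interface GeminiContent {
  role: string;
  parts: { text: string }[];
}

// Assumed behavior: the system message is re-labelled as role=user content.
function toGeminiSystemInstruction(message: Message): GeminiContent {
  return {
    role: 'user',
    parts: message.content.map((part) => ({ text: part.text })),
  };
}

// Hypothetical helper that separates a leading system message from the history.
function splitSystemInstruction(
  requestMessages: Message[],
  supportsSystemInstruction: boolean
): { systemInstruction?: GeminiContent; history: Message[] } {
  // Work on a copy so the caller's array is left untouched.
  const history = [...requestMessages];

  if (!supportsSystemInstruction) {
    // Handling for models without native support is out of scope for this sketch.
    return { history };
  }

  let systemInstruction: GeminiContent | undefined;
  if (history[0]?.role === 'system') {
    // Splice the leading system message off of the history.
    const [systemMessage] = history.splice(0, 1);
    systemInstruction = toGeminiSystemInstruction(systemMessage);
  }

  // A system message anywhere else in the request is treated as an error.
  if (history.some((m) => m.role === 'system')) {
    throw new Error('A message with role=system is only allowed first.');
  }

  return { systemInstruction, history };
}
```

Returning both pieces from one helper keeps the "first message only" rule in a single place; whether the real implementation is structured this way is not something this thread confirms.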

@MichaelDoyle MichaelDoyle changed the title Support system prompt for Gemini 1.5 Support system prompt natively for Gemini 1.5 May 3, 2024
@MichaelDoyle MichaelDoyle marked this pull request as draft May 3, 2024 15:34
Collaborator

@mbleigh mbleigh left a comment

Food for thought: I intended ModelInfo to be user-facing indicators of what's supported. So a model that has a synthetic system prompt would still support a system prompt as far as ModelInfo is concerned. I'm not sure how to rationalize that with the need to change model factory behavior based on underlying capability tho...

Comment on lines 325 to 332
if (messages[0].role === 'system') {
  systemInstruction = toGeminiSystemInstruction(messages[0]);
}
Collaborator

I'd recommend just detecting and splicing off the first system message. Something like:

const systemMessage = messages.find(m => m.role === 'system');
if (systemMessage) messages.splice(messages.indexOf(systemMessage), 1);
const systemInstruction = systemMessage ? toGeminiSystemInstruction(systemMessage) : undefined;

Member Author

I had this :) but it seemed to negatively impact the trace, as the system message was no longer part of the "input".

Collaborator

@mbleigh mbleigh May 3, 2024

How about doing const messages = [...request.messages] above to copy the array? Then request.messages won't be modified in place, and the system message should still show up in the trace.
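
To illustrate the suggestion, here is a small sketch showing that splicing a shallow copy leaves the original array untouched. The `Message` and `GenerateRequest` shapes and the `extractSystemMessage` helper are simplified assumptions made for this example, not the project's actual types.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Message {
  role: string;
  text: string;
}

interface GenerateRequest {
  messages: Message[];
}

// Hypothetical helper: removes the first system message from a copy of the history.
function extractSystemMessage(request: GenerateRequest): {
  systemMessage?: Message;
  messages: Message[];
} {
  // Shallow copy: splice() below mutates this array, not request.messages.
  const messages = [...request.messages];
  const index = messages.findIndex((m) => m.role === 'system');
  const systemMessage = index >= 0 ? messages.splice(index, 1)[0] : undefined;
  return { systemMessage, messages };
}

const request: GenerateRequest = {
  messages: [
    { role: 'system', text: 'Be terse.' },
    { role: 'user', text: 'Hi' },
  ],
};

const extracted = extractSystemMessage(request);
// request.messages still holds both messages, so anything that records the
// original request afterwards would still see the system message.
console.log(request.messages.length, extracted.messages.length);
```

The copy is shallow, so the message objects themselves are shared; that is enough here because only the array is spliced and no message is edited.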

@MichaelDoyle
Member Author

> Food for thought: I intended ModelInfo to be user-facing indicators of what's supported. So a model that has a synthetic system prompt would still support a system prompt as far as ModelInfo is concerned. I'm not sure how to rationalize that with the need to change model factory behavior based on underlying capability tho...

Yes, this is a great point. I'll noodle on this some more.

@MichaelDoyle MichaelDoyle force-pushed the gemini-system-instructions branch 2 times, most recently from d820a7f to 46e9642 Compare May 5, 2024 18:19
@MichaelDoyle MichaelDoyle marked this pull request as ready for review May 5, 2024 18:56
@MichaelDoyle MichaelDoyle requested a review from mbleigh May 5, 2024 18:56
@MichaelDoyle
Member Author

Updated supports: systemRole to be a user-facing indicator.

@MichaelDoyle MichaelDoyle force-pushed the gemini-system-instructions branch 3 times, most recently from 9282992 to f012c69 Compare May 5, 2024 19:03
@MichaelDoyle MichaelDoyle merged commit 56d6d94 into main May 6, 2024
4 checks passed
@MichaelDoyle MichaelDoyle deleted the gemini-system-instructions branch May 6, 2024 14:15