Streaming responses for Mistral models are getting truncated #215
3coins pushed a commit that referenced this issue on Oct 2, 2024:
Move yield of metrics chunk after generation chunk

- When using Mistral and streaming is enabled, the final chunk includes a stop_reason. There is nothing to say this final chunk doesn't also include some generated text. The existing implementation would result in that final chunk never getting sent back.
- This update moves the yield of the metrics chunk after the generation chunk.
- Also included a change to include invocation metrics for Cohere models.

Closes #215
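A minimal sketch of the reordering the commit describes. The helper names (`parse_generation_chunk`, `parse_metrics_chunk`) are hypothetical stand-ins, not the actual langchain-aws internals:

```python
# Sketch of the fixed streaming loop. In the old ordering, the loop yielded
# the metrics chunk for the final event and moved on, so any generated text
# carried on that same final event was never sent back.
def stream_chunks(events):
    for event in events:
        generation_chunk = parse_generation_chunk(event)  # may carry text
        metrics_chunk = parse_metrics_chunk(event)  # non-None only on the final event

        # Yield the generation chunk first: the final event can carry both a
        # stop_reason/metrics payload *and* trailing text.
        if generation_chunk is not None:
            yield generation_chunk
        if metrics_chunk is not None:
            yield metrics_chunk
```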
ihmaws added a commit to ihmaws/langchain-aws that referenced this issue on Oct 2, 2024:
- When using Mistral and streaming is enabled, the final chunk includes a stop_reason. There is nothing to say this final chunk doesn't also include some generated text. The existing implementation would result in that final chunk never getting sent back.
- This update moves the yield of the metrics chunk after the generation chunk.
- Also included a change to include invocation metrics for Cohere models.

Closes langchain-ai#215 (cherry picked from commit e2c2f7c)
Thanks @3coins! I have a backport to v0.1.18 ready (untested). I can create a PR once a branch to base off is ready: https://github.com/langchain-ai/langchain-aws/compare/v0.1.18...ihmaws:langchain-aws:dev/ihm/fix-mistral-streaming-v0.1.18?expand=1
Backport into v0.1 here: #222
3coins pushed a commit that referenced this issue on Oct 4, 2024:
_Backport of fix into v0.1 branch_: Move yield of metrics chunk after generation chunk (#216)

- When using Mistral and streaming is enabled, the final chunk includes a stop_reason. There is nothing to say this final chunk doesn't also include some generated text. The existing implementation would result in that final chunk never getting sent back.
- This update moves the yield of the metrics chunk after the generation chunk.
- Also included a change to include invocation metrics for Cohere models.

Closes #215 (cherry picked from commit e2c2f7c)
Description
When using any of the Mistral models in streaming mode, the last token will get truncated if a stop sequence is added.
Reproducible Sample
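A minimal reproduction sketch, assuming `ChatBedrock` from `langchain_aws` with a Bedrock Mistral model ID and a `stop` sequence passed via `model_kwargs`; the specific model ID and stop string are illustrative, not necessarily the reporter's exact setup:

```python
# Illustrative reproduction (assumed setup): stream a Mistral response with
# a stop sequence and watch the tail of the output get cut off.
from langchain_aws import ChatBedrock

llm = ChatBedrock(
    model_id="mistral.mistral-7b-instruct-v0:2",  # any Bedrock Mistral model
    model_kwargs={"stop": ["\n\nHuman:"]},  # adding a stop sequence triggers the truncation
    streaming=True,
)

output = ""
for chunk in llm.stream("Hello, how are you?"):
    output += chunk.content

print(output)  # e.g. "I'm an AI, I don't have feelin" -- last token truncated
```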
Example Truncated Response (Input = "Hello, how are you?")
I'm an AI, I don't have feelin