Updated ollama handler to handle case when multiple tokens are returned at the same time #3
Conversation
Instead of decoding the entire payload, split it up based on the newline character and parse each one independently.
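The fix described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual handler code; the `parse_stream_chunk` helper name and the `"response"` payloads are assumptions for the example.

```python
import json

def parse_stream_chunk(chunk: str):
    """Split a streamed payload on newlines and parse each line as JSON.

    A single chunk from the stream may contain several newline-delimited
    JSON objects, so decoding the whole payload at once can fail.
    """
    results = []
    for line in chunk.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        results.append(json.loads(line))
    return results

# Example: one chunk carrying two responses at once
chunk = '{"response": "def"}\n{"response": " foo"}\n'
print(parse_stream_chunk(chunk))  # [{'response': 'def'}, {'response': ' foo'}]
```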
@ronneldavis By multiple tokens do you mean one streaming response from ollama could actually be multiple responses?
Seems like the ollama folks themselves are handling the streaming response this way.
@zya I face this issue because my output tokens have "\n" in them; it's a code generation model.
You will notice that in a single streaming response, you will see multiple tokens that look like this:
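Something like the following, sketched in Python with illustrative values (real Ollama responses carry additional fields such as `model` and `done`):

```python
import json

# Illustrative: two newline-delimited JSON objects arriving in one chunk,
# where one token is itself a "\n" (common with code generation models).
chunk = '{"response": "\\n"}\n{"response": "    return"}'

# Decoding the whole chunk at once fails, because it is not a single JSON value:
try:
    json.loads(chunk)
except json.JSONDecodeError:
    print("whole-chunk decode fails")

# Parsing line by line succeeds:
tokens = [json.loads(line)["response"] for line in chunk.split("\n") if line.strip()]
print(tokens)  # ['\n', '    return']
```

Note that the newline *inside* a token is JSON-escaped (`\n` within the string), so splitting the chunk on raw newline characters still separates the objects correctly.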
I think the Ollama folks are using the same newline-delimited approach.
Handling the case when multiple tokens are sent by ollama at the same time
Instead of decoding the entire payload, split it up based on the newline character and parse each one independently