Added wrapper to allow word-by-word chat stream responses #107
Conversation
Thanks a lot @ssddanbrown! I pushed a commit to fix the linting issues and make it more type-safe. As for the usage data: it's weird that this is not returned in the final message. I guess we should add a toggle for streaming until OpenAI fixes that in their API.
Another option would be to add https://github.com/dqbd/tiktoken/tree/main/js and tokenize the responses ourselves, so we get a (rough?) idea of the # of tokens used in the streaming mode. But ideally OpenAI would simply add usage to the streaming responses too.
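Roughly, that client-side counting could look like this (untested sketch; the `@dqbd/tiktoken` import and the hard-coded model name are assumptions, not anything in this PR):

```ts
import { encoding_for_model } from "@dqbd/tiktoken";

// Rough token count over the concatenated streamed response text
function countCompletionTokens(fullText: string): number {
  const encoder = encoding_for_model("gpt-3.5-turbo");
  const tokenCount = encoder.encode(fullText).length;
  encoder.free(); // the WASM-backed encoder must be freed explicitly
  return tokenCount;
}
```

This would only approximate what OpenAI bills (it ignores per-message overhead in the prompt), but it would give the "rough idea" mentioned above.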
I added some more changes; there are still two issues, though:
Thanks for your contribution! This feature is now available through the merged PR #152
@all-contributors please add @ssddanbrown for code |
@ssddanbrown already contributed before to code |
These changes update the code around the `/v1/chat/completions` API usage to integrate the `fetchEventSource` code that was previously commented out (and trialled as per the comment here). The `sendRequest` function is updated to handle both streaming and non-streaming requests. A `ChatCompletionResponse` wrapper class has been added to standardise the response and the handling of data back from the two methods of API access, so that a message is cleanly provided back in the application storage format.
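To make the shape of that concrete, the wrapper's role is roughly as follows (the class name is from this PR, but the fields and methods below are a simplified illustration, not the actual implementation):

```ts
// Simplified illustration of the wrapper's role, not the real implementation.
interface Message {
  role: "user" | "assistant";
  content: string;
}

class ChatCompletionResponse {
  private content = "";
  private listeners: ((message: Message) => void)[] = [];

  // Called once with a full non-streaming body, or repeatedly per streamed chunk
  appendContent(text: string): void {
    this.content += text;
    const message = this.getMessage();
    this.listeners.forEach((listener) => listener(message));
  }

  // Lets callers react as the response grows (e.g. updating a stored message)
  onUpdate(listener: (message: Message) => void): void {
    this.listeners.push(listener);
  }

  // The response in the application's message storage format
  getMessage(): Message {
    return { role: "assistant", content: this.content };
  }
}
```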
The `submitForm` function has been updated to use streaming: the stored message is updated as updates come from the API, although we still wait for the full response before TTS output (I thought line-by-line TTS could be a bad experience). The `suggestName` function has been updated to use the new handling but remains non-streaming.

Related to #70
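For context, the streaming request itself boils down to something like the following (heavily simplified; `streamChatCompletion`, `apiKey`, `payload`, and `onToken` are placeholder names, and the real logic lives inside `sendRequest`):

```ts
import { fetchEventSource } from "@microsoft/fetch-event-source";

// Placeholder sketch of the streaming path, not the code as merged.
async function streamChatCompletion(
  apiKey: string,
  payload: Record<string, unknown>,
  onToken: (text: string) => void
): Promise<void> {
  await fetchEventSource("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ ...payload, stream: true }),
    onmessage(event) {
      // OpenAI terminates the stream with a literal "[DONE]" sentinel
      if (event.data === "[DONE]") return;
      const delta = JSON.parse(event.data).choices?.[0]?.delta?.content;
      if (delta) onToken(delta);
    },
  });
}
```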
Considerations
While streaming, we don't get back usage data. That's taken into account, so this doesn't break anything, but usage won't show on these responses.
Due to that, and since whole-response-at-once may be preferable to some, this should probably be an option of some kind, but I was not sure how that should be presented, so I didn't go ahead and implement it. If such an option were added, switching between streaming and non-streaming messages can easily be done via the boolean passed as the second parameter to `sendRequest` within `submitForm`; the rest of the code in the function should work for both request types.

This updates the `addMessage` storage function to return the index of the added message so we have a reference for updating (see the sketch below).

I developed & tested this on Firefox/Linux/Fedora, with some quick testing on Chromium too.
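Putting those pieces together, the flow inside `submitForm` is roughly as follows (illustrative only; `chat` and the exact signatures and call order are placeholders rather than the real code):

```ts
// Illustrative sketch only; names and wiring are simplified placeholders.
// Reserve a slot for the assistant reply; addMessage now returns its index.
const index = addMessage(chat, { role: "assistant", content: "" });

// The boolean second parameter toggles streaming vs. one full response
const response = sendRequest(chat.messages, true);

response.onUpdate((message) => {
  // Keep the stored message in sync as each chunk arrives
  chat.messages[index] = message;
});
```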
No pressure to merge this PR or use this code. I know this wasn't asked for at all, so if you don't want to use this, or want to go in a different direction, that's totally fine and I respect that. I'm just happy to have explored solutions to the issue and to have learned how the streaming API works.
As with my last PR, I'm not really familiar with Svelte or TypeScript, but I have attempted to follow conventions as much as possible. Just shout if you have any questions or if you want me to make any changes.