Fixes token logging in callbacks when streaming=True is used. #241
@wuodar @supreetkt
Btw, this check also seems to be missing from the SagemakerEndpoint class, so there might be empty tokens there as well.
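For illustration, the kind of check being discussed could look roughly like the sketch below. This is not the code from this PR or from any library: `stream_with_token_callback` and its parameters are hypothetical names, and the chunks are plain strings rather than real response objects. It only shows the idea of invoking the token callback when a chunk actually carries text.

```python
from typing import Callable, Iterable, Iterator


def stream_with_token_callback(
    chunks: Iterable[str],
    on_new_token: Callable[[str], None],
) -> Iterator[str]:
    """Yield every chunk, but notify the callback only for non-empty text.

    Hypothetical helper for illustration; not part of any real library.
    """
    for text in chunks:
        if text:  # skip empty chunks, e.g. ones that only carry a stop reason
            on_new_token(text)
        yield text


# Usage: the callback sees two tokens, while the caller still receives all four chunks.
seen = []
streamed = list(stream_with_token_callback(["Hello", "", " world", ""], seen.append))
```

Whether an empty chunk should still be yielded to the caller is a separate design choice; this sketch keeps yielding it and only suppresses the callback.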
@supreetkt I will look separately into the responses to see if we are yielding extra chunks that shouldn't be exposed. As far as this PR is concerned, I believe this fixes the raised issue. Let me know if you have any other suggestions.
@ihmaws raised a similar concern this morning. When I took a look at the generated chunks, they only contained the stop reason, and for our use case an additional callback invocation for the websocket means additional calls, which add to cost. I checked one partner implementation, and it seemed to check only that the output is a string. Here is a sample empty generated chunk:
All this information is already packaged as a part of the final response:
At the end, it's your call based on your experience with more callback options. I can just handle the same logic on my end within our custom implementation of the
I had one more concern: I tried using
Can you provide a link to that implementation?
Yes, I think this can be handled easily by the callbacks based on their use case.
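As a rough sketch of what handling this on the callback side might look like: the class below is hypothetical and is not a LangChain class, it only mirrors the `on_llm_new_token` method name used by callback handlers. The `send` function stands in for whatever forwards tokens downstream (for example a websocket write) and is an assumption.

```python
from typing import Any, Callable


class NonEmptyTokenForwarder:
    """Hypothetical callback-style handler that drops empty tokens."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send  # assumed downstream sink, e.g. a websocket write

    def on_llm_new_token(self, token: Any, **kwargs: Any) -> None:
        # Forward only non-empty strings; ignore "" and non-string values.
        if isinstance(token, str) and token:
            self._send(token)


# Usage: only the two non-empty string tokens reach the sink.
sent = []
handler = NonEmptyTokenForwarder(sent.append)
for tok in ["Hi", "", "!", None]:
    handler.on_llm_new_token(tok)
```

Filtering in the handler keeps the decision with each consumer, so a use case that does want empty tokens can simply skip this check.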
I will tackle this in a separate PR, but in general
Here is the link. This logic allows for an empty string. Also, if you're tackling the Converse API issue in a separate PR, this LGTM.
Fixes #240
Fixes #217
Code to verify
Output