🐛 [Lang] Cohere: Streaming Workflow is not working #200

Closed
roma-glushko opened this issue Apr 11, 2024 · 1 comment · Fixed by #201

@roma-glushko
Member

I tried to test the Cohere streaming workflow and hit an error like this:
[Screenshot 2024-04-11 at 22 12 12]

After looking into the Cohere documentation and some other LLM clients out there, it seems that Cohere does not use the SSE protocol to stream chat chunks. Instead, it sends one serialised JSON chunk per stream line:

https://github.com/BerriAI/litellm/blob/main/litellm/llms/cohere.py#L178-L191

We need to fix this issue to get the streaming workflow working.
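
To illustrate the difference from SSE, here is a minimal sketch of a JSON-per-line reader. It is written in Python only because the linked litellm reference is Python; it is not the fix that landed in #201. The function name `iter_json_lines` and the sample payload (event names and fields) are assumptions made for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of the stream.

    Unlike SSE, there is no "data:" prefix and no blank-line event
    delimiter: each line is a complete JSON document on its own.
    """
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        yield json.loads(line)


# Hypothetical payload: the event names and fields are illustrative only.
sample = io.StringIO(
    '{"event_type": "stream-start", "generation_id": "gen-1"}\n'
    '{"event_type": "text-generation", "text": "Hello"}\n'
    "\n"
    '{"event_type": "stream-end", "finish_reason": "COMPLETE"}\n'
)

events = list(iter_json_lines(sample))
for event in events:
    print(event["event_type"])
```

An SSE parser applied to the same bytes would look for `data:` fields and find none, which is consistent with the failure described above.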

roma-glushko self-assigned this Apr 11, 2024
@roma-glushko
Member Author

@mkrueger12 let's talk about this issue in Discord.

roma-glushko added a commit that referenced this issue Apr 11, 2024
roma-glushko added this to the Glide: Public Preview milestone Apr 11, 2024
roma-glushko changed the title from "[Lang] Cohere: Streaming Workflow is not working" to "🐛 [Lang] Cohere: Streaming Workflow is not working" Apr 11, 2024
roma-glushko added a commit that referenced this issue Apr 15, 2024
roma-glushko added a commit that referenced this issue Apr 15, 2024
…re chat streams correctly (#201)

- Implemented a custom stream reader to correctly handle Cohere streams
- Started handling the stream-start event to propagate generationID to all following chunks
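
The second point can be sketched as follows. This is a hedged Python illustration, not the code from #201: the helper name `attach_generation_id`, the event field names, and the choice to swallow the start event rather than emit it as a chunk are all assumptions.

```python
from typing import Iterable, Iterator, Optional


def attach_generation_id(events: Iterable[dict]) -> Iterator[dict]:
    """Copy the ID announced by the stream-start event onto later chunks."""
    generation_id: Optional[str] = None
    for event in events:
        if event.get("event_type") == "stream-start":
            # Remember the ID; assumed here that this event yields no chunk.
            generation_id = event.get("generation_id")
            continue
        # Every following chunk carries the remembered ID.
        yield {**event, "generation_id": generation_id}


# Hypothetical events, shaped like parsed JSON lines from the stream.
parsed_events = [
    {"event_type": "stream-start", "generation_id": "gen-1"},
    {"event_type": "text-generation", "text": "Hel"},
    {"event_type": "text-generation", "text": "lo"},
]

chunks = list(attach_generation_id(parsed_events))
for chunk in chunks:
    print(chunk["generation_id"], chunk["text"])
```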
roma-glushko added a commit that referenced this issue Apr 16, 2024
Final major improvements to streaming chat workflow. Fixed issues with Cohere streaming chat. 
Expanded and revisited Cohere params in config.

## Changelog

### Added

- 🔧 #195 #196: Set router ctx in stream chunks & handle end of stream in case of some errors (@roma-glushko)
- 🐛🔧 #197: Handle max_tokens & content_filtered finish reasons across OpenAI, Azure and Cohere (@roma-glushko)

### Changed

- 🔧 💥 #198: Expose more Cohere params & fixing validation of provider params in config (breaking change) (@roma-glushko)
- 🔧 #186: Rendering Durations in a human-friendly way (@roma-glushko)

### Fixed

- 🐛 #209: Embed Swagger specs into binary to fix panics caused by missing swagger.yaml file (@roma-glushko)
- 🐛 #200: Implemented a custom JSON-per-line stream reader to read Cohere chat streams correctly (@roma-glushko)
roma-glushko added a commit that referenced this issue Apr 16, 2024
## Summary

✨ Bringing support for streaming chat in Glide (integrated with OpenAI, Azure OpenAI and Cohere)
✨ Started handling 401 errors to mark models as permanently unavailable (e.g. when the API key is not correct)
🐛 Fixing the panic related to swagger.yaml file
🐛 Fixing Anthropic chat workflow by passing API key correctly
🔧 Improved Cohere param config and validation

## Changelog

### Added

- ✨ Streaming Chat Workflow #149 #163 #161 (@roma-glushko)
- ✨ Streaming Support for Azure OpenAI #173 (@mkrueger12)
- ✨ Cohere Streaming Chat Support #171 (@mkrueger12)
- ✨ Start counting token usage in Anthropic Chat #183 (@roma-glushko)
- ✨ Handle unauthorized error in health tracker #170 (@roma-glushko)
- 🔧 #195 #196: Set router ctx in stream chunks & handle end of stream in case of some errors (@roma-glushko)
- 🐛🔧 #197: Handle max_tokens & content_filtered finish reasons across OpenAI, Azure and Cohere (@roma-glushko)

### Changed

- 🔧 💥 #198: Expose more Cohere params & fixing validation of provider params in config (breaking change) (@roma-glushko)
- 🔧 #186: Rendering Durations in a human-friendly way (@roma-glushko)

### Fixed

- 🐛 Fix Anthropic API key header #183 (@roma-glushko)
- 🐛 #209: Embed Swagger specs into binary to fix panics caused by missing swagger.yaml file (@roma-glushko)
- 🐛 #200: Implemented a custom JSON-per-line stream reader to read Cohere chat streams correctly (@roma-glushko)

### Security

- 🔓 Update crypto lib, golang, fiber #148 (@roma-glushko)

### Miscellaneous

- 🐛 Update README.md to fix helm chart location #167 (@arjunnair22)
- 🔧 Updated .go-version (@roma-glushko)
- ✅ Covered the telemetry by tests #146 (@roma-glushko)
- 📝 Separate and list all supported capabilities per provider #190 (@roma-glushko)