
Changes to lmstudio to fix JSON decode error #208

Merged · 5 commits into letta-ai:main · Oct 31, 2023

Conversation

raisindetre (Contributor):

Hi @cpacker. I've made some tweaks to the LM Studio settings.py and api.py code which (for me at least) have resolved the issues with using LM Studio as a back-end.

  • Changed the endpoint to use http://localhost:1234/v1/chat/completions (the resulting request shape is sketched below).
  • Set stream to false per the LM Studio curl API example, as I think it might provide a performance gain.
  • Rewrapped the prompt JSON object within a messages object for compatibility with the endpoint and updated the reference to the response text accordingly.
  • Set the context_overflow_policy to option 2 (rolling context window).

In local testing I'm getting the expected behavior in chats and much improved performance. Might be worth checking the code on Windows for compatibility (I don't have access at the moment).
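Putting those changes together, here is a minimal sketch of the resulting request, assuming nothing beyond what's described above; the helper name and prompt-wrapping details are illustrative, not the PR's exact code.

```python
import requests

# New OpenAI-compatible chat endpoint (was a completions-style endpoint).
LMSTUDIO_CHAT_URI = "http://localhost:1234/v1/chat/completions"

def get_lmstudio_chat_completion(prompt: str) -> str:
    payload = {
        # Prompt rewrapped inside a messages object for the chat endpoint.
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # per the LM Studio curl API example
        "max_tokens": 3072,
        # Option 2 = rolling context window (LM Studio-specific field).
        "lmstudio": {"context_overflow_policy": 2},
    }
    resp = requests.post(LMSTUDIO_CHAT_URI, json=payload)
    resp.raise_for_status()
    # The response text now lives at choices[0].message.content rather
    # than choices[0].text as in the completions-style API.
    return resp.json()["choices"][0]["message"]["content"]
```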

@@ -9,5 +9,8 @@
         # '\n#',
         # '\n\n\n',
     ],
-    "max_tokens": 500,
+    "max_tokens": 3072,
+    "lmstudio": {"context_overflow_policy": 2},
Collaborator:

Ideally we shouldn't really be using any context overflow policies, since MemGPT has its own way of handling this and this might conflict. But if this works as a bandaid fix for some other bug wrt open LLM integration I'm OK with leaving it for now.
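For concreteness, a minimal sketch of the framework-side alternative being described, where the backend reports overflow and the framework reacts to it; the ContextOverflowError name and the error-text matching are hypothetical, though the pattern mirrors the later commit that propagates a recognizable context-overflow error up the stack.

```python
import requests

class ContextOverflowError(Exception):
    """Recognizable error so the caller can run its own context management."""

def post_chat_completion(uri: str, payload: dict) -> dict:
    resp = requests.post(uri, json=payload)
    if resp.status_code != 200:
        # Surface overflow distinctly instead of letting the backend
        # silently roll or truncate the context window.
        if "context" in resp.text.lower():
            raise ContextOverflowError(resp.text)
        resp.raise_for_status()
    return resp.json()
```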

raisindetre (Contributor, author):

Ah ok. Well, it may work ok without it. As you say, I think it's the endpoint change that's doing the heavy lifting for... reasons lol.

properly handle context overflow error (propagate exception up the stack with recognizable error message) + add backwards compat option to use completions endpoint
cpacker merged commit a048a33 into letta-ai:main on Oct 31, 2023
1 check passed
mattzh72 pushed a commit that referenced this pull request on Oct 9, 2024
* Changes to lmstudio to fix JSON decode error

* black formatting

* properly handle context overflow error (propagate exception up the stack with recognizable error message) + add backwards compat option to use completions endpoint

* set max tokens to 8k, comment out the overflow policy (use memgpt's overflow policy)

* 8k not 3k

---------

Co-authored-by: Matt Poff <[email protected]>
Co-authored-by: cpacker <[email protected]>