OpenedAI API Issue #3910
Comments
I meant 2080 Ti, not 3080.
Update: I fixed it by editing config.yaml and adding `truncation_length: <length>` to the model I was trying to use. Now I just want to know why almost every response gets shorter and shorter each time.
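For reference, a minimal sketch of what that config.yaml entry might look like, assuming the per-model pattern-key layout that text-generation-webui uses for its model config; the model-name pattern and the value 4096 are placeholders, not values from this thread:

```yaml
# models/config.yaml (or config-user.yaml) -- hypothetical entry
.*my-model.*:           # regex matched against the loaded model's name
  truncation_length: 4096   # max context length to truncate the prompt to
```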
The truncation length problem is a known issue, see: #3153
Yes, that's how I fixed it. What I'm wondering now is why the response seems to get shorter every time.
@matatonic do you see any light at the end of the tunnel for the truncation length fix? Just curious, not pressing. I'm helping out with something where our main go-to for local LLM generation is textgenwebui's openai extension (thanks so much for making it!)
I do, it will get fixed. I had a fix previously (which no longer works). I am currently on holiday and will spend more active time on this near the end of September.
Thanks!! I really appreciate all your work on this; it's brilliant. I've tested a lot of solutions and textgenwebui + the openai extension is by far the easiest to get going. Have a good holiday!
Can you explain this in some more detail? Maybe with an example and using OPENEDAI_DEBUG=1? I'm not sure what this could mean.
That's my bad, it was a problem with my code and how I was calling it iteratively. Now I'm back to the truncation issue, lol.
It took me a bit to figure out that the openai API extension ignores most of the parameters set in the config files that both the webui and the regular API use. A workaround for the max context problem is to hardcode the value in completions.py for the openai extension, as mentioned here:
I also had to hardcode other model parameters in completions.py (a rough sketch of that kind of hardcoding is below).
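The sketch below only illustrates the idea of pinning the context size instead of reading it from the model settings; the names and the value 4096 are placeholders, and the real extensions/openai/completions.py is structured differently:

```python
# Hypothetical illustration of the hardcoding workaround described above.
# Instead of taking the context size from the (ignored) model config,
# pin it to a fixed value and clamp the prompt before generation.
TRUNCATION_LENGTH = 4096  # hardcoded max context for the loaded model


def clamp_prompt(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Drop the oldest tokens so prompt + generated tokens fit in the context."""
    budget = TRUNCATION_LENGTH - max_new_tokens
    return prompt_tokens[-budget:] if len(prompt_tokens) > budget else prompt_tokens
```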
@matatonic is there any way to make the OpenedAI API publicly available, i.e. so that I could use it from React Native code?
Did you guys solve the multi-user asynchronous feature in the API? If yes, how?
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Describe the bug
Every time I generate something too long, it says this:
raise self.handle_error_response(
openai.error.InvalidRequestError: This model maximum context length is 2048 tokens. However, your messages resulted in over 2054 tokens.
How can I set the token limit?
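One client-side way to avoid this error (not the extension's own fix) is to trim old chat messages so the prompt plus the requested completion stays under the model's context window. A minimal sketch, assuming the legacy openai<1.0 Python client (matching the `openai.error.InvalidRequestError` above) and tiktoken for counting; the api_base address, the model name, and the cl100k_base encoding are assumptions, and the token count is only an approximation of the local model's tokenizer:

```python
import openai
import tiktoken

openai.api_base = "http://127.0.0.1:5001/v1"  # assumed address of the openai extension
openai.api_key = "sk-dummy"                   # the extension does not check the key

CONTEXT_LIMIT = 2048    # the limit reported in the error above
MAX_NEW_TOKENS = 256    # leave room for the reply

enc = tiktoken.get_encoding("cl100k_base")


def count_tokens(messages):
    # Rough token count over message contents only.
    return sum(len(enc.encode(m["content"])) for m in messages)


def trim(messages):
    # Drop the oldest non-system messages until the prompt fits the budget.
    messages = list(messages)
    while count_tokens(messages) > CONTEXT_LIMIT - MAX_NEW_TOKENS and len(messages) > 1:
        messages.pop(1)
    return messages


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "..."},
]

reply = openai.ChatCompletion.create(
    model="local-model",
    messages=trim(history),
    max_tokens=MAX_NEW_TOKENS,
)
```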
Is there an existing issue for this?
Reproduction
All I did was make multiple calls to the API using different methods.
Screenshot
No response
Logs
System Info