-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server Missing OpenAI API Support? #24
Comments
I cherry-picked OpenAI compatibility yesterday in 401dd08. It hasn't been incorporated into a release yet. I'll update this issue when the next release goes out. The llamafiles on Hugging Face will be updated too. |
There a will be new Server binaries? or we can use with already downloaded ones like mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile. It will be great if we won't need to re-download the whole 4GB file. |
I've just published a llamafile 0.2 release https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.2 The downloads on Hugging Face will be updated in a couple hours.
You don't have to redownload. Here's what you can try:
So it takes a bit more effort than redownloading. But it's a great option if you don't have gigabit Internet. |
OK I've uploaded all the new .llamafiles to Hugging Face, for anyone who'd rather just re-download.
Enjoy! |
@jart thanks, I followed the instructions you provided and got a v0.2 llamafile server binary. Now when I start the server (on Mac M1) then try the curl command from llama.cpp/server/README.md the server crashes consistently with this error llama.cpp/server/json.h:21313: assert(it != m_value.object->end()) failed (cosmoaddr2line /Applications/HOME/Tools/llamafile/llamafile-server-0.2 1000000fe3c 1000001547c 100000162e8 10000042748 1000004ffdc 10000050cb0 1000005124c 100000172dc 1000001b370 10000181e78 1000019d3d0)
[1] 34103 abort ./llamafile-server-0.2 |
First of all, @jart , thank you!!! We are getting close:
But as @dzlab mentions, there is an assertion failure during
llamafile/llama.cpp/server/json.h Lines 21305 to 21318 in 73ee0b1
|
This one is working for me: https://huggingface.co/jartine/mistral-7b.llamafile/blob/main/mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile I'm using https://github.com/simonw/llm to connect to it, so not sure of the exact requests it's making.
|
The request reported in the issue seems to work too:
|
@dave1010 glad to hear it's working for you! @jasonacox could you post a new issue sharing the curl command you used that caused the assertion to fail? |
For completeness in case it helps, this curl command works fine for me too: llama.cpp/server/README.md
The server logs:
Some debugging info in case it's helpful:
|
@dave1010 Thank you! This helped me narrow in on the issue. I am able to get this model to run with all the API curl examples with no issue on my Mac (M2). The assertion error only shows up on my Linux Ubuntu 22.04 box (using CPU only and with a GTX 3090 GPU).
Will do! I'll open it up focused on Linux. |
#412 now merged in which will give you the option of using This is done simply by calling Usage Example / Expected Console Output
|
The server presents the UI but seems to be missing the APIs?
The example test:
Results in a 404 error:
The text was updated successfully, but these errors were encountered: