No API Documentation #265
Comments
We are using the upstream llama.cpp server by default: https://github.com/ggerganov/llama.cpp/tree/master/examples/server and it does say on that page: "looking for feedback and contributors". But there is also a --runtime flag, where the intent is switchable servers; vllm is one that we will support in the future, but it is only implemented in --nocontainer mode right now, so one must set up vllm themselves.
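To make that concrete, here is a minimal sketch of talking to the default llama.cpp server that `ramalama serve` starts. It assumes the OpenAI-compatible `/v1/chat/completions` route documented in the llama.cpp server README and the port used later in this thread; it is an illustration, not ramalama-specific documentation:

```bash
# Start serving a model (port chosen to match the example in this thread).
ramalama serve -p 11434 llama3.2

# In another shell, query the llama.cpp server's OpenAI-compatible endpoint.
# Endpoint and payload shape follow the llama.cpp server docs, not ramalama's own API.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```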
Good to know! Thank you for sharing that. It would be really helpful to have a link to the API documentation on the serve page or in the README, because it was not at all obvious that I needed to go look at the llama.cpp REST documentation when I assumed it would be just like the ollama API, haha.
Thank you for straightening me out on this one; you can close this issue. I really appreciate it.
Care to open a PR to make this point in the README.md and potentially in the ramalama-serve.1.md file?
Finally had a free minute to fork and open a PR; sorry it took so long. The PR is up at #274.
Thanks for merging the PR! I'll close this issue now. Cheers!
This was billed as "ollama compatible", but when I run

ramalama serve -p 11434 llama3.2

my client code that works with ollama does NOT work (posting to /api/chat returns 404, and I see the POST hit ramalama in the console as well). Where is the API documentation for the actual API served by ramalama? 🙏
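For reference, the kind of ollama-style request that was returning 404 looks roughly like this; it is a sketch based on the ollama /api/chat request format, not a transcript from the issue:

```bash
# Ollama-style chat request. The default llama.cpp runtime behind
# `ramalama serve` does not expose /api/chat, hence the 404 described above.
curl -s http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```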