newb, looking for a bit more clarity on how to get rolling #154
Exposing the functionary model via LM Studio (via its OpenAI API server) doesn't seem to work; I just get plain chat responses back rather than function calls. In any case, I next tried exposing the model via the llama.cpp server per the doc:

then running a modified chatlab example:
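The example is essentially the README's chatlab snippet pointed at the local server. I don't have the exact code handy, but it's roughly this shape (the port, model name, registered function, and the specific chatlab calls are placeholders from memory, not the verbatim code I ran):

```python
# Assumed setup: the llama.cpp-based server from the doc is already running and
# serving an OpenAI-compatible API on localhost:8000 (placeholder address).
import openai
from chatlab import Conversation  # chatlab==0.16.0 per the README note

openai.api_key = "functionary"                # dummy key; the local server ignores it
openai.api_base = "http://localhost:8000/v1"  # point the (pre-1.0) openai client at the local server

def get_car_price(car_name: str):
    """Get the price of a car given its name."""
    # placeholder implementation for the demo
    return {"price": "$20,000" if car_name.lower() == "tang" else "unknown"}

conversation = Conversation(model="meetkai/functionary-7b-v1.4")  # placeholder model name
conversation.register(get_car_price)   # expose the function to the model
await conversation.submit("What is the price of the car named Tang?")  # run in a notebook cell
```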
With that running, I get the following in the llama.cpp server stdout:

which relates to your note in the doc. Not sure where to go from here.
Hi, thank you for your interest in our model.
Thanks for the response. No, it doesn't require streaming; I'll try this.
Doing that, I get back:
Can you try with the latest version of llama-cpp-python? I think their developers previously made some changes that caused this Pydantic error.

I'm on the latest official version, v0.2.61.
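For what it's worth, here is how I checked the installed version (just a quick sanity check):

```python
# Verify which llama-cpp-python build is actually importable in this environment.
import llama_cpp

print(llama_cpp.__version__)  # prints 0.2.61 here
```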
Hi - overall I'm getting my feet wet in the LLM world; I came across this project via numerous references and am interested in trying it out. I've reviewed the docs, which appear to be the same as what is in the README, and I just need some guidance. There are lots of moving parts and different projects being referenced, and it's a bit overwhelming.
I am on a Mac M3, so it looks like the vLLM example is a no-go for me.
I successfully ran the llama.cpp inference example locally, and it produces the function-call JSON as expected.
I'd like to try the chatlab example, but it won't run. Per the note in the README, I can't install chatlab==0.16.0, and the latest chatlab version yields an import error on Conversation.

So here is what I'm trying to achieve, which isn't clear to me from the docs:
What should I run to get the functionary model exposed over the OpenAI API interface? I assume something like the llama.cpp example that provides inference over the functionary model, but long-lived, running in the background as a process? Or perhaps I could achieve the same by running the functionary model in something like LM Studio's server?
Then, in a second process, is that where I'd run code acting as the client, mediating between a user and the llama.cpp endpoint (such as chatlab, if I can get it running)?
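To make the question concrete, this is roughly the shape I have in mind. The port, model name, and function schema below are placeholders, and the client side just uses the plain pre-1.0 openai package rather than chatlab:

```python
# Process 1 (long-lived, in the background): whatever server the doc prescribes,
# exposing the functionary model over an OpenAI-compatible API, assumed here to
# listen on localhost:8000.

# Process 2: the client that mediates between the user and that endpoint.
import openai

openai.api_key = "functionary"                # dummy key; the local server ignores it
openai.api_base = "http://localhost:8000/v1"  # placeholder address of the local server

response = openai.ChatCompletion.create(
    model="meetkai/functionary-7b-v1.4",      # placeholder model name
    messages=[{"role": "user", "content": "What is the weather in Istanbul?"}],
    functions=[{                              # placeholder function schema
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }],
)
print(response.choices[0].message)  # hoping for a function_call here, not plain text
```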
Thank you!