
newb, looking for a bit more clarity on how to get rolling #154

Open
bitsofinfo opened this issue Apr 11, 2024 · 5 comments

@bitsofinfo

bitsofinfo commented Apr 11, 2024

Hi - I'm getting my feet wet in the LLM world, came across this project via numerous references, and am interested in trying it out. I've reviewed the docs (they appear to be the same as the README) and just need some guidance. There are a lot of moving parts and different projects being referenced, and it's a bit overwhelming.

  1. I am on a Mac M3, so it looks like the vLLM example is a no-go for me.

  2. I successfully ran the llama.cpp inference example locally and it produced the function-call JSON as expected.

  3. I'd like to try the chatlab example, but it won't run. Per the note in the README, I can't install chatlab==0.16.0, and the latest chatlab version yields an import error on Conversation.
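
(If it's just that newer chatlab releases renamed Conversation to Chat, I'm guessing something like this is the modern equivalent, though I haven't confirmed it:)

from chatlab import Chat  # newer chatlab releases appear to expose Chat instead of Conversation
chat = Chat(model="meetkai/functionary-small-v2.4", base_url="http://localhost:8000/v1")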

So here is what I'm trying to achieve, and what is not clear to me from the docs.

  1. What should I run to expose the Functionary model over the OpenAI API interface? I assume something like the llama.cpp example that serves the Functionary model, but long-lived, running in the background as a process? Could I achieve the same by running the Functionary model in something like LM Studio's server?

  2. Then, in a second process, I'd run the client code that mediates between the user and that llama.cpp endpoint (for example chatlab, if I can get it running)? Roughly the sketch below is what I have in mind.
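
(An untested sketch on my part; it assumes the llama.cpp server from the docs is running on its default port 8000, and uses the plain openai>=1.0 client instead of chatlab, with a toy get_car_price tool just for illustration:)

# process 1 (background): the llama.cpp OpenAI-compatible server from the docs, i.e.
#   python3 -m llama_cpp.server --model ... --chat_format functionary-v2 --hf_pretrained_model_name_or_path ...
#
# process 2: a client that talks to it over the OpenAI API
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="functionary")

# toy tool schema, just for illustration
tools = [{
    "type": "function",
    "function": {
        "name": "get_car_price",
        "description": "Get the price of a car given its name",
        "parameters": {
            "type": "object",
            "properties": {"car_name": {"type": "string"}},
            "required": ["car_name"],
        },
    },
}]

response = client.chat.completions.create(
    model="meetkai/functionary-small-v2.4",
    messages=[{"role": "user", "content": "What is the price of the car named tang?"}],
    tools=tools,
    stream=False,
)
print(response.choices[0].message.tool_calls)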

thank you!

@bitsofinfo changed the title from "newbied, looking for a bit more clarity on how to get rolling" to "newb, looking for a bit more clarity on how to get rolling" on Apr 11, 2024
@bitsofinfo
Author

Exposing the Functionary model via LM Studio (through its OpenAI API server) doesn't seem to work; I just get back plain text responses rather than function calls.

In any case, I next tried exposing the model via the llama.cpp server per the docs:

python3 -m llama_cpp.server \
    --model path/to/functionary/functionary-small-v2.4.Q8_0.gguf \
    --chat_format functionary-v2 \
    --hf_pretrained_model_name_or_path path/to/functionary
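
(With the server defaults, a quick sanity check that it's up; I'm assuming the default port of 8000 here:)

curl http://localhost:8000/v1/models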

Then I ran a modified chatlab example:

import openai
import os
import asyncio
import chatlab
from pydantic import BaseModel
from typing import Optional

async def main():

    openai.api_key = "functionary"  # just needs to be set to something other than None
    os.environ['OPENAI_API_KEY'] = "functionary"  # chatlab requires this too
    os.environ['OPENAI_API_BASE'] = "http://localhost:8000/v1"  # point chatlab at the llama.cpp server
    openai.base_url = "http://localhost:8000/v1"

    # the function to expose to the model, with a docstring description
    def get_car_price(car_name: Optional[str] = None):
        """this function is used to get the price of the car given the name
        :param car_name: name of the car to get the price
        """
        car_price = {
            "tang": {"price": "$20000"},
            "song": {"price": "$25000"}
        }
        if not car_name:
            return {"price": "unknown"}
        for key in car_price:
            if key in car_name.lower():
                return {"price": car_price[key]}
        return {"price": "unknown"}


    # pydantic schema describing the function's parameters
    class CarPrice(BaseModel):
        car_name: Optional[str] = None


    chat = chatlab.Chat(model="meetkai/functionary-small-v2.4", base_url="http://localhost:8000/v1")

    # Register our function
    f = chat.register(get_car_price, CarPrice)
    print(f)


    await chat.submit("What is the price of the car named tang?")  # submit user prompt
    print(chat.messages)


if __name__ == "__main__":
    asyncio.run(main())

I get this in the llama.cpp server stdout:

  File "/path/to/rnd/ai/functionary/functionary.ve/lib/python3.12/site-packages/llama_cpp/llama.py", line 1655, in create_chat_completion
    return handler(
           ^^^^^^^^
  File "/rnd/ai/functionary/functionary.ve/lib/python3.12/site-packages/llama_cpp/llama_chat_format.py", line 1880, in functionary_v1_v2_chat_handler
    assert stream is False  # TODO: support stream mode
    ^^^^^^^^^^^^^^^^^^^^^^
AssertionError
INFO:     ::1:59817 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error

This relates to the note in the docs that llama-cpp-python's OpenAI-compatible server does not support streaming for Functionary models yet as of v0.2.50.

not sure where to go from here.

@jeffreymeetkai
Collaborator

jeffreymeetkai commented Apr 13, 2024

Hi, thank you for your interest in our model.

  1. LM Studio is not fully compatible with Functionary in terms of function calling, as mentioned here. Fortunately, there is an OpenAI-compatible server in llama-cpp-python that specifically supports Functionary models, which I see you have already tried.
  2. I think the submit method in chatlab calls the llama-cpp-python server with streaming turned on by default, and streaming is not yet supported in the llama-cpp-python integration. This explains the error you encountered. You can call the submit method with stream=False and it will work (see the snippet below). FYI, I'm using chatlab==1.3.0; not sure if it's easier with the latest versions of chatlab, will check on this soon. Hopefully your use case doesn't require streaming!
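
Something like this (the only change is the extra keyword argument):

await chat.submit("What is the price of the car named tang?", stream=False)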

@bitsofinfo
Author

Thanks for the response. No, it doesn't require streaming; I'll try the stream=False option.

@bitsofinfo
Author

Doing that, I get back:

    await chat.submit("What is the price of the car named tang?", stream=False) # submit user prompt
    print(chat.messages)
{
    "error": {
        "message": "pydantic validation errors for ('body', 'messages', 2, 'typed-dict'); the input in each case is the assistant message
            {'role': 'assistant', 'tool_calls': [{'id': 'call_fKVkfhXbvnlVtL6nBehEJzpJ', 'function': {'name': 'get_car_price', 'arguments': '{\"car_name\": \"tang\"}'}, 'type': 'function'}]}:
            literal_error at ('body', 'messages', 2, 'typed-dict', 'role'): Input should be 'system', input was 'assistant'
            missing at ('body', 'messages', 2, 'typed-dict', 'content'): Field required
            literal_error at (..., 'role'): Input should be 'user', input was 'assistant'
            missing at (..., 'content'): Field required
            missing at (..., 'content'): Field required
            literal_error at (..., 'role'): Input should be 'tool', input was 'assistant'
            missing at (..., 'content'): Field required
            missing at (..., 'tool_call_id'): Field required
            literal_error at (..., 'role'): Input should be 'function', input was 'assistant'
            missing at (..., 'content'): Field required
            missing at (..., 'name'): Field required
            (each entry links to https://errors.pydantic.dev/2.6/v/literal_error or https://errors.pydantic.dev/2.6/v/missing)",
        "type": "internal_server_error",
        "param": None,
        "code": None
    }
}

@jeffreymeetkai
Collaborator

Can you try the latest version of llama-cpp-python? I think their developers previously made some changes that caused this pydantic error. I'm on the latest official version, v0.2.61.
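
For example, either upgrading to the latest release or pinning the version mentioned above:

pip install -U llama-cpp-python
# or
pip install llama-cpp-python==0.2.61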
