[Usage]: Confirm tool calling is not supported and this is the closest thing that can be done #7912
Comments
+1
Any updates on this topic?
Supposedly, #8343 solved it. I haven't tried it yet. I'll give it a go in the next release.
Thank you @summersonnn. I'll give it a try with the latest vLLM version and see whether it works. Based on this comment, I understand I will have to use the following flags in the serve command:
Is this correct?
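For context, here is a minimal sketch of the setup I have in mind, assuming the flag names from the vLLM tool-calling docs (they may differ between versions) and a made-up `get_current_weather` tool that is only there for illustration:

```python
# Sketch only: the flag names below are taken from the vLLM tool-calling docs
# and may vary by version.
# Server side (run in a shell):
#   vllm serve meta-llama/Llama-3.1-8B-Instruct \
#       --enable-auto-tool-choice \
#       --tool-call-parser llama3_json \
#       --chat-template examples/tool_chat_template_llama3.1_json.jinja
#
# Client side: send a tool-calling request through the OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Illustrative tool definition (not from the original comment).
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What is the weather like in Berlin?"}],
    tools=tools,
    tool_choice="auto",
)

# If the parser recognised a tool call, it shows up here instead of plain text.
print(response.choices[0].message.tool_calls)
```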
I can confirm that tool calling works with the configuration given above. Here is an example of the request and the answer given by the model:
Request
Response
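To sketch the shape of that exchange, an illustrative request/response pair looks roughly like the following; the values are made up rather than taken from my actual run, and the structure simply follows the OpenAI-compatible chat completions schema that vLLM returns for tool calls:

```python
# Illustrative only: field values are invented; the layout mirrors the
# OpenAI-compatible schema used by vLLM's tool-call responses.
request_body = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "What is the weather like in Berlin?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}

response_body = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "chatcmpl-tool-abc123",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": "{\"city\": \"Berlin\"}",
                },
            }],
        },
    }],
}
```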
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you! |
Hi.
LLM -> Llama-3.1-8B-Instruct
In the vLLM docs, it is said that:
For example, when this code (from the AutoGen docs) is run:
it is supposed to produce the following output, which includes actually executing the function (I'm showing just a part of it):
But when I run it against my local LLM with the vLLM backend, it does not execute the function; it replies with plain text instead (again, just a part of it):
As you can see, the local LLM's response sometimes starts with "<|python_tag|>", in fact most of the time. This is not specific to AutoGen; I encountered the same behaviour without using any third-party framework or library. Even though I tried my best to hide this token by editing some lines in the config JSON files (special_tokens etc.), I failed. Any solution to this?
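As a stopgap, the literal token can of course be stripped on the client side after the reply comes back; a minimal sketch of that (the helper name is mine, not anything from vLLM or AutoGen):

```python
# Minimal stopgap sketch: remove the stray Llama 3.1 special token from a reply
# before handing it to anything downstream. The helper name is made up.
PYTHON_TAG = "<|python_tag|>"

def strip_python_tag(text: str) -> str:
    """Drop the literal <|python_tag|> marker wherever it appears in the reply."""
    return text.replace(PYTHON_TAG, "").strip()

print(strip_python_tag('<|python_tag|>{"name": "get_current_weather"}'))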
My best attempt at integrating auto tool calling with vLLM is this:
I added a "default function" to the tools made available to Llama. The model is supposed to call this whenever none of the other functions is appropriate.
And here is the heart of the code that does what I want. For now, I don't actually call the function, but the response is the function call with its full signature, so only the actual call is missing. I just wanted to be sure whether this is the best we can do with vLLM right now:
Here is an example output. The very last line is the function call to be made after manipulating the model's response:
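For completeness, a condensed sketch of the idea; the function names, the dispatch table, and the prompt wording are placeholders rather than my exact code: describe the tools, including a catch-all "default function", in the system prompt, ask the model to answer with a single JSON function call, then parse that reply and dispatch it yourself.

```python
# Condensed sketch of the manual approach described above; names are placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def get_current_weather(city: str) -> str:
    return f"It is sunny in {city}."   # stand-in implementation

def normal_chat_reply(message: str) -> str:
    return message                     # the "default function" fallback

# The model only sees the tool descriptions in the system prompt and is asked
# to reply with a single JSON object: {"name": ..., "arguments": {...}}.
system_prompt = (
    "You can call exactly one of these functions and must reply with JSON only, "
    'in the form {"name": <function name>, "arguments": {...}}:\n'
    "- get_current_weather(city: str): look up the weather for a city\n"
    "- normal_chat_reply(message: str): use this when no other function fits\n"
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is the weather like in Berlin?"},
    ],
)

# Strip the stray special token, then parse the textual "function call".
raw = completion.choices[0].message.content.replace("<|python_tag|>", "").strip()
call = json.loads(raw)                 # may raise if the model ignores the format

# The missing last step: actually dispatching the call.
dispatch = {
    "get_current_weather": get_current_weather,
    "normal_chat_reply": normal_chat_reply,
}
result = dispatch[call["name"]](**call["arguments"])
print(result)
```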
Many thanks.