
feat: provide alternative local/offline solution #56

Closed · Robitx opened this issue Nov 12, 2023 · 11 comments

Comments
@Robitx (Owner) commented Nov 12, 2023

Ideally via Docker containers providing a callable API.

@tarruda (Contributor) commented Dec 2, 2023

At least for the "chat" function, this already works. I've tested the plugin with OpenChat 3.5 running with llama-cpp-python, which has an OpenAI-compatible web server. Here's my config:

    local default_config = require('gp.config')

    -- copy the default agents and append a local OpenChat agent
    -- (unpack must be the last expression in a table constructor,
    -- otherwise only its first value is kept)
    local agents = { unpack(default_config.agents) }
    table.insert(agents, {
      name = "OpenChat3-5",
      chat = true,
      command = false,
      model = { model = 'openchat_3.5.Q6_K', temperature = 0.5, top_p = 1 },
      system_prompt = default_config.agents[1].system_prompt
    })

    require('gp').setup {
      openai_api_key = 'dummy',
      openai_api_endpoint = 'http://127.0.0.1:8000/v1/chat/completions',
      cmd_prefix = 'Gp',
      agents = agents,
      chat_topic_gen_model = 'openchat_3.5.Q6_K'
    }

Note that the model parameter is ignored by llama-cpp-python.

I'm not familiar with all the plugin features, but I will be sure to try the "command" later.

@Robitx (Owner, Author) commented Dec 18, 2023

@johnallen3d commented Jan 31, 2024

EDIT: this was user error, please disregard.

I'm trying this out with LM Studio and it almost works. LM Studio is throwing an error because the content value for the assistant is empty.

[2024-01-30 20:43:15.630] [INFO] Received POST request to /v1/chat/completions with body: {
  "model": "gpt-3.5-turbo-16k",
  "stream": true,
  "messages": [
    {
      "role": "system",
      "content": "You are a general AI assistant.\n\nThe user provided the additional info about how they would like you to respond:\n\n- If you're unsure don't guess and say you don't know instead.\n- Ask question if you need clarification to provide better answer.\n- Think deeply and carefully from first principles step by step.\n- Zoom out first to see the big picture and then zoom in to details.\n- Use Socratic method to improve your thinking and coding skills.\n- Don't elide any code from your output if the answer requires coding.\n- Take a deep breath; You've got this!"
    },
    {
      "role": "user",
      "content": "Tell me a joke"
    },
    {
      "role": "assistant",
      "content": ""
    },
    {
      "role": "user",
      "content": "Summarize the topic of our conversation above in two or three words. Respond only with those words."
    }
  ]
}
[2024-01-30 20:43:15.630] [ERROR] [Server Error] {"title":"'messages' array must only contain objects with a 'content' field that is not empty"}

I'm not sure that I have any influence over this in terms of configuration.

@Robitx (Owner, Author) commented Jan 31, 2024

@johnallen3d Hey, this seems weird. I've just tried LM Studio (on Linux) against the main branch by just changing openai_api_endpoint = "http://localhost:1234/v1/chat/completions", and it works fine.

If the chat # topic header is not set, the plugin makes two calls: the first provides the answer to the user's message, and the second generates the topic name. The call you've provided seems to be the one generating the chat # topic header, while the empty assistant content should have been filled in during the first call to the endpoint.

Could you provide more info? The log for the first call if visible, the result of :GpInspectPlugin after the failure, and the OS you're using.

I'm slowly cooking up support for multiple providers in #93, currently working with OpenAI-compatible APIs, the Copilot endpoint, and Ollama. I've just added LM Studio and it seems to work in that branch as well.
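
A minimal sketch of the setup described above, assuming LM Studio's local server on its default port 1234 and the pre-#93 single-endpoint config (the dummy key is an assumption; LM Studio does not validate it):

    require('gp').setup {
      -- LM Studio's local server speaks the OpenAI chat-completions schema
      openai_api_endpoint = 'http://localhost:1234/v1/chat/completions',
      -- the plugin expects a key even though LM Studio ignores it
      openai_api_key = 'dummy',
    }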

@johnallen3d

Ok, @Robitx, I tried this again and it's working fine. 🤦‍♂️ Sorry for the false report, I'll edit my comment to clarify that this was my mistake.

Thanks for your work on this! Looking forward to #93!

@johnallen3d

Another potential backend for you to consider, @Robitx:

https://github.com/TabbyML/tabby

It looks like they have, or are working towards, an OpenAI-compatible API. The nice thing here is that you could use tabby serve to serve up model(s) both for inline Tabby completions (à la Copilot) and for gp.nvim-style chats etc.

@helins commented Feb 15, 2024

I have tried it with Ollama and it works perfectly. It seems that all that is really needed to turn this plugin into a multi-provider client is the ability to override the default API key and endpoint on a per-agent basis. It would make it even more awesome than it already is.

Right now it is possible to write a hook that does this, but it wouldn't play nicely with the agent commands: one would have to effectively wrap them so that the key and endpoint change happens along with the actual agent change, roughly as in the sketch below.
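
A rough illustration of that wrapping. Everything here is hypothetical glue, not gp.nvim API: it assumes the config table exposed by the plugin can be mutated after setup, and that :GpAgent accepts an agent name as an argument.

    -- hypothetical helper: swap key and endpoint together with the agent
    local function switch_backend(opts)
      local gp = require('gp')
      -- assumption: mutating the live config affects subsequent requests
      gp.config.openai_api_key = opts.key
      gp.config.openai_api_endpoint = opts.endpoint
      vim.cmd('GpAgent ' .. opts.agent) -- the actual agent change
    end

    switch_backend {
      agent = 'OpenChat3-5',
      key = 'dummy',
      endpoint = 'http://127.0.0.1:8000/v1/chat/completions',
    }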

@bmikaili

So if I were using Ollama, I'd just set the OpenAI API endpoint to something like http://localhost:11434/api/generate, right?
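
Worth noting: Ollama's native /api/generate route does not speak the OpenAI schema; its OpenAI-compatible route is /v1/chat/completions. A minimal sketch, assuming the default Ollama port:

    require('gp').setup {
      -- Ollama's OpenAI-compatible endpoint, not the native /api/generate
      openai_api_endpoint = 'http://localhost:11434/v1/chat/completions',
      openai_api_key = 'dummy', -- Ollama does not check the key
    }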

@joshmedeski

@Robitx what would need to be done to support any OpenAI-compatible alternative? I'm guessing exposing an open_ai_url at the agent level (defaulting to OpenAI's API) so you could set it on a per-agent basis (even mix solutions if you want).

I'd be happy to help, I'm really enjoying using this plugin and would love to experiment with open-source models!

@teto (Collaborator) commented Mar 30, 2024

@joshmedeski it's being done in #93

@primeapple

Feel free to close, it's merged.
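
For reference, a sketch of what the merged multi-provider configuration looks like, with field names taken from the gp.nvim README around the time of the merge (treat the exact shape as subject to change):

    require('gp').setup {
      -- after #93, providers are configured separately from agents
      providers = {
        openai = {
          endpoint = 'https://api.openai.com/v1/chat/completions',
          secret = os.getenv('OPENAI_API_KEY'),
        },
        ollama = {
          endpoint = 'http://localhost:11434/v1/chat/completions',
        },
      },
    }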

Robitx closed this as completed Jul 11, 2024