Improve local LLM workflow (no more environment variables) #419

Closed · cpacker opened this issue Nov 10, 2023 · 0 comments · Fixed by #422
Labels: enhancement (New feature or request)
cpacker commented Nov 10, 2023

Current setup

If a user wants to run dolphin on LM Studio with the airoboros wrapper:

export OPENAI_API_BASE=http://127.0.0.1:1234
export BACKEND_TYPE=lmstudio
memgpt run --model airoboros_xxx

Config (when using a local model)

  • model is "local", or can be "airoboros_xxx" in which case model == wrapper
  • model_endpoint stores the IP from OPENAI_API_BASE
[defaults]
model = local
model_endpoint = http://localhost:1234

Proposed setup (with memgpt run)

  • User does not specify any ENV variables; it's all in the config
  • Add a --wrapper arg and config variable (see the CLI sketch below)

If a user wants to run dolphin on LM Studio with the airoboros wrapper:

memgpt run --wrapper airoboros_xxx --endpoint http://localhost:1234 --endpoint_type lmstudio
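
A rough sketch of how the proposed flags could be wired up, assuming a typer-style CLI (illustrative only, not the actual memgpt run implementation; only the flag names come from this issue):

from typing import Optional
import typer

app = typer.Typer()

@app.command()
def run(
    model: Optional[str] = typer.Option(None, "--model", help="Only needed for Ollama (real model name)"),
    wrapper: Optional[str] = typer.Option(None, "--wrapper", help="Prompt formatter, e.g. airoboros_xxx"),
    endpoint: Optional[str] = typer.Option(None, "--endpoint", help="Base URL of the local backend"),
    endpoint_type: Optional[str] = typer.Option(None, "--endpoint_type", help="lmstudio, ollama, ..."),
):
    # CLI flags override whatever is stored in the config; no ENV variables involved.
    typer.echo(f"model={model} wrapper={wrapper} endpoint={endpoint} endpoint_type={endpoint_type}")

if __name__ == "__main__":
    app()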

For almost all backends, it's OK for the model to be unspecified, because what model is running is determined by the backend. The only exception to this is Ollama, which requires you to pass the model name in the POST request. This is already a special case in our documentation: https://memgpt.readthedocs.io/en/latest/ollama/ (currently, we ask the user to set an additional environment variable).

Special Ollama case:

memgpt run --model dolphin_xxx --wrapper airoboros_xxx  --endpoint http://localhost:11434 --endpoint_type ollama
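
For reference, the --model value is needed because Ollama expects the model name in the request body itself. A minimal illustration (the model name is reused from the config example below; the prompt is a placeholder):

import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "dolphin-2.2.1-mistral7b",  # Ollama will not infer this; it must be in the POST body
        "prompt": "Hello",
        "stream": False,
    },
)
print(resp.json()["response"])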

Proposed setup (with memgpt configure, then memgpt run)

  • If the user says no to OpenAI, no to Azure, then:
    • Ask for their endpoint type (lmstudio, ollama, etc)
    • Ask for their endpoint IP
      • We should do input checking / sanitization on the IP they provide (missing http:// prefix? hanging /v1/?); see the sanitization sketch after this list
    • Ask what prompt formatter / wrapper they want to use
      • IMO we should hide this and just use the default wrapper, but allow overriding with memgpt run --wrapper
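
A rough sketch of the kind of input sanitization mentioned above (hypothetical helper, not existing MemGPT code):

def clean_endpoint(raw: str) -> str:
    # Normalize whatever the user types in during memgpt configure.
    endpoint = raw.strip()
    if not endpoint.startswith(("http://", "https://")):
        endpoint = "http://" + endpoint          # assume a bare IP:port means http
    endpoint = endpoint.rstrip("/")              # drop trailing slash
    if endpoint.endswith("/v1"):
        endpoint = endpoint[: -len("/v1")]       # drop hanging /v1 (we append paths ourselves)
    return endpoint

assert clean_endpoint("127.0.0.1:1234/v1/") == "http://127.0.0.1:1234"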

Config (when using a local model)

[defaults]
model = optional for non-Ollama backends (default None); for Ollama this is the real model name (e.g. dolphin-2.2.1-mistral7b)
model_endpoint = http://localhost:1234
model_endpoint_type = lmstudio

lmstudio:

[defaults]
model = None
wrapper = None
model_endpoint = http://localhost:1234
model_endpoint_type = lmstudio

ollama:

[defaults]
model = dolphin-2.2.1-mistral7b
wrapper = None
model_endpoint = http://localhost:11434
model_endpoint_type = ollama
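
Since the proposed config is plain INI, the Ollama-only model requirement could be enforced with something as simple as the following sketch (configparser from the stdlib; the config path and helper name are assumptions, not MemGPT's actual code):

import configparser
import os

def load_defaults(path="~/.memgpt/config"):  # path is illustrative
    parser = configparser.ConfigParser()
    parser.read(os.path.expanduser(path))
    defaults = dict(parser["defaults"])
    # Only Ollama requires a real model name; other backends may leave model unset.
    if defaults.get("model_endpoint_type") == "ollama" and defaults.get("model") in ("", "None", None):
        raise ValueError("Ollama needs a real model name, e.g. dolphin-2.2.1-mistral7b")
    return defaults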

Special case where the user wants to use OpenAI, but swap the endpoint to a proxy

export OPENAI_API_BASE="<proxy_address>"
memgpt run

Config

We do NOT set model_endpoint to this proxy address; instead, we let openai-python handle it for us (on our end we act like nothing changed, it's just OpenAI):

[defaults]
model = gpt4 / ...
model_endpoint = n/a
model_endpoint_type = n/a
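
For reference, the openai-python 0.x client resolves OPENAI_API_BASE on its own, which is why model_endpoint can stay n/a here; a minimal illustration:

import openai  # openai-python 0.x

# The library reads OPENAI_API_BASE at import time, so exporting the proxy address
# before `memgpt run` is enough; no MemGPT config changes required.
print(openai.api_base)  # https://api.openai.com/v1 by default, or the exported proxy address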