
Tabby VSCode Extension: Autostart Tabby Server #624

Open
matthiasgeihs opened this issue Oct 24, 2023 · 15 comments
Labels
enhancement New feature or request

Comments

@matthiasgeihs

matthiasgeihs commented Oct 24, 2023

Context
I use the Tabby VSCode extension with a local Tabby server.
Currently, when I start VSCode and the Tabby server is not running, the extension reminds me of that via the yellow extension icon in the status bar.
In that case, I open a terminal and start the Tabby server manually, and then the extension is happy and works as expected.
When I close VSCode and no longer need the server, I go to the terminal window and shut down the Tabby server manually.

What I would like:
It would be nice if the VSCode extension started the Tabby server automatically when it detects that the server is not running.
Additionally, when I close VSCode, it should shut down the server if no other applications that rely on it are still running.


Please reply with a 👍 if you want this feature.

@matthiasgeihs matthiasgeihs added the enhancement New feature or request label Oct 24, 2023
@matthiasgeihs
Author

Implementation idea
An implementation of the above functionality could be realized as follows.

Create a background service that runs at all times with minimal resource requirements. The service receives all Tabby requests; if no Tabby inference server is running, the service boots one up. The service periodically checks whether there have been any inference requests within a specified time interval. If there has been no activity, it shuts down the inference server and releases its resources.
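
As a rough sketch, the lifecycle part of such a service could look like the following Python (hypothetical: the tabby serve command line, model name, and timeout are placeholders, and request forwarding is omitted):

# Hypothetical supervisor lifecycle: start `tabby serve` on demand,
# stop it after a period with no inference requests.
import subprocess
import threading
import time

IDLE_TIMEOUT = 30 * 60  # seconds of inactivity before shutdown (assumed value)

class TabbySupervisor:
    def __init__(self, cmd=("tabby", "serve", "--model", "StarCoder-1B")):
        self.cmd = list(cmd)        # assumed server command; adjust to your setup
        self.proc = None
        self.last_request = time.time()
        self.lock = threading.Lock()

    def on_request(self):
        # Called for every incoming inference request; boots the server if needed.
        with self.lock:
            self.last_request = time.time()
            if self.proc is None or self.proc.poll() is not None:
                self.proc = subprocess.Popen(self.cmd)

    def watchdog(self):
        # Periodically stop the server once it has been idle for too long.
        while True:
            time.sleep(60)
            with self.lock:
                idle = time.time() - self.last_request
                if self.proc and self.proc.poll() is None and idle > IDLE_TIMEOUT:
                    self.proc.terminate()
                    self.proc.wait()
                    self.proc = None

supervisor = TabbySupervisor()
threading.Thread(target=supervisor.watchdog, daemon=True).start()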

@wsxiaoys
Member

I believe this problem is specific to Tabby's local deployment scenario. For remote deployment, it is automatically managed by cloud vendors through auto-scaling.

A potential solution to this issue could involve a straightforward Python script utilizing asgi_proxy. The script would create a new Tabby process whenever a request is made. After a certain period of inactivity, such as half an hour without any requests, the script would terminate the process.

This script could be deployed as a system daemon or in a similar manner.
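
For illustration, the wiring could look roughly like this (a sketch, not the actual experimental script; it assumes asgi-proxy-lib's asgi_proxy helper, uvicorn, and a placeholder tabby serve command):

# Sketch of an on-demand proxy: spawn Tabby on the first request,
# forward traffic to it, and terminate it after 30 minutes of inactivity.
import asyncio
import subprocess
import threading
import time

import uvicorn
from asgi_proxy import asgi_proxy  # assumed helper from asgi-proxy-lib

TABBY_CMD = ["tabby", "serve", "--model", "StarCoder-1B"]  # placeholder command
UPSTREAM = "http://localhost:8080"  # where the spawned Tabby server listens
IDLE_TIMEOUT = 30 * 60

proxy = asgi_proxy(UPSTREAM)
state = {"proc": None, "last": time.time()}

async def app(scope, receive, send):
    # Start Tabby lazily on demand, then forward the request to it.
    if scope["type"] != "http":  # ignore non-HTTP (e.g. lifespan) events
        return
    state["last"] = time.time()
    if state["proc"] is None or state["proc"].poll() is not None:
        state["proc"] = subprocess.Popen(TABBY_CMD)
        await asyncio.sleep(5)  # crude wait for the server to come up
    await proxy(scope, receive, send)

def reaper():
    # Terminate the Tabby process after the idle timeout elapses.
    while True:
        time.sleep(60)
        proc = state["proc"]
        if proc and proc.poll() is None and time.time() - state["last"] > IDLE_TIMEOUT:
            proc.terminate()
            state["proc"] = None

threading.Thread(target=reaper, daemon=True).start()
uvicorn.run(app, host="0.0.0.0", port=8081)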

@wsxiaoys
Member

Added a quick and basic implementation at https://github.com/TabbyML/tabby/pull/630/files.

@matthiasgeihs
Author

Added a quick and basic implementation at https://github.com/TabbyML/tabby/pull/630/files.

Cool, works for me. I added a few more options and changed the default startup behavior (the server is not started until an incoming inference request arrives).
Compare: add-tabby-supervisor...matthiasgeihs:tabby:add-tabby-supervisor

@itlackey

This is a cool idea!
I use specific switches to launch the Tabby container. It would be great if I could specify the startup command, or maybe a path to a Docker Compose file, in the Tabby configuration file.

@wsxiaoys
Member

This is a cool idea! I use specific switches to launch the Tabby container. It would be great if I could specify the startup command, or maybe a path to a Docker Compose file, in the Tabby configuration file.

Should be easy to hack around https://github.com/TabbyML/tabby/blob/main/experimental/supervisor/app.py by replacing the start/stop commands with docker-compose up and docker-compose down.
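
For example, the hooks might be swapped roughly like this (a sketch under the assumption that the script shells out for start/stop; the compose file path is a placeholder and the actual structure of app.py may differ):

# Hypothetical start/stop hooks that drive Docker Compose instead of `tabby serve`.
import subprocess

COMPOSE_FILE = "/opt/tabby/docker-compose.yml"  # placeholder path

def start_tabby():
    # Bring the Tabby container(s) up in the background.
    subprocess.run(["docker-compose", "-f", COMPOSE_FILE, "up", "-d"], check=True)

def stop_tabby():
    # Tear the Tabby container(s) down to free GPU memory.
    subprocess.run(["docker-compose", "-f", COMPOSE_FILE, "down"], check=True)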

@limdingwen

Is it possible or desirable to bundle the Tabby server into the VSCode extension for simple local usage?

@wsxiaoys
Member

Is it possible or desirable to bundle the Tabby server into the VSCode extension for simple local usage?

The only platform where bundling makes sense is probably the Apple M-series. However, given how easily one can install Tabby with Homebrew, I feel it doesn't add value to bundle it.

@bubundas17

I cancelled my Copilot subscription and am using Tabby full time.
This would be a really useful feature.
Ollama does this by default, and very well.
I don't code the whole day, and when playing games I need to stop the Tabby Docker container manually. It would be very helpful if Tabby automatically offloaded models when not in use, like Ollama does.

Or, if possible, can we offload the model interface to the Ollama API?

@wsxiaoys
Member

Or, if possible, can we offload the model interface to the Ollama API?

Yes - this has been supported since 0.12; see https://tabby.tabbyml.com/docs/administration/model/#ollama for a configuration example.

@bubundas17

Or, if possible, can we offload the model interface to the Ollama API?

Yes - this has been supported since 0.12; see https://tabby.tabbyml.com/docs/administration/model/#ollama for a configuration example.

Chat still does not support the Ollama HTTP API.

My config:

[model.completion.http]
kind = "ollama/completion"
model_name = "codestral:22b-v0.1-q6_K"
api_endpoint = "http://10.66.66.3:11434"
prompt_template = "[SUFFIX]{suffix}[PREFIX]{prefix}"  # Example prompt template for CodeLlama model series.

[model.chat.http]
kind = "ollama/completion"
model_name = "codestral:22b-v0.1-q6_K"
api_endpoint = "http://10.66.66.3:11434"
#api_key = "secret-api-key"

[screenshot]

@wsxiaoys
Member

Have you tried https://tabby.tabbyml.com/docs/administration/model/#openaichat ?

@bubundas17

bubundas17 commented Jul 16, 2024

Have you tried https://tabby.tabbyml.com/docs/administration/model/#openaichat ?

Is the API the same for OpenAI and local Ollama?
Let me check real quick.

@bubundas17

[screenshot]

Tabby started, but the chat is not working with Ollama. My current config is:

[model.completion.http]
kind = "ollama/completion"
model_name = "codestral:22b-v0.1-q6_K"
api_endpoint = "http://10.66.66.3:11434"
prompt_template = "[SUFFIX]{suffix}[PREFIX]{prefix}"  # Example prompt template for CodeLlama model series.

[model.chat.http]
kind = "openai/chat"
model_name = "codestral:22b-v0.1-q6_K"
api_endpoint = "http://10.66.66.3:11434"

@Krakonos

Hi!

I have a similar request, and the proposed solution using a proxy script isn't really a good one (not for me, not for anyone). The main reason is that it's external to the tool and depends on the deployment method. I have no idea how I would deploy that in Kubernetes (and it's surprisingly hard to autoscale to zero there). With docker/docker-compose, this essentially means the script has to run with root privileges or very close to them (since it has to invoke docker compose up at the very least).

Offloading to Ollama seems like a good idea, and the functionality is already there. But given that this is a plugin config, I won't be able to use advanced functionality like analyzing repos for context (which I really want to try out).

My goal is to run this on a home server that currently has 2x P40. While I could have one card run Tabby all the time and leave the other for everything else, a card with a model loaded consumes 40 W more, producing unnecessary heat and expense.

Is there any interest at all in making this part of the core project? It would also meet the above-mentioned requirements for VSCode, since the server could run all the time but consume little to no resources (after the model unloads, there's only the server waiting for connections, which I believe would be acceptable). Perhaps only laptop users might want to kill the server completely, but for that use case the script might be an acceptable solution, since it's a limited single-user use case.

I'd be happy to try and implement a feature like that, provided the devs could point me to a sensible place to implement it.
