[Doc] Fix getting started to use publicly available model #3963

Merged · merged 1 commit into vllm-project:main on Apr 10, 2024
Conversation

**fpaupier** (Contributor) commented:

The previous model, `meta-llama/Llama-2-7b-hf`, required special access to the model card on Hugging Face, leading to an exception when starting the reference example.

The access error with `meta-llama/Llama-2-7b-hf` was:

```
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Llama-2-7b-hf.
401 Client Error. (Request ID: Root=1-66163835-71dc8d373b3129360016da39;ee08d21c-507d-43e7-b66b-04e9b45563ca)

Cannot access gated repo for url https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/config.json.
Repo model meta-llama/Llama-2-7b-hf is gated. You must be authenticated to access it.
```
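Readers who do want the gated model can still use it by requesting access on its Hugging Face model card and authenticating locally first. A minimal sketch, assuming the `huggingface_hub` package is installed; the token value is a placeholder:

```python
# Hypothetical authentication step for gated repos such as
# meta-llama/Llama-2-7b-hf; not needed for the public Mistral model.
from huggingface_hub import login

# Replace the placeholder with a real access token from
# https://huggingface.co/settings/tokens (or run `huggingface-cli login`).
login(token="hf_your_token_here")
```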

The new model proposed for the getting started guide, `mistralai/Mistral-7B-Instruct-v0.2`, is publicly available and runs on commodity hardware. The getting started flow in the vLLM Quickstart now works.

It also updates a dead link.

````diff
@@ -4,7 +4,7 @@ vLLM provides an HTTP server that implements OpenAI's [Completions](https://plat

 You can start the server using Python, or using [Docker](deploying_with_docker.rst):
 ```bash
-python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-hf --dtype float32 --api-key token-abc123
+python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.2 --dtype auto --api-key token-abc123
````
**Collaborator** commented on the line above:

nit: Let's remove `--api-key`?

**fpaupier** (Author) replied:

I suggest keeping the API key arg. It's a low-overhead way to ensure basic access control, especially given the price of GPUs. Having the API key in the Quickstart promotes security best practices (so many quickstart configs end up being the production ones...). Let's make this one right :)
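To illustrate what the flag buys, here is a hedged sketch of probing the server started in the diff above, assuming it is running locally on the default port 8000 and that the `requests` package is installed:

```python
# Hypothetical probe: with --api-key set, vLLM's OpenAI-compatible routes
# should reject requests that lack the matching Bearer token.
import requests

resp = requests.get("http://localhost:8000/v1/models")
print(resp.status_code)  # expected: 401, no credentials supplied

resp = requests.get(
    "http://localhost:8000/v1/models",
    headers={"Authorization": "Bearer token-abc123"},  # matches --api-key
)
print(resp.status_code)  # expected: 200
```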

@simon-mo simon-mo enabled auto-merge (squash) April 10, 2024 18:01
```python
messages=[
    {"role": "system", "content": "You are a helpful assistant."},
```
**Collaborator** commented on the line above:
nice catch
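Putting the pieces together, a minimal sketch of the Quickstart client against the server started above, assuming the default local endpoint and the `openai` Python package:

```python
# Minimal end-to-end sketch; assumes the server is running locally on
# port 8000 and was started with --api-key token-abc123.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="token-abc123",               # must match the server's --api-key
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(completion.choices[0].message.content)
```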

@simon-mo simon-mo merged commit 92cd2e2 into vllm-project:main Apr 10, 2024
35 checks passed
SageMoore pushed a commit to neuralmagic/nm-vllm that referenced this pull request Apr 11, 2024
andy-neuma pushed a commit to neuralmagic/nm-vllm that referenced this pull request Apr 12, 2024
z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request Apr 22, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024