[Doc] Fix getting started to use publicly available model #3963
Conversation
Previous model `meta/llama-2` required special access to the model card on Hugging Face, leading to an exception when starting the reference example. The new model proposed for the getting started, `mistralai/Mistral-7B-Instruct-v0.2`, is publicly available and runs on commodity hardware. Also updates a dead link.
@@ -4,7 +4,7 @@ vLLM provides an HTTP server that implements OpenAI's [Completions](https://plat

You can start the server using Python, or using [Docker](deploying_with_docker.rst):
```bash
-python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-hf --dtype float32 --api-key token-abc123
+python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.2 --dtype auto --api-key token-abc123
```
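For context on what the `--api-key` flag protects, here is a minimal sketch of the request a client would send to the server started above. The model name and API key mirror the quickstart command; the prompt, endpoint path, and helper name are assumptions for illustration only.

```python
import json

# Values taken from the quickstart command above.
API_KEY = "token-abc123"
MODEL = "mistralai/Mistral-7B-Instruct-v0.2"

def build_chat_request(prompt):
    """Return (headers, body) for a POST to /v1/chat/completions."""
    headers = {
        "Content-Type": "application/json",
        # --api-key makes the server require this bearer token on every request.
        "Authorization": f"Bearer {API_KEY}",
    }
    body = json.dumps({
        "model": MODEL,  # must match the --model the server was started with
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

headers, body = build_chat_request("Hello!")
print(headers["Authorization"])  # → Bearer token-abc123
```

Requests without the matching `Authorization` header are rejected by the server, which is why keeping the flag in the quickstart matters.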
nit: Let's remove --api-key
?
I suggest keeping the api key arg. It's a low-overhead way to ensure basic access control, especially given the price of GPUs. Having the api key in the QuickStart promotes security best practices (so many quick-start configs end up being the production ones..) let's make this one right :)
```python
messages=[
    {"role": "system", "content": "You are a helpful assistant."},
```
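The hunk above touches the `messages` list in the client example. As a hedged sketch, the full list it belongs to looks roughly like this (the user turn is an assumption for illustration):

```python
# The system message steers the assistant; user turns follow. Note that
# some instruct models' chat templates do not accept a system role, so
# check the model's template if the server rejects the request.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

roles = [m["role"] for m in messages]
print(roles)  # → ['system', 'user']
```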
nice catch