From 76a120c0247e9ed79d027206ee37080d9b63d357 Mon Sep 17 00:00:00 2001
From: Roger Wang
Date: Fri, 5 Jul 2024 21:02:29 -0700
Subject: [PATCH 1/4] update doc

---
 docs/source/models/supported_models.rst | 31 ++++++++++++++++++++++++-
 docs/source/models/vlm.rst              |  3 ++-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index 0283f36ea52b8..ef5c9d61bf84a 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -7,6 +7,8 @@ vLLM supports a variety of generative Transformer models in `HuggingFace Transfo
 The following is the list of model architectures that are currently supported by vLLM.
 Alongside each architecture, we include some popular models that use it.
 
+Decoder-only Language Models
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 .. list-table::
   :widths: 25 25 50 5
   :header-rows: 1
@@ -173,6 +175,32 @@ Alongside each architecture, we include some popular models that use it.
     -
 
+.. _supported_vlms:
+
+Vision Language Models
+^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+  :widths: 25 25 50 5
+  :header-rows: 1
+
+  * - Architecture
+    - Models
+    - Example HuggingFace Models
+    - :ref:`LoRA `
+  * - :code:`LlavaForConditionalGeneration`
+    - LLaVA-1.5
+    - :code:`llava-hf/llava-1.5-7b-hf`, :code:`llava-hf/llava-1.5-13b-hf`, etc.
+    -
+  * - :code:`LlavaNextForConditionalGeneration`
+    - LLaVA-NeXT
+    - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
+    -
+  * - :code:`Phi3VForCausalLM`
+    - Phi-3-Vision
+    - :code:`microsoft/Phi-3-vision-128k-instruct`, etc.
+    -
+
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
 Otherwise, please refer to :ref:`Adding a New Model ` for instructions on how to implement support for your model.
 Alternatively, you can raise an issue on our `GitHub `_ project.
@@ -210,8 +238,9 @@ Alternatively, you can raise an issue on our `GitHub `.
diff --git a/docs/source/models/vlm.rst b/docs/source/models/vlm.rst
--- a/docs/source/models/vlm.rst
+++ b/docs/source/models/vlm.rst
+This document shows you how to run and serve these models using vLLM.
 
 .. important::
     We are actively iterating on VLM support. Expect breaking changes to VLM usage and development in upcoming releases without prior deprecation.

From 99f9bf9f59f350f54d2298309d3f2e00442cc69e Mon Sep 17 00:00:00 2001
From: Roger Wang
Date: Fri, 5 Jul 2024 21:07:07 -0700
Subject: [PATCH 2/4] update

---
 docs/source/models/supported_models.rst | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index ef5c9d61bf84a..cce41d12813a8 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -97,14 +97,6 @@ Decoder-only Language Models
     - LLaMA, Llama 2, Meta Llama 3, Vicuna, Alpaca, Yi
     - :code:`meta-llama/Meta-Llama-3-8B-Instruct`, :code:`meta-llama/Meta-Llama-3-70B-Instruct`, :code:`meta-llama/Llama-2-13b-hf`, :code:`meta-llama/Llama-2-70b-hf`, :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`01-ai/Yi-6B`, :code:`01-ai/Yi-34B`, etc.
     - ✅︎
-  * - :code:`LlavaForConditionalGeneration`
-    - LLaVA-1.5
-    - :code:`llava-hf/llava-1.5-7b-hf`, :code:`llava-hf/llava-1.5-13b-hf`, etc.
-    -
-  * - :code:`LlavaNextForConditionalGeneration`
-    - LLaVA-NeXT
-    - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
-    -
   * - :code:`MiniCPMForCausalLM`
     - MiniCPM
     - :code:`openbmb/MiniCPM-2B-sft-bf16`, :code:`openbmb/MiniCPM-2B-dpo-bf16`, etc.
@@ -145,10 +137,6 @@
     - Phi-3-Small
     - :code:`microsoft/Phi-3-small-8k-instruct`, :code:`microsoft/Phi-3-small-128k-instruct`, etc.
     -
-  * - :code:`Phi3VForCausalLM`
-    - Phi-3-Vision
-    - :code:`microsoft/Phi-3-vision-128k-instruct`, etc.
-    -
   * - :code:`QWenLMHeadModel`
     - Qwen
     - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.
From 4360d4fe651c6e6dbcde7416b1885d733d758397 Mon Sep 17 00:00:00 2001
From: Roger Wang
Date: Fri, 5 Jul 2024 21:25:30 -0700
Subject: [PATCH 3/4] move note

---
 docs/source/models/supported_models.rst | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index cce41d12813a8..767d9316e9c14 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -162,6 +162,8 @@ Decoder-only Language Models
     - :code:`xverse/XVERSE-7B-Chat`, :code:`xverse/XVERSE-13B-Chat`, :code:`xverse/XVERSE-65B-Chat`, etc.
     -
 
+.. note::
+    Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
 
 .. _supported_vlms:
 
@@ -193,9 +195,6 @@ Vision Language Models
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
 Otherwise, please refer to :ref:`Adding a New Model ` for instructions on how to implement support for your model.
 Alternatively, you can raise an issue on our `GitHub `_ project.
 
-.. note::
-    Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
-
 .. tip::
     The easiest way to check if your model is supported is to run the program below:

From b343f8d8953d0ebcd1cc9b8a5549e9209625a679 Mon Sep 17 00:00:00 2001
From: Roger Wang
Date: Fri, 5 Jul 2024 21:35:08 -0700
Subject: [PATCH 4/4] update doc

---
 docs/source/models/supported_models.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index 767d9316e9c14..f5511580d1957 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -192,7 +192,8 @@ Vision Language Models
     -
 
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
-Otherwise, please refer to :ref:`Adding a New Model ` for instructions on how to implement support for your model.
+Otherwise, please refer to :ref:`Adding a New Model ` and :ref:`Adding a New Multimodal Model `
+for instructions on how to implement support for your model.
 Alternatively, you can raise an issue on our `GitHub `_ project.
 
 .. tip::
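Taken together, the four patches leave the docs keyed on HuggingFace architecture names: a model is supported when one of the :code:`architectures` entries in its :code:`config.json` appears in the tables. As a rough sketch of that lookup (the helper function and the hard-coded architecture subset below are illustrative assumptions, not part of vLLM; the names themselves come from the table added in PATCH 1/4):

```python
import json

# Illustrative subset of the "Vision Language Models" table from PATCH 1/4;
# consult the rendered docs for the authoritative, full list.
SUPPORTED_VLM_ARCHS = {
    "LlavaForConditionalGeneration",
    "LlavaNextForConditionalGeneration",
    "Phi3VForCausalLM",
}


def is_supported_vlm(config_json: str) -> bool:
    """Return True if any entry in the config's "architectures"
    list appears in the supported-VLM table above."""
    config = json.loads(config_json)
    return any(arch in SUPPORTED_VLM_ARCHS for arch in config.get("architectures", []))


# A LLaVA-1.5 style config declares LlavaForConditionalGeneration:
print(is_supported_vlm('{"architectures": ["LlavaForConditionalGeneration"]}'))  # True
```

This mirrors how a reader would use the reorganized tables by hand: look up the model's declared architecture rather than its repository name.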