[Doc] Add notice about breaking changes to VLMs (vllm-project#5818)
DarkLight1337 authored and jimpang committed Jul 24, 2024
1 parent 9f93979 commit 5e6b49c
Showing 1 changed file with 13 additions and 0 deletions.
docs/source/models/vlm.rst (+13, -0)
@@ -5,6 +5,9 @@ Using VLMs

vLLM provides experimental support for Vision Language Models (VLMs). This document shows you how to run and serve these models using vLLM.

.. important::
   We are actively iterating on VLM support. Expect breaking changes to VLM usage and development in upcoming releases without prior deprecation.

Engine Arguments
----------------

@@ -39,6 +42,10 @@ To initialize a VLM, the aforementioned arguments must be passed to the ``LLM``
    image_feature_size=576,
)
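
For reference, a minimal sketch of the full constructor call that this fragment belongs to; everything except ``image_feature_size=576`` is reconstructed from context and may differ in your release:

.. code-block:: python

   from vllm import LLM

   # Hedged sketch: the vision-specific arguments below are experimental and,
   # per the notice in this commit, slated for removal once they can be
   # inferred from the HuggingFace configuration.
   llm = LLM(
       model="llava-hf/llava-1.5-7b-hf",
       image_input_type="pixel_values",  # assumption: reconstructed from context
       image_token_id=32000,             # assumption: LLaVA's <image> token id
       image_input_shape="1,3,336,336",  # assumption: CLIP ViT-L/14-336px input
       image_feature_size=576,           # shown in the diff above
   )
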
.. important::
   We will remove most of the vision-specific arguments in a future release as they can be inferred from the HuggingFace configuration.


To pass an image to the model, note the following in :class:`vllm.inputs.PromptStrictInputs`:

* ``prompt``: The prompt should have a number of ``<image>`` tokens equal to ``image_feature_size``.
@@ -63,6 +70,9 @@ To pass an image to the model, note the following in :class:`vllm.inputs.PromptStrictInputs`:
A code example can be found in `examples/llava_example.py <https://github.com/vllm-project/vllm/blob/main/examples/llava_example.py>`_.
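
As a quick illustration of the ``prompt`` requirement above, here is a minimal sketch; the ``USER:``/``ASSISTANT:`` framing is an assumed LLaVA-1.5 chat format, not taken from this diff:

.. code-block:: python

   # The prompt must contain exactly ``image_feature_size`` ``<image>`` tokens.
   image_feature_size = 576
   placeholders = "<image>" * image_feature_size
   prompt = f"USER: {placeholders}\nWhat is shown in this image?\nASSISTANT:"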

.. important::
   We will remove the need to format image tokens in a future release. Afterwards, the input text will follow the same format as that for the original HuggingFace model.

Online OpenAI Vision API Compatible Inference
----------------------------------------------

@@ -89,6 +99,9 @@ Below is an example on how to launch the same ``llava-hf/llava-1.5-7b-hf`` with
    --image-feature-size 576 \
    --chat-template template_llava.jinja
.. important::
   We will remove most of the vision-specific arguments in a future release as they can be inferred from the HuggingFace configuration.

To consume the server, you can use the OpenAI client as in the example below:

.. code-block:: python
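
   # The body of this example is collapsed in the diff view; what follows is
   # a hedged sketch of typical OpenAI-client usage against a vLLM server,
   # not the verbatim file content. The image URL is an illustrative
   # assumption; the model name matches the launch command above.
   from openai import OpenAI

   # vLLM's OpenAI-compatible server listens on port 8000 by default and,
   # unless launched with an API key, accepts any placeholder key.
   client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

   chat_response = client.chat.completions.create(
       model="llava-hf/llava-1.5-7b-hf",
       messages=[{
           "role": "user",
           "content": [
               {"type": "text", "text": "What's in this image?"},
               {
                   "type": "image_url",
                   "image_url": {"url": "https://example.com/image.jpg"},
               },
           ],
       }],
   )
   print("Chat completion output:", chat_response.choices[0].message.content)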
