[Frontend] Support complex message content for chat completions endpoint #3467

fgreinacher · 2024-03-18T16:35:30Z

The vLLM OpenAI server currently does not support complex message contents for the chat completions endpoint:

    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What’s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          }
        ]
      }
    ]

Fully supporting this format seems non-trivial, if not impossible, because it depends strongly on the active model.

What we can easily support, though, are simple cases where the content consists of a single text part:

    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is LiteLLM?"
          }
        ]
      }
    ]

This is not very useful on its own, but it helps when a client library sends complex content by default, even where a plain string would do, for example https://github.com/OkGoDoIt/OpenAI-API-dotnet.
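
As a rough illustration of the simple case above, the following minimal sketch unwraps a content list holding exactly one text part into a plain string. The helper name `unwrap_single_text` is hypothetical and not taken from the vLLM code base; it assumes only the message shape shown in the examples.

```python
from typing import Any, Dict, List, Union

# Message content is either a plain string or a list of typed parts.
Content = Union[str, List[Dict[str, Any]]]


def unwrap_single_text(content: Content) -> str:
    """Hypothetical helper: return the text of a string or a one-element text list."""
    if isinstance(content, str):
        return content
    if len(content) == 1 and content[0].get("type") == "text":
        return content[0]["text"]
    # Anything else (several parts, image_url, ...) is out of scope for this sketch.
    raise ValueError("only a single text part is supported in this sketch")


message = {
    "role": "user",
    "content": [{"type": "text", "text": "What is LiteLLM?"}],
}
# Rebuild the message with its content replaced by the unwrapped string.
normalized = {**message, "content": unwrap_single_text(message["content"])}
```

With this, `normalized` has the same shape as a message whose content was sent as a plain string, which is what a chat template expecting string content would need.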

⚒️ with ❤️ by Siemens

fgreinacher · 2024-03-19T10:30:24Z

This is now ready for review.

There is a failure in the Kernels test, but it does not seem to be related to my changes.

LiuXiaoxuanPKU

Thanks for the contribution! Just some minor comments.

tests/entrypoints/test_openai_server.py

vllm/entrypoints/openai/protocol.py

simon-mo · 2024-03-28T17:22:07Z

nice! i didn't even know this schema existed in the first place

simon-mo

i actually think the proper fix would be to support this in serving_completions directly, unwrapping the text if a complex schema is provided. we just added llava support and i think someone can add images as a follow-up.

tests/entrypoints/test_openai_server.py

vllm/entrypoints/openai/protocol.py

fgreinacher · 2024-04-02T05:36:40Z

@simon-mo I had the same idea at first, but then I realized that this is not supported by the transformers library (https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/serving_chat.py#L55-L58). My thinking was that passing the complex format on therefore brings little benefit and just weakens typing. Full support could certainly be implemented, but it would take considerable effort. Maybe we can start with the simple solution and evolve as needed?

fgreinacher · 2024-04-23T06:23:23Z

vllm/entrypoints/openai/protocol.py

+    @computed_field
+    @property
+    def normalized_messages(self) -> List[Dict[str, str]]:
+        return [{
+            key: value if isinstance(value, str) else value[0].text
+            for key, value in message.items()
+        } for message in self.messages]


I decided to keep this here so that we can benefit from Pydantic convenience. Let me know if you prefer to have it somewhere else.

DarkLight1337 · 2024-04-25T07:05:30Z

Just a heads-up that #4355 uses the official type definitions from the openai Python library. This ensures consistency with using openai.Client to access the server. I think there is no need to maintain our own type definitions for the message inputs.

fgreinacher · 2024-04-25T07:23:28Z

Oh, that's very nice @DarkLight1337. Also great that you already provide a placeholder for the changes in this MR :)

I'll put this MR back to Draft and will refactor once #4355 is merged.

fgreinacher · 2024-04-29T11:10:11Z

I have rebased and adapted the code, making this one much smaller, thanks @DarkLight1337 👍

@simon-mo This is ready for another round I'd say. The Neuron test is failing, but I don't think it's related to the changes in this PR.

DarkLight1337 · 2024-04-29T11:15:01Z

Glad to help! I think you can take more parts from #4200 and enable multiple text inputs per message (concatenating them with a newline). That way, we get to test out this functionality a bit more before extending it to images.
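
The newline concatenation suggested here could look roughly like the sketch below. The function name `join_text_parts` and the choice to raise on non-text parts are assumptions made for illustration; this is not the code from #4200 or from this PR.

```python
from typing import Any, Dict, List, Union


def join_text_parts(content: Union[str, List[Dict[str, Any]]]) -> str:
    """Hypothetical helper: merge all text parts of a message into one string."""
    if isinstance(content, str):
        return content
    texts = []
    for part in content:
        # Assumption: only "text" parts are handled; images are left for a follow-up.
        if part.get("type") != "text":
            raise NotImplementedError(
                f"unsupported content part type: {part.get('type')!r}")
        texts.append(part["text"])
    # Multiple text inputs in one message are concatenated with a newline.
    return "\n".join(texts)


parts = [
    {"type": "text", "text": "What is LiteLLM?"},
    {"type": "text", "text": "Answer briefly."},
]
joined = join_text_parts(parts)
```

Raising on unknown part types keeps unsupported inputs, such as `image_url`, from being dropped silently.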

Co-authored-by: Lily Liu, Cyrus Leung

fgreinacher · 2024-04-29T13:28:46Z

Wonderful, I took the relevant part and integrated it here 🙇

fgreinacher force-pushed the feat/complex-message-content branch 9 times, most recently from f538ecd to 46eaae8 on March 19, 2024 10:09

fgreinacher marked this pull request as ready for review March 19, 2024 10:22

LiuXiaoxuanPKU self-assigned this Mar 22, 2024

LiuXiaoxuanPKU approved these changes Mar 25, 2024

LiuXiaoxuanPKU reviewed Mar 25, 2024

fgreinacher requested a review from LiuXiaoxuanPKU March 25, 2024 08:09

fgreinacher force-pushed the feat/complex-message-content branch 5 times, most recently from abf69fc to a7d51e4 on March 28, 2024 07:00

simon-mo requested changes Mar 28, 2024

simon-mo self-assigned this Mar 28, 2024

LiuXiaoxuanPKU removed their request for review March 29, 2024 05:12

fgreinacher force-pushed the feat/complex-message-content branch 3 times, most recently from 319de06 to e38105f on April 2, 2024 05:32

fgreinacher requested a review from simon-mo April 3, 2024 09:47

simon-mo mentioned this pull request Apr 8, 2024

Propose an update to the structure of the 'message' protocol. #3907

Closed

hmellor mentioned this pull request Apr 20, 2024

[Bug]: Error with OpenAI server: API request failed with status code 400 #3906

Closed

fgreinacher force-pushed the feat/complex-message-content branch 3 times, most recently from be117f2 to f636f1a on April 21, 2024 14:10

fgreinacher requested a review from simon-mo April 23, 2024 06:21

fgreinacher commented Apr 23, 2024

fgreinacher force-pushed the feat/complex-message-content branch 2 times, most recently from c43d1ec to 905717e on April 23, 2024 06:41

DarkLight1337 mentioned this pull request Apr 25, 2024

[Bug]: Invalid inputs do not result in error #4339

Closed

fgreinacher marked this pull request as draft April 25, 2024 07:23

fgreinacher force-pushed the feat/complex-message-content branch 2 times, most recently from e785dec to 068b2a7 on April 29, 2024 10:40

feat: support complex message content for chat completions endpoint

2398646

fgreinacher force-pushed the feat/complex-message-content branch from 068b2a7 to 2398646 on April 29, 2024 12:51

fgreinacher marked this pull request as ready for review April 29, 2024 13:27

simon-mo approved these changes Apr 30, 2024

simon-mo merged commit a494140 into vllm-project:main Apr 30, 2024
45 of 48 checks passed

fgreinacher deleted the feat/complex-message-content branch May 2, 2024 08:53

dtrifiro mentioned this pull request May 15, 2024

bump ubi base image tag opendatahub-io/vllm#24

Merged
