
[mypy] Enable following imports for entrypoints #7248

Merged
merged 107 commits into vllm-project:main from typing-entrypoints
Aug 21, 2024

Conversation

DarkLight1337
Member

@DarkLight1337 DarkLight1337 commented Aug 7, 2024

This PR enables follow-imports=silent for vllm.entrypoints. It also cleans up some of the type annotations related to tokenizer (Union[PreTrainedTokenizer, PreTrainedTokenizerFast] is now replaced with AnyTokenizer).
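The tokenizer cleanup can be pictured as follows (a sketch only: the stub classes below stand in for the transformers types, and the helper function is hypothetical; in vllm the `AnyTokenizer` alias lives in the codebase itself):

```python
from typing import Union


# Stubs standing in for transformers.PreTrainedTokenizer and
# transformers.PreTrainedTokenizerFast, so this sketch is self-contained.
class PreTrainedTokenizer: ...
class PreTrainedTokenizerFast: ...


# One alias instead of repeating the Union in every signature.
AnyTokenizer = Union[PreTrainedTokenizer, PreTrainedTokenizerFast]


def tokenizer_name(tokenizer: AnyTokenizer) -> str:
    """Hypothetical helper showing the alias used as an annotation."""
    return type(tokenizer).__name__
```

Any function that previously spelled out the Union can now annotate its parameter as `AnyTokenizer`, which is the readability win described above.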

Partially addresses #3680.
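For context, enabling this in mypy amounts to a per-module configuration change along these lines (a sketch only; the exact section name and config file used by vllm are assumptions, not taken from this PR):

```ini
; Hypothetical mypy.ini excerpt: follow imports for the entrypoints
; package instead of skipping or ignoring them.
[mypy-vllm.entrypoints.*]
follow_imports = silent
```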

@DarkLight1337
Member Author

@rkooo567 please review when you have time.

@DarkLight1337 DarkLight1337 removed the ready ONLY add when PR is ready to merge/full CI is needed label Aug 18, 2024
@DarkLight1337
Member Author

@petersalas can you help take a look why the audio entrypoints test now fails to pass pydantic validation?

@petersalas
Contributor

@DarkLight1337 can you elaborate a bit? I pulled the branch down but the audio tests seem to pass. Is there a call stack with the failure I can look at?

Collaborator

@rkooo567 rkooo567 left a comment


Very cool! I've done a first iteration of review.

vllm/engine/async_llm_engine.py (review thread resolved; outdated)
@@ -164,16 +168,15 @@ def _parse_chat_message_content_parts(
     for part in parts:
         part_type = part["type"]
         if part_type == "text":
-            text = cast(ChatCompletionContentPartTextParam, part)["text"]
+            text = _TextParser.validate_python(part)["text"]
Collaborator


can you explain to me why we are doing this here instead of cast?

Member Author


IMO this way is a bit more readable, and it also provides additional runtime validation.
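To illustrate the trade-off being discussed: `typing.cast` is purely a static-typing construct with no runtime effect, whereas pydantic's `TypeAdapter.validate_python` actually checks the data. A minimal stdlib-only sketch of the `cast` side (the pydantic side is shown in comments, since it requires pydantic v2 installed):

```python
from typing import TypedDict, cast


class TextPart(TypedDict):
    """Simplified stand-in for ChatCompletionContentPartTextParam."""
    type: str
    text: str


bad_part = {"type": "audio_url"}  # missing the required "text" key

# cast() is a no-op at runtime: the malformed dict passes through unchecked,
# and the error only surfaces later, at the point of use.
part = cast(TextPart, bad_part)

validated = True
try:
    _ = part["text"]
except KeyError:
    validated = False  # the failure shows up here, far from the cast

# By contrast, pydantic would reject the dict immediately (sketch only):
#   from pydantic import TypeAdapter
#   _TextParser = TypeAdapter(TextPart)
#   _TextParser.validate_python(bad_part)  # raises ValidationError up front
```

So `validate_python` moves the failure to the boundary where the data enters, which is the "additional validation" referred to above.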

vllm/entrypoints/openai/api_server.py (review thread resolved)
vllm/entrypoints/openai/cli_args.py (review thread resolved; outdated)
vllm/entrypoints/openai/cli_args.py (review thread resolved; outdated)
n=1 if n is None else n,
best_of=best_of,
presence_penalty=0.0
if presence_penalty is None else presence_penalty,
Collaborator


What is this exactly for? (Also, doesn't this duplicate the logic in post_init? Should we remove it?)

Member Author

@DarkLight1337 DarkLight1337 Aug 18, 2024


It is so that I can pass None as an argument to SamplingParams without having to go to this file and look up the default values.
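The pattern being discussed can be sketched as a small helper (hypothetical names; the real code inlines these expressions directly when constructing SamplingParams):

```python
from typing import Any, Dict, Optional


def make_sampling_kwargs(n: Optional[int] = None,
                         presence_penalty: Optional[float] = None
                         ) -> Dict[str, Any]:
    """Coalesce None to the defaults, mirroring the diff above."""
    return {
        "n": 1 if n is None else n,
        "presence_penalty": 0.0 if presence_penalty is None else presence_penalty,
    }
```

Callers can then pass None for any field and still get the defaults, without importing or inspecting SamplingParams themselves.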

@DarkLight1337
Member Author

DarkLight1337 commented Aug 18, 2024

@DarkLight1337 can you elaborate a bit? I pulled the branch down but the audio tests seem to pass. Is there a call stack with the failure I can look at?

____________________________ test_single_chat_session_audio[https://upload.wikimedia.org/wikipedia/en/b/bf/Dave_Niehaus_Winning_Call_1995_AL_Division_Series.ogg-facebook/opt-125m] _____________________________

client = <openai.AsyncOpenAI object at 0x7f4d1a7715a0>, model_name = 'facebook/opt-125m', audio_url = 'https://upload.wikimedia.org/wikipedia/en/b/bf/Dave_Niehaus_Winning_Call_1995_AL_Division_Series.ogg'

    @pytest.mark.asyncio
    @pytest.mark.parametrize("model_name", [MODEL_NAME])
    @pytest.mark.parametrize("audio_url", TEST_AUDIO_URLS)
    async def test_single_chat_session_audio(client: openai.AsyncOpenAI,
                                             model_name: str, audio_url: str):
        messages = [{
            "role":
            "user",
            "content": [
                {
                    "type": "audio_url",
                    "audio_url": {
                        "url": audio_url
                    }
                },
                {
                    "type": "text",
                    "text": "What's happening in this audio?"
                },
            ],
        }]
    
        # test single completion
>       chat_completion = await client.chat.completions.create(model=model_name,
                                                               messages=messages,
                                                               max_tokens=10,
                                                               logprobs=True,
                                                               top_logprobs=5)

tests/entrypoints/openai/test_audio.py:169: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../miniconda3/envs/vllm/lib/python3.10/site-packages/openai/resources/chat/completions.py:1339: in create
    return await self._post(
../miniconda3/envs/vllm/lib/python3.10/site-packages/openai/_base_client.py:1816: in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
../miniconda3/envs/vllm/lib/python3.10/site-packages/openai/_base_client.py:1510: in request
    return await self._request(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <openai.AsyncOpenAI object at 0x7f4d1a7715a0>, cast_to = <class 'openai.types.chat.chat_completion.ChatCompletion'>
options = FinalRequestOptions(method='post', url='/chat/completions', params={}, headers=NOT_GIVEN, max_retries=NOT_GIVEN, timeo...his audio?"}]}], 'model': 'facebook/opt-125m', 'logprobs': True, 'max_tokens': 10, 'top_logprobs': 5}, extra_json=None)

    async def _request(
        self,
        cast_to: Type[ResponseT],
        options: FinalRequestOptions,
        *,
        stream: bool,
        stream_cls: type[_AsyncStreamT] | None,
        remaining_retries: int | None,
    ) -> ResponseT | _AsyncStreamT:
        if self._platform is None:
            # `get_platform` can make blocking IO calls so we
            # execute it earlier while we are in an async context
            self._platform = await asyncify(get_platform)()
    
        # create a copy of the options we were given so that if the
        # options are mutated later & we then retry, the retries are
        # given the original options
        input_options = model_copy(options)
    
        cast_to = self._maybe_override_cast_to(cast_to, options)
        options = await self._prepare_options(options)
    
        retries = self._remaining_retries(remaining_retries, options)
        request = self._build_request(options)
        await self._prepare_request(request)
    
        kwargs: HttpxSendArgs = {}
        if self.custom_auth is not None:
            kwargs["auth"] = self.custom_auth
    
        try:
            response = await self._client.send(
                request,
                stream=stream or self._should_stream_response_body(request=request),
                **kwargs,
            )
        except httpx.TimeoutException as err:
            log.debug("Encountered httpx.TimeoutException", exc_info=True)
    
            if retries > 0:
                return await self._retry_request(
                    input_options,
                    cast_to,
                    retries,
                    stream=stream,
                    stream_cls=stream_cls,
                    response_headers=None,
                )
    
            log.debug("Raising timeout error")
            raise APITimeoutError(request=request) from err
        except Exception as err:
            log.debug("Encountered Exception", exc_info=True)
    
            if retries > 0:
                return await self._retry_request(
                    input_options,
                    cast_to,
                    retries,
                    stream=stream,
                    stream_cls=stream_cls,
                    response_headers=None,
                )
    
            log.debug("Raising connection error")
            raise APIConnectionError(request=request) from err
    
        log.debug(
            'HTTP Request: %s %s "%i %s"', request.method, request.url, response.status_code, response.reason_phrase
        )
    
        try:
            response.raise_for_status()
        except httpx.HTTPStatusError as err:  # thrown on 4xx and 5xx status code
            log.debug("Encountered httpx.HTTPStatusError", exc_info=True)
    
            if retries > 0 and self._should_retry(err.response):
                await err.response.aclose()
                return await self._retry_request(
                    input_options,
                    cast_to,
                    retries,
                    err.response.headers,
                    stream=stream,
                    stream_cls=stream_cls,
                )
    
            # If the response is streamed then we need to explicitly read the response
            # to completion before attempting to access the response text.
            if not err.response.is_closed:
                await err.response.aread()
    
            log.debug("Re-raising status error")
>           raise self._make_status_error_from_response(err.response) from None
E           openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "6 validation errors for ValidatorIterator\n0.typed-dict.text\n  Field required [type=missing, input_value={'type': 'audio_url', 'au...L_Division_Series.ogg'}}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.7/v/missing\n0.typed-dict.type\n  Input should be 'text' [type=literal_error, input_value='audio_url', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.7/v/literal_error\n0.typed-dict.audio_url\n  Extra inputs are not permitted [type=extra_forbidden, input_value={'url': 'https://upload.w...AL_Division_Series.ogg'}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.7/v/extra_forbidden\n0.typed-dict.image_url\n  Field required [type=missing, input_value={'type': 'audio_url', 'au...L_Division_Series.ogg'}}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.7/v/missing\n0.typed-dict.type\n  Input should be 'image_url' [type=literal_error, input_value='audio_url', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.7/v/literal_error\n0.typed-dict.audio_url\n  Extra inputs are not permitted [type=extra_forbidden, input_value={'url': 'https://upload.w...AL_Division_Series.ogg'}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.7/v/extra_forbidden", 'type': 'BadRequestError', 'param': None, 'code': 400}

../miniconda3/envs/vllm/lib/python3.10/site-packages/openai/_base_client.py:1611: BadRequestError

The error is thrown server-side while iterating through each part in a message: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/chat_utils.py#L164

I'm using pydantic 2.7.1 and pydantic_core 2.18.2, if that helps. Odd that it's not happening in CI now.

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 19, 2024
@DarkLight1337
Member Author

DarkLight1337 commented Aug 19, 2024

@DarkLight1337 can you elaborate a bit? I pulled the branch down but the audio tests seem to pass. Is there a call stack with the failure I can look at?

The error is thrown server-side while iterating through each part in a message: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/chat_utils.py#L164

I'm using pydantic 2.7.1 and pydantic_core 2.18.2, if that helps. Odd that it's not happening in CI now.

Maybe this is a bug in Pydantic. I updated to pydantic 2.8.2 and pydantic-core 2.20.1 and don't get this error anymore. Sorry for bothering you!

Collaborator

@rkooo567 rkooo567 left a comment


lgtm!

@youkaichao youkaichao merged commit baaedfd into vllm-project:main Aug 21, 2024
62 of 65 checks passed
@DarkLight1337 DarkLight1337 deleted the typing-entrypoints branch August 21, 2024 06:33
omrishiv pushed a commit to omrishiv/vllm that referenced this pull request Aug 26, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
Labels
ready ONLY add when PR is ready to merge/full CI is needed
6 participants