[mypy] Enable following imports for entrypoints #7248

Merged on Aug 21, 2024 (107 commits)

Commits
51a4628
Add entrypoints to stricter checks
DarkLight1337 Jul 31, 2024
8ab3ba9
Fix `swap_space` and `cpu_offload_gb` only accepting ints; clean up c…
DarkLight1337 Jul 31, 2024
7efaa82
Update mypy version
DarkLight1337 Jul 31, 2024
e5b6784
Improve typing of tokenizer and hf config
DarkLight1337 Jul 31, 2024
2e0fa85
Fix `encoding_format`
DarkLight1337 Jul 31, 2024
e1f6d4f
Fix misc.
DarkLight1337 Jul 31, 2024
625e11f
[Bugfix][TPU] Set readonly=True for non-root devices (#6980)
WoosukKwon Jul 31, 2024
fb19d3e
[Bugfix] fix logit processor excceed vocab size issue (#6927)
FeiDeng Jul 31, 2024
ad9358c
Fix errors when construct sampling params
DarkLight1337 Jul 31, 2024
ba499d0
Improve types + format
DarkLight1337 Jul 31, 2024
c596ac9
Handle `decoded_token=None` + format
DarkLight1337 Jul 31, 2024
fcf734f
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Jul 31, 2024
711f0ad
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Jul 31, 2024
112224b
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 3, 2024
bebad8c
Fix type errors
DarkLight1337 Aug 3, 2024
50a1136
Make decorators typed
DarkLight1337 Aug 3, 2024
3b0ac79
Format
DarkLight1337 Aug 3, 2024
eba3863
Fix type errors
DarkLight1337 Aug 3, 2024
3ba6c44
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 4, 2024
ff49909
Fix type errors from merged commits
DarkLight1337 Aug 4, 2024
f348958
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 4, 2024
4f80738
Use more flexible tokenizer type
DarkLight1337 Aug 4, 2024
ffe97d6
Fix arg
DarkLight1337 Aug 4, 2024
1e1db1c
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 5, 2024
109cb1f
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 7, 2024
0e4c97d
Fix merge
DarkLight1337 Aug 7, 2024
c959a1d
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 7, 2024
8291a9d
Remove unnecessary type annotations
DarkLight1337 Aug 7, 2024
46732b8
Simplify code
DarkLight1337 Aug 7, 2024
37ab834
Cleanup
DarkLight1337 Aug 7, 2024
2da334c
Fix type errors
DarkLight1337 Aug 7, 2024
475d84a
Fix type error
DarkLight1337 Aug 7, 2024
937a8ca
Clean
DarkLight1337 Aug 7, 2024
33c9e25
Introduce `is_list_of`
DarkLight1337 Aug 7, 2024
e6dd6f5
Avoid circular imports
DarkLight1337 Aug 7, 2024
f938c86
Refactor prompt parsing and extend this to async engine
DarkLight1337 Aug 7, 2024
6332d1e
Remove unnecessary comments
DarkLight1337 Aug 7, 2024
07b4d21
Enable full async
DarkLight1337 Aug 7, 2024
e29864c
grammar
DarkLight1337 Aug 7, 2024
c9dfb40
Add description
DarkLight1337 Aug 7, 2024
1233192
Fix wrong type annotations
DarkLight1337 Aug 7, 2024
f332275
Merge branch 'upstream' into inputs-parser
DarkLight1337 Aug 7, 2024
58ca741
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 7, 2024
dcdebee
Remove redundant docs
DarkLight1337 Aug 7, 2024
65db3f1
Be more strict
DarkLight1337 Aug 7, 2024
9ffeb22
Fix docs
DarkLight1337 Aug 7, 2024
c9e0b08
Fix 2
DarkLight1337 Aug 7, 2024
14bca1f
Disallow multi-modal data for enc/dec models
DarkLight1337 Aug 7, 2024
8fc7099
Improve type narrowing behavior using `TypeIs`
DarkLight1337 Aug 7, 2024
3a8a072
Avoid sequential await
DarkLight1337 Aug 7, 2024
ef5327c
Fix type annotations based on test files
DarkLight1337 Aug 7, 2024
8a835cc
Properly handle `inputs["decoder_prompt"]=None`
DarkLight1337 Aug 7, 2024
e0024c2
Clean
DarkLight1337 Aug 7, 2024
76af172
Clean
DarkLight1337 Aug 7, 2024
5c16f2e
Fix incorrect decoder inputs in singleton case
DarkLight1337 Aug 7, 2024
e239ba9
Clean
DarkLight1337 Aug 7, 2024
4b0e3df
Move functions to a more appropriate place
DarkLight1337 Aug 7, 2024
53f7f50
Remove outdated comment
DarkLight1337 Aug 7, 2024
3afdbc5
Fix mismatch between hf and vllm output text
DarkLight1337 Aug 7, 2024
c61b01f
Factor out duplicate code
DarkLight1337 Aug 7, 2024
f8ed373
Factor out more duplicate code
DarkLight1337 Aug 7, 2024
a4df70a
Remove default values to avoid accidentally miss those arguments
DarkLight1337 Aug 7, 2024
5240bb3
Add test for serving encoder/decoder model with OpenAI server
DarkLight1337 Aug 7, 2024
d321c82
Use two type variables
DarkLight1337 Aug 7, 2024
31d82c6
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 7, 2024
931d1f6
Merge branch 'upstream' into inputs-parser
DarkLight1337 Aug 7, 2024
a06c67f
Merge branch 'upstream' into inputs-parser
DarkLight1337 Aug 7, 2024
9f64a05
Merge branch 'upstream' into inputs-parser
DarkLight1337 Aug 7, 2024
394a360
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 7, 2024
e4c5c21
Update error message
DarkLight1337 Aug 8, 2024
9bbafe1
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 8, 2024
f138f31
Merge branch 'inputs-parser' into typing-entrypoints
DarkLight1337 Aug 8, 2024
68fbf5a
Merge branch 'upstream' into inputs-parser
DarkLight1337 Aug 8, 2024
f912f25
Format
DarkLight1337 Aug 8, 2024
ed04adf
Merge branch 'inputs-parser' into typing-entrypoints
DarkLight1337 Aug 8, 2024
7da52f5
Fix circular import problem
DarkLight1337 Aug 8, 2024
f475a58
Fix incorrect assertion
DarkLight1337 Aug 8, 2024
f03b939
format
DarkLight1337 Aug 8, 2024
8291068
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 9, 2024
47baabd
Fix newly-introduced type errors
DarkLight1337 Aug 9, 2024
b8e69b7
fix
DarkLight1337 Aug 9, 2024
0a893a5
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 9, 2024
eb7312e
Simplify
DarkLight1337 Aug 9, 2024
e367e95
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 10, 2024
c924607
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 12, 2024
83fba8a
Avoid circular import
DarkLight1337 Aug 12, 2024
d08c826
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 13, 2024
4e3f014
Fix incorrect assertion
DarkLight1337 Aug 13, 2024
06114b6
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 14, 2024
c108f40
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 14, 2024
9066218
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 14, 2024
1e14f12
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 14, 2024
b9e8f00
Add type annotation
DarkLight1337 Aug 14, 2024
019981b
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 16, 2024
516aa3b
Clean up validation logic
DarkLight1337 Aug 16, 2024
f4af304
Update tests
DarkLight1337 Aug 16, 2024
1a81806
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 18, 2024
9cd8fb5
Fix type error
DarkLight1337 Aug 18, 2024
7e09fc8
Clean up parsing logic
DarkLight1337 Aug 18, 2024
b7dc954
format
DarkLight1337 Aug 18, 2024
78161b4
Remote quotes
DarkLight1337 Aug 18, 2024
0a9274a
Add fallback
DarkLight1337 Aug 18, 2024
1e89169
Update tests
DarkLight1337 Aug 18, 2024
e2ec43c
Move chat tests to the correct file
DarkLight1337 Aug 18, 2024
72145fe
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 19, 2024
1f9ea92
Update pydantic version
DarkLight1337 Aug 19, 2024
60b8aeb
Merge branch 'upstream' into typing-entrypoints
DarkLight1337 Aug 21, 2024
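
Two commits above, 33c9e25 ("Introduce `is_list_of`") and 8fc7099 ("Improve type narrowing behavior using `TypeIs`"), refer to a narrowing helper whose implementation is not visible in the truncated diffs below. As a rough, self-contained sketch of that pattern (the helper body and the `total_prompt_length` caller are illustrative assumptions, not code taken from this PR):

```python
# Sketch of a TypeIs-based list-narrowing helper in the spirit of the
# `is_list_of` commit above; requires typing_extensions >= 4.10.
from typing import Any, List, Type, TypeVar

from typing_extensions import TypeIs

_T = TypeVar("_T")


def is_list_of(value: object, typ: Type[_T]) -> TypeIs[List[_T]]:
    """Return True only if `value` is a list whose items are all `typ`."""
    return isinstance(value, list) and all(
        isinstance(item, typ) for item in value)


def total_prompt_length(prompts: Any) -> int:
    if is_list_of(prompts, str):
        # Type checkers narrow `prompts` to List[str] inside this branch.
        return sum(len(p) for p in prompts)
    raise TypeError("expected a list of strings")


print(total_prompt_length(["A robot may not injure another robot", "My name is"]))
```
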
1 change: 0 additions & 1 deletion .github/workflows/mypy.yaml
@@ -38,7 +38,6 @@ jobs:
mypy vllm/core --follow-imports skip
mypy vllm/distributed --follow-imports skip
mypy vllm/engine --follow-imports skip
-mypy vllm/entrypoints --follow-imports skip
mypy vllm/executor --follow-imports skip
mypy vllm/lora --follow-imports skip
mypy vllm/model_executor --follow-imports skip
2 changes: 1 addition & 1 deletion docs/requirements-docs.txt
@@ -5,7 +5,7 @@ myst-parser==2.0.0
sphinx-argparse==0.4.0

# packages to install to build the documentation
-pydantic
+pydantic >= 2.8
-f https://download.pytorch.org/whl/cpu
torch
py-cpuinfo
1 change: 0 additions & 1 deletion format.sh
@@ -102,7 +102,6 @@ mypy vllm/attention --follow-imports skip
mypy vllm/core --follow-imports skip
mypy vllm/distributed --follow-imports skip
mypy vllm/engine --follow-imports skip
-mypy vllm/entrypoints --follow-imports skip
mypy vllm/executor --follow-imports skip
mypy vllm/lora --follow-imports skip
mypy vllm/model_executor --follow-imports skip
1 change: 1 addition & 0 deletions pyproject.toml
@@ -56,6 +56,7 @@ files = [
"vllm/*.py",
"vllm/adapter_commons",
"vllm/assets",
"vllm/entrypoints",
"vllm/inputs",
"vllm/logging",
"vllm/multimodal",
2 changes: 1 addition & 1 deletion requirements-common.txt
@@ -11,7 +11,7 @@ fastapi
aiohttp
openai >= 1.0 # Ensure modern openai package (ensure types module present)
uvicorn[standard]
-pydantic >= 2.0 # Required for OpenAI server.
+pydantic >= 2.8 # Required for OpenAI server.
pillow # Required for image processing
prometheus_client >= 0.18.0
prometheus-fastapi-instrumentator >= 7.0.0
84 changes: 83 additions & 1 deletion tests/entrypoints/openai/test_chat.py
@@ -1,7 +1,7 @@
# imports for guided decoding tests
import json
import re
from typing import List
from typing import Dict, List, Optional

import jsonschema
import openai # use the official client for correctness check
@@ -174,6 +174,88 @@ async def test_too_many_chat_logprobs(client: openai.AsyncOpenAI,
assert message.content is not None and len(message.content) >= 0


@pytest.mark.asyncio
@pytest.mark.parametrize(
"model_name, prompt_logprobs",
[(MODEL_NAME, 1), (MODEL_NAME, 0), (MODEL_NAME, -1), (MODEL_NAME, None)],
)
async def test_prompt_logprobs_chat(client: openai.AsyncOpenAI,
model_name: str,
prompt_logprobs: Optional[int]):
params: Dict = {
"messages": [{
"role": "system",
"content": "You are a helpful assistant."
}, {
"role": "user",
"content": "Who won the world series in 2020?"
}, {
"role":
"assistant",
"content":
"The Los Angeles Dodgers won the World Series in 2020."
}, {
"role": "user",
"content": "Where was it played?"
}],
"model":
model_name
}

if prompt_logprobs is not None:
params["extra_body"] = {"prompt_logprobs": prompt_logprobs}

if prompt_logprobs is not None and prompt_logprobs < 0:
with pytest.raises(BadRequestError):
await client.chat.completions.create(**params)
else:
completion = await client.chat.completions.create(**params)
if prompt_logprobs is not None:
assert completion.prompt_logprobs is not None
assert len(completion.prompt_logprobs) > 0
else:
assert completion.prompt_logprobs is None


@pytest.mark.asyncio
@pytest.mark.parametrize(
"model_name",
[MODEL_NAME],
)
async def test_more_than_one_prompt_logprobs_chat(client: openai.AsyncOpenAI,
model_name: str):
params: Dict = {
"messages": [{
"role": "system",
"content": "You are a helpful assistant."
}, {
"role": "user",
"content": "Who won the world series in 2020?"
}, {
"role":
"assistant",
"content":
"The Los Angeles Dodgers won the World Series in 2020."
}, {
"role": "user",
"content": "Where was it played?"
}],
"model":
model_name,
"extra_body": {
"prompt_logprobs": 1
}
}

completion_1 = await client.chat.completions.create(**params)

params["extra_body"] = {"prompt_logprobs": 2}
completion_2 = await client.chat.completions.create(**params)

assert len(completion_1.prompt_logprobs[3]) == 1
assert len(completion_2.prompt_logprobs[3]) == 2


@pytest.mark.asyncio
@pytest.mark.parametrize(
"model_name",
101 changes: 5 additions & 96 deletions tests/entrypoints/openai/test_completion.py
@@ -3,7 +3,7 @@
import re
import shutil
from tempfile import TemporaryDirectory
from typing import Dict, List
from typing import Dict, List, Optional

import jsonschema
import openai # use the official client for correctness check
@@ -268,118 +268,27 @@ async def test_too_many_completion_logprobs(client: openai.AsyncOpenAI,
assert len(completion.choices[0].text) >= 0


@pytest.mark.asyncio
@pytest.mark.parametrize(
"model_name, prompt_logprobs",
[(MODEL_NAME, 1), (MODEL_NAME, 0), (MODEL_NAME, -1), (MODEL_NAME, None)],
)
async def test_prompt_logprobs_chat(client: openai.AsyncOpenAI,
model_name: str, prompt_logprobs: int):
params: Dict = {
"messages": [{
"role": "system",
"content": "You are a helpful assistant."
}, {
"role": "user",
"content": "Who won the world series in 2020?"
}, {
"role":
"assistant",
"content":
"The Los Angeles Dodgers won the World Series in 2020."
}, {
"role": "user",
"content": "Where was it played?"
}],
"model":
model_name
}

if prompt_logprobs is not None:
params["extra_body"] = {"prompt_logprobs": prompt_logprobs}

if prompt_logprobs and prompt_logprobs < 0:
with pytest.raises(BadRequestError) as err_info:
await client.chat.completions.create(**params)
expected_err_string = (
"Error code: 400 - {'object': 'error', 'message': "
"'Prompt_logprobs set to invalid negative value: -1',"
" 'type': 'BadRequestError', 'param': None, 'code': 400}")
assert str(err_info.value) == expected_err_string
else:
completion = await client.chat.completions.create(**params)
if prompt_logprobs and prompt_logprobs > 0:
assert completion.prompt_logprobs is not None
assert len(completion.prompt_logprobs) > 0
else:
assert completion.prompt_logprobs is None


@pytest.mark.asyncio
@pytest.mark.parametrize(
"model_name",
[MODEL_NAME],
)
async def test_more_than_one_prompt_logprobs_chat(client: openai.AsyncOpenAI,
model_name: str):
params: Dict = {
"messages": [{
"role": "system",
"content": "You are a helpful assistant."
}, {
"role": "user",
"content": "Who won the world series in 2020?"
}, {
"role":
"assistant",
"content":
"The Los Angeles Dodgers won the World Series in 2020."
}, {
"role": "user",
"content": "Where was it played?"
}],
"model":
model_name,
"extra_body": {
"prompt_logprobs": 1
}
}

completion_1 = await client.chat.completions.create(**params)

params["extra_body"] = {"prompt_logprobs": 2}
completion_2 = await client.chat.completions.create(**params)

assert len(completion_1.prompt_logprobs[3]) == 1
assert len(completion_2.prompt_logprobs[3]) == 2


@pytest.mark.asyncio
@pytest.mark.parametrize("model_name, prompt_logprobs", [(MODEL_NAME, -1),
(MODEL_NAME, 0),
(MODEL_NAME, 1),
(MODEL_NAME, None)])
async def test_prompt_logprobs_completion(client: openai.AsyncOpenAI,
model_name: str,
prompt_logprobs: int):
prompt_logprobs: Optional[int]):
params: Dict = {
"prompt": ["A robot may not injure another robot", "My name is"],
"model": model_name,
}
if prompt_logprobs is not None:
params["extra_body"] = {"prompt_logprobs": prompt_logprobs}

if prompt_logprobs and prompt_logprobs < 0:
with pytest.raises(BadRequestError) as err_info:
if prompt_logprobs is not None and prompt_logprobs < 0:
with pytest.raises(BadRequestError):
await client.completions.create(**params)
expected_err_string = (
"Error code: 400 - {'object': 'error', 'message': "
"'Prompt_logprobs set to invalid negative value: -1',"
" 'type': 'BadRequestError', 'param': None, 'code': 400}")
assert str(err_info.value) == expected_err_string
else:
completion = await client.completions.create(**params)
if prompt_logprobs and prompt_logprobs > 0:
if prompt_logprobs is not None:
assert completion.choices[0].prompt_logprobs is not None
assert len(completion.choices[0].prompt_logprobs) > 0

8 changes: 4 additions & 4 deletions vllm/engine/async_llm_engine.py
@@ -4,7 +4,6 @@
from typing import (AsyncGenerator, Callable, Dict, Iterable, List, Mapping,
Optional, Set, Tuple, Type, Union)

-from transformers import PreTrainedTokenizer
from typing_extensions import assert_never

import vllm.envs as envs
@@ -28,6 +27,7 @@
from vllm.prompt_adapter.request import PromptAdapterRequest
from vllm.sampling_params import SamplingParams
from vllm.sequence import ExecuteModelRequest, SamplerOutput
+from vllm.transformers_utils.tokenizer import AnyTokenizer
from vllm.usage.usage_lib import UsageContext
from vllm.utils import print_warning_once

@@ -308,8 +308,8 @@ async def _tokenize_prompt_async(
lora_request: Optional[LoRARequest],
) -> List[int]:
"""Async version of :meth:`_tokenize_prompt`."""
tokenizer = self.get_tokenizer_group("prompts must be None if "
"skip_tokenizer_init is True")
tokenizer = self.get_tokenizer_group(
missing_msg="prompts must be None if skip_tokenizer_init is True")

return await tokenizer.encode_async(request_id=request_id,
prompt=prompt,
@@ -652,7 +652,7 @@ def _error_callback(self, exc: Exception) -> None:
async def get_tokenizer(
self,
lora_request: Optional[LoRARequest] = None,
) -> "PreTrainedTokenizer":
) -> AnyTokenizer:
if self.engine_use_ray:
return await self.engine.get_tokenizer.remote( # type: ignore
lora_request)
Expand Down
31 changes: 21 additions & 10 deletions vllm/engine/llm_engine.py
@@ -3,9 +3,9 @@
from typing import (TYPE_CHECKING, Any, ClassVar, Dict, Iterable, List,
Mapping, Optional)
from typing import Sequence as GenericSequence
-from typing import Set, Tuple, Type, TypeVar, Union
+from typing import Set, Tuple, Type, Union

-from typing_extensions import assert_never
+from typing_extensions import TypeVar, assert_never

import vllm.envs as envs
from vllm.config import (CacheConfig, DecodingConfig, DeviceConfig,
@@ -43,8 +43,9 @@
init_tracer)
from vllm.transformers_utils.config import try_get_generation_config
from vllm.transformers_utils.detokenizer import Detokenizer
+from vllm.transformers_utils.tokenizer import AnyTokenizer
from vllm.transformers_utils.tokenizer_group import (
-AnyTokenizer, BaseTokenizerGroup, init_tokenizer_from_configs)
+BaseTokenizerGroup, init_tokenizer_from_configs)
from vllm.usage.usage_lib import (UsageContext, is_usage_stats_enabled,
usage_message)
from vllm.utils import Counter
@@ -67,6 +68,7 @@ def _load_generation_config_dict(model_config: ModelConfig) -> Dict[str, Any]:
return config.to_diff_dict()


_G = TypeVar("_G", bound=BaseTokenizerGroup, default=BaseTokenizerGroup)
_O = TypeVar("_O", RequestOutput, EmbeddingRequestOutput)

PromptComponents = Tuple[Optional[str], List[int],
@@ -493,12 +495,21 @@ def __del__(self):
"skip_tokenizer_init is True")

def get_tokenizer_group(
-self,
-fail_msg: str = MISSING_TOKENIZER_GROUP_MSG) -> BaseTokenizerGroup:
-if self.tokenizer is None:
-raise ValueError(fail_msg)
+self,
+group_type: Type[_G] = BaseTokenizerGroup,
+*,
+missing_msg: str = MISSING_TOKENIZER_GROUP_MSG,
+) -> _G:
+tokenizer_group = self.tokenizer
+
+if tokenizer_group is None:
+raise ValueError(missing_msg)
+if not isinstance(tokenizer_group, group_type):
+raise TypeError("Invalid type of tokenizer group. "
+f"Expected type: {group_type}, but "
+f"found type: {type(tokenizer_group)}")

-return self.tokenizer
+return tokenizer_group

def get_tokenizer(
self,
Expand Down Expand Up @@ -693,8 +704,8 @@ def _tokenize_prompt(
* prompt token ids
'''

tokenizer = self.get_tokenizer_group("prompts must be None if "
"skip_tokenizer_init is True")
tokenizer = self.get_tokenizer_group(
missing_msg="prompts must be None if skip_tokenizer_init is True")

return tokenizer.encode(request_id=request_id,
prompt=prompt,
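
The reworked `get_tokenizer_group` in the hunks above pairs a `TypeVar` that has a default (from `typing_extensions`) with an `isinstance` check, so callers can either accept the base tokenizer-group type or pass a specific subclass and get a correspondingly narrowed return type. A minimal runnable sketch of that pattern, using placeholder class names rather than vLLM's actual classes:

```python
# Standalone sketch of the typed get_tokenizer_group pattern shown above.
# BaseGroup, RayGroup and Engine are illustrative placeholders, not vLLM code.
from typing import Optional, Type

from typing_extensions import TypeVar


class BaseGroup:
    pass


class RayGroup(BaseGroup):
    pass


# The default= keeps plain calls typed as BaseGroup without an explicit argument.
_G = TypeVar("_G", bound=BaseGroup, default=BaseGroup)


class Engine:

    def __init__(self, group: Optional[BaseGroup]) -> None:
        self.group = group

    def get_group(
        self,
        group_type: Type[_G] = BaseGroup,
        *,
        missing_msg: str = "tokenizer group is not initialized",
    ) -> _G:
        group = self.group
        if group is None:
            raise ValueError(missing_msg)
        if not isinstance(group, group_type):
            raise TypeError(f"Expected {group_type}, found {type(group)}")
        return group


engine = Engine(RayGroup())
base = engine.get_group()         # typed as BaseGroup
ray = engine.get_group(RayGroup)  # checked at runtime and typed as RayGroup
print(type(base).__name__, type(ray).__name__)
```

Callers that only need encoding, like the `_tokenize_prompt` hunk above, keep the default and pass `missing_msg`; callers that need a concrete subclass can pass it as `group_type` and get both the runtime check and the narrowed static type.
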