
[Bug]: Request Cancelation w/ Scheduler Steps Set Causes K8s Pod Restart #7877

Closed
sam-h-bean opened this issue Aug 26, 2024 · 2 comments · Fixed by #8059
Labels
bug Something isn't working

Comments

@sam-h-bean

Your current environment

I am running the vLLM OpenAI-compatible server (v0.5.5) on Kubernetes. The client was a Locust load test.
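
For reference, the trigger can be approximated without Locust by cancelling an in-flight request against the OpenAI-compatible endpoint while the server runs with multi-step scheduling enabled (e.g. started with `--num-scheduler-steps 8`). This is only a sketch of the cancellation, not the exact load test; the URL, model name, and payload are placeholders:

```python
import asyncio

import httpx

# Placeholder endpoint and model; the server is assumed to have been started
# with multi-step scheduling, e.g. `vllm serve <model> --num-scheduler-steps 8`.
BASE_URL = "http://localhost:8000/v1/chat/completions"
PAYLOAD = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Write a long story."}],
    "max_tokens": 2048,
}

async def cancel_midway() -> None:
    async with httpx.AsyncClient(timeout=None) as client:
        try:
            # Cancel the request ~1s in, before generation can finish; the
            # connection teardown is what a Locust worker effectively does
            # when a request is interrupted mid-flight.
            await asyncio.wait_for(client.post(BASE_URL, json=PAYLOAD), timeout=1.0)
        except asyncio.TimeoutError:
            pass  # connection dropped; the server now sees a disconnected client

asyncio.run(cancel_midway())
```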

🐛 Describe the bug

INFO:     10.65.62.6:37396 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 754, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 774, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 295, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 74, in app
    response = await f(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 271, in create_chat_completion
    generator = await openai_serving_chat.create_chat_completion(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 188, in create_chat_completion
    return await self.chat_completion_full_generator(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 438, in chat_completion_full_generator
    async for res in result_generator:
  File "/usr/local/lib/python3.10/dist-packages/vllm/utils.py", line 430, in iterate_with_cancellation
    item = await awaits[0]
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/rpc/client.py", line 416, in generate
    raise request_output
AssertionError: expected running sequences
CRITICAL 08-26 11:25:19 launcher.py:98] AsyncLLMEngine is already dead, terminating server process
INFO:     10.65.62.6:37412 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
CRITICAL 08-26 11:25:19 launcher.py:98] AsyncLLMEngine is already dead, terminating server process
INFO:     10.65.62.6:37424 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1]
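
For context, the `iterate_with_cancellation` frame in the traceback is the machinery that watches for a client disconnect while streaming engine outputs. A loose sketch of that pattern (paraphrased, not vLLM's actual implementation; `wait_for_disconnect` is a made-up name):

```python
import asyncio
from typing import AsyncGenerator, Awaitable, Callable

async def iterate_with_cancellation_sketch(
    outputs: AsyncGenerator,
    wait_for_disconnect: Callable[[], Awaitable[None]],
) -> AsyncGenerator:
    """Yield engine outputs until the stream ends or the client drops."""
    disconnect = asyncio.ensure_future(wait_for_disconnect())
    try:
        while True:
            step = asyncio.ensure_future(outputs.__anext__())
            done, _ = await asyncio.wait(
                {step, disconnect}, return_when=asyncio.FIRST_COMPLETED
            )
            if disconnect in done:
                step.cancel()
                # The request gets aborted at this point; with scheduler steps
                # set, the engine may still be part-way through a multi-step batch.
                raise asyncio.CancelledError("client disconnected")
            try:
                yield step.result()
            except StopAsyncIteration:
                return
    finally:
        disconnect.cancel()
```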

@comaniac
Collaborator

The assertion is at https://github.com/vllm-project/vllm/blob/main/vllm/engine/output_processor/multi_step.py#L73. I've added this to the multi-step issue tracker.

Also cc @SolitaryThinker @alexm-neuralmagic
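
For anyone landing here, a paraphrased sketch of the check on that line (not a verbatim copy of `multi_step.py`) and why an aborted request trips it:

```python
from vllm.sequence import SequenceGroup, SequenceStatus

def process_step_outputs_sketch(seq_group: SequenceGroup, step_outputs) -> None:
    # The multi-step output processor collects the group's RUNNING sequences
    # before applying a step's sampled tokens.
    seqs = seq_group.get_seqs(status=SequenceStatus.RUNNING)
    # If the request was aborted between scheduler steps (e.g. after a client
    # disconnect, as in the traceback above), the group has no RUNNING
    # sequences left and this assertion fires, which is the
    # "expected running sequences" error that kills AsyncLLMEngine.
    assert seqs, "expected running sequences"
```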

@robertgshaw2-neuralmagic
Collaborator

Fixed by: #8059
