
[Bug]: Request Cancelation w/ Scheduler Steps Set Causes K8s Pod Restart #7877

Closed
sam-h-bean opened this issue Aug 26, 2024 · 2 comments · Fixed by #8059
Labels
bug Something isn't working

Comments

@sam-h-bean

Your current environment

I am running the vLLM OpenAI-compatible server (v0.5.5) on Kubernetes. The client was a Locust load test.
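
For reference, the trigger can be approximated without Locust by cancelling an in-flight request against the OpenAI-compatible endpoint while the server runs with multi-step scheduling enabled (e.g. started with `--num-scheduler-steps 8`). This is only a sketch of the cancellation, not the exact load test; the URL, model name, and payload are placeholders:

```python
import asyncio

import httpx

# Placeholder endpoint and model; the server is assumed to have been started
# with multi-step scheduling, e.g. `vllm serve <model> --num-scheduler-steps 8`.
BASE_URL = "http://localhost:8000/v1/chat/completions"
PAYLOAD = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Write a long story."}],
    "max_tokens": 2048,
}

async def cancel_midway() -> None:
    async with httpx.AsyncClient(timeout=None) as client:
        try:
            # Cancel the request ~1s in, before generation can finish; the
            # connection teardown is what a Locust worker effectively does
            # when a request is interrupted mid-flight.
            await asyncio.wait_for(client.post(BASE_URL, json=PAYLOAD), timeout=1.0)
        except asyncio.TimeoutError:
            pass  # connection dropped; the server now sees a disconnected client

asyncio.run(cancel_midway())
```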

🐛 Describe the bug

INFO:     10.65.62.6:37396 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 754, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 774, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 295, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 74, in app
    response = await f(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 271, in create_chat_completion
    generator = await openai_serving_chat.create_chat_completion(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 188, in create_chat_completion
    return await self.chat_completion_full_generator(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 438, in chat_completion_full_generator
    async for res in result_generator:
  File "/usr/local/lib/python3.10/dist-packages/vllm/utils.py", line 430, in iterate_with_cancellation
    item = await awaits[0]
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/rpc/client.py", line 416, in generate
    raise request_output
AssertionError: expected running sequences
CRITICAL 08-26 11:25:19 launcher.py:98] AsyncLLMEngine is already dead, terminating server process
INFO:     10.65.62.6:37412 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
CRITICAL 08-26 11:25:19 launcher.py:98] AsyncLLMEngine is already dead, terminating server process
INFO:     10.65.62.6:37424 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1]
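
For context, the `iterate_with_cancellation` frame in the traceback is the machinery that watches for a client disconnect while streaming engine outputs. A loose sketch of that pattern (paraphrased, not vLLM's actual implementation; `wait_for_disconnect` is a made-up name):

```python
import asyncio
from typing import AsyncGenerator, Awaitable, Callable

async def iterate_with_cancellation_sketch(
    outputs: AsyncGenerator,
    wait_for_disconnect: Callable[[], Awaitable[None]],
) -> AsyncGenerator:
    """Yield engine outputs until the stream ends or the client drops."""
    disconnect = asyncio.ensure_future(wait_for_disconnect())
    try:
        while True:
            step = asyncio.ensure_future(outputs.__anext__())
            done, _ = await asyncio.wait(
                {step, disconnect}, return_when=asyncio.FIRST_COMPLETED
            )
            if disconnect in done:
                step.cancel()
                # The request gets aborted at this point; with scheduler steps
                # set, the engine may still be part-way through a multi-step batch.
                raise asyncio.CancelledError("client disconnected")
            try:
                yield step.result()
            except StopAsyncIteration:
                return
    finally:
        disconnect.cancel()
```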

@comaniac
Collaborator

The assertion is at https://github.com/vllm-project/vllm/blob/main/vllm/engine/output_processor/multi_step.py#L73. I've added this to the multi-step issue tracker.

Also cc @SolitaryThinker @alexm-neuralmagic
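
For anyone landing here, a paraphrased sketch of the check on that line (not a verbatim copy of `multi_step.py`) and why an aborted request trips it:

```python
from vllm.sequence import SequenceGroup, SequenceStatus

def process_step_outputs_sketch(seq_group: SequenceGroup, step_outputs) -> None:
    # The multi-step output processor collects the group's RUNNING sequences
    # before applying a step's sampled tokens.
    seqs = seq_group.get_seqs(status=SequenceStatus.RUNNING)
    # If the request was aborted between scheduler steps (e.g. after a client
    # disconnect, as in the traceback above), the group has no RUNNING
    # sequences left and this assertion fires, which is the
    # "expected running sequences" error that kills AsyncLLMEngine.
    assert seqs, "expected running sequences"
```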

@robertgshaw2-neuralmagic
Collaborator

Fixed by: #8059
