[serve] `max_ongoing_requests` limited by `max_concurrency` in actor #47681
Comments
I think the issue is caused by the limitation in ray/python/ray/serve/_private/config.py (line 537 in 1c80db5), and a fix could modify ray/python/ray/serve/schema.py (line 190 in 1c80db5).
But I think a better way is to align the …
…oing_requests` (#47681) (#48274)
## Why are these changes needed?
This PR modifies the actor_options used when deploying replicas. Deployment will use the configured `max_ongoing_requests` attribute of the deployment config as the replica's `max_concurrency` if the concurrency is not explicitly set. This is to prevent replica's `max_concurrency` from capping `max_ongoing_requests`.
## Related issue number
Closes #47681
Signed-off-by: akyang-anyscale
…oing_requests` (ray-project#47681) (ray-project#48274)
## Why are these changes needed?
This PR modifies the actor_options used when deploying replicas. Deployment will use the configured `max_ongoing_requests` attribute of the deployment config as the replica's `max_concurrency` if the concurrency is not explicitly set. This is to prevent replica's `max_concurrency` from capping `max_ongoing_requests`.
## Related issue number
Closes ray-project#47681
Signed-off-by: akyang-anyscale
Signed-off-by: mohitjain2504
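The defaulting rule described in the commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not Ray Serve's actual code: the helper names `resolve_replica_actor_options` and `effective_request_limit` are hypothetical, and the constant reflects Ray's documented default of 1000 concurrent calls for async actors.

```python
from typing import Any, Dict, Optional

# Ray's documented default max_concurrency for async actors.
DEFAULT_ASYNC_ACTOR_MAX_CONCURRENCY = 1000


def resolve_replica_actor_options(
    max_ongoing_requests: int,
    ray_actor_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the rule in the commit message.

    If the user did not set max_concurrency explicitly, fall back to the
    deployment's max_ongoing_requests; an explicit value is left untouched.
    """
    options = dict(ray_actor_options or {})  # copy, do not mutate the input
    options.setdefault("max_concurrency", max_ongoing_requests)
    return options


def effective_request_limit(max_ongoing_requests: int, max_concurrency: int) -> int:
    """Hypothetical model: a replica cannot run more requests at once than
    the smaller of the two limits."""
    return min(max_ongoing_requests, max_concurrency)


# Before the change: the actor default silently caps the deployment setting.
before = effective_request_limit(2000, DEFAULT_ASYNC_ACTOR_MAX_CONCURRENCY)

# After the change: max_concurrency follows max_ongoing_requests.
opts = resolve_replica_actor_options(2000)
after = effective_request_limit(2000, opts["max_concurrency"])

print(before, after)
```

An explicitly configured `max_concurrency` still wins in this sketch, which matches the "if the concurrency is not explicitly set" wording of the commit message.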
What happened + What you expected to happen
The `max_ongoing_requests` parameter in `@serve.deployment` has no effect when it is larger than 1000.
I expected `max_ongoing_requests` to take effect even when it is larger than 1000.
Versions / Dependencies
Reproduction script
Reproducible Script:
Expected Output:
Actual Output:
Issue Severity
Low: It annoys or frustrates me.