Request Cancellation #4818
Comments
Hi @sdpmas, currently there is no way to cancel a request once it has been scheduled or placed in the queue on the server side. Once the request is scheduled, it is up to the backend to do the right thing. You may be able to use custom logic to decide whether a request should return an empty response instead of spending more time computing a result, but that would require custom backend logic and metadata on your end. See these similar issues:
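As a rough illustration of the kind of custom backend logic described above, here is a minimal sketch in plain Python. It is not Triton's API: `CancellationRegistry`, `generate`, and `next_token` are hypothetical names, and it assumes the client can pass a request ID as metadata and that the backend can check a shared flag between generation steps.

```python
import threading


class CancellationRegistry:
    """Thread-safe set of request IDs the client has asked to cancel (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cancelled = set()

    def cancel(self, request_id):
        # Called from whatever side channel receives the client's cancel signal.
        with self._lock:
            self._cancelled.add(request_id)

    def is_cancelled(self, request_id):
        with self._lock:
            return request_id in self._cancelled

    def discard(self, request_id):
        # Forget the ID once the request has finished, so the set does not grow forever.
        with self._lock:
            self._cancelled.discard(request_id)


def generate(request_id, prompt, max_tokens, registry, next_token):
    """Produce up to max_tokens tokens; return an empty result if the request is cancelled.

    next_token is a stand-in for one decoding step of the model (an assumption).
    """
    tokens = []
    try:
        for _ in range(max_tokens):
            # Check the flag before each step so no further compute is wasted.
            if registry.is_cancelled(request_id):
                return []
            tokens.append(next_token(prompt, tokens))
        return tokens
    finally:
        registry.discard(request_id)
```

The check happens once per generation step, so a cancelled request stops after at most one more step rather than running to completion. How the cancel signal reaches the backend (a second model, a shared store, request parameters) is left open here.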
Closing issue due to lack of activity. Please let us know if you need this issue reopened for follow-up.
Hi, |
@mkhludnev Yes. That documentation is more recent. That feature was introduced in 23.10 (i.e., in the release around October 2023).
Is your feature request related to a problem? Please describe.
Is there a way to cancel a request? The problem comes up often when we are processing long requests and the user cancels mid-way. For example, during text completion with a GPT-like model, the user sometimes decides they no longer want the completion.
Describe the solution you'd like
It'd be cool if we could communicate with the Triton server to cancel requests whenever we detect that the response is of no use.