Request Cancellation #4818

Closed
sdpmas opened this issue Aug 28, 2022 · 4 comments
Labels
question Further information is requested

Comments

@sdpmas

sdpmas commented Aug 28, 2022

Is your feature request related to a problem? Please describe.
Is there a way to cancel a request? The problem comes up a lot when we are processing long requests and the user cancels mid-way. For example, while doing text completion with a GPT-like model, the user sometimes no longer wants the completion.

Describe the solution you'd like
It'd be cool if we could communicate with the Triton server to cancel requests whenever we detect that the response is of no use.

@rmccorm4
Contributor

Hi @sdpmas ,

Currently, there is no way to cancel a request once it has been scheduled/placed in the queue on the server side. Once the request is scheduled, it's up to the backend to do the right thing.

You may be able to use custom logic to decide whether a request should return an empty response early instead of spending more time computing it. This would require custom backend logic and request metadata on your end.
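
A minimal sketch of the kind of custom logic described above, assuming a hypothetical generation loop that checks a per-request cancellation flag between steps. The names `cancel_flags`, `register`, `cancel`, and `generate` are illustrative and not part of any Triton API; how the flag gets set (for example, by a second request that carries the request ID as metadata) is left to your deployment.

```python
import threading

# Hypothetical registry mapping request IDs to cancellation flags.
# Nothing here is a Triton API; it only illustrates the pattern.
cancel_flags = {}
_lock = threading.Lock()


def register(request_id):
    """Create (or fetch) the cancellation flag for a request."""
    with _lock:
        return cancel_flags.setdefault(request_id, threading.Event())


def cancel(request_id):
    """Mark a request as cancelled. Returns False if the ID is unknown."""
    with _lock:
        flag = cancel_flags.get(request_id)
    if flag is None:
        return False
    flag.set()
    return True


def generate(request_id, prompt, max_tokens, step):
    """Run up to max_tokens steps, checking the flag before each one.

    `step(prompt, tokens)` stands in for one unit of model work.
    Returns an empty list if the request was cancelled along the way.
    """
    flag = register(request_id)
    tokens = []
    try:
        for _ in range(max_tokens):
            if flag.is_set():
                return []  # empty response: stop wasting compute
            tokens.append(step(prompt, tokens))
        return tokens
    finally:
        # Drop the flag so the registry does not grow without bound.
        with _lock:
            cancel_flags.pop(request_id, None)
```

The check happens only between steps, so a cancellation takes effect at the next step boundary rather than immediately.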

See these similar issues:

@rmccorm4 rmccorm4 added the question Further information is requested label Aug 29, 2022
@dyastremsky
Contributor

Closing issue due to lack of activity. Please let us know if you need this issue reopened for follow-up.

@mkhludnev

Hi,
Does https://github.com/triton-inference-server/client#request-cancellation mean Triton is able to cancel a request, despite what's said here?

@dyastremsky
Contributor

@mkhludnev Yes. That documentation is more recent. That feature was introduced in 23.10 (i.e. in the release around October 2023).
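
For reference, the linked client README describes cancelling an in-flight request from the Python gRPC client by calling `cancel()` on the object returned by `async_infer()`. Below is a minimal sketch under that assumption: `client` is expected to be a `tritonclient.grpc.InferenceServerClient` (or anything with the same `async_infer` shape), while the helper name `infer_with_deadline` and the timeout policy are hypothetical and only for illustration.

```python
import threading


def infer_with_deadline(client, model_name, inputs, timeout_s):
    """Send one async request and cancel it if no response arrives in time.

    Returns (result, error). Both are None when this helper cancelled
    the request itself.
    """
    done = threading.Event()
    outcome = {}

    # The gRPC client invokes the callback with `result` and `error`.
    def on_complete(result, error):
        outcome["result"] = result
        outcome["error"] = error
        done.set()

    # Assumption: async_infer returns a context object exposing cancel().
    ctx = client.async_infer(
        model_name=model_name, inputs=inputs, callback=on_complete
    )

    if not done.wait(timeout_s):
        # No response within the deadline: ask the server to cancel.
        # The callback may still fire later with an error; it is ignored here.
        ctx.cancel()
        return None, None
    return outcome["result"], outcome["error"]
```

This only issues the cancellation from the client side. It needs a server at 23.10 or later, and whether work already in progress actually stops presumably still depends on how the backend handles cancellation.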
