Set timeout on a function #300
Comments
There are several considerations here, and I'm open to feedback from the team and anyone else who is interested.

Our current implementation for function invocation relies on a small web service in each function container that accepts calls from the FaaS to invoke a function. This means that for the life of the pod/container, a process is running that listens for HTTP requests and invokes the desired function in response to those requests. This presents a problem for terminating functions that may have opened resources which need to be released for a clean termination. I see two potential ways to address this problem.

The first relies on function code that 'listens' for interrupts, which our base image implementations trigger after the desired timeout. The downside is that users must explicitly code their functions to listen for these interrupts, clean up their resources and send an error. That is a lot of overhead for functions that should be focused on their business logic.

The other way to make sure that resources are cleaned up in the event of a timeout would be to kill the whole process. That could mean killing and restarting the web service, or even killing and restarting the pod. The problem with this approach is that we need to guarantee that the FaaS implementation will never invoke the same function more than once concurrently on the same pod, or we will end up terminating an invocation unrelated to the one that hit the timeout.

One final, unsavory solution could be to ignore resource cleanup. If a function is run many times and hits the timeout on most of those runs without cleaning up resources, the pod could become slow and unreliable. Eventually the healthz checks will fail and the pod will be terminated and restarted. If we combine this with a regular pruning check for all the pods, we might be able to get away with sticking our heads in the sand on this issue.

How does everyone feel about these options?
I'd suggest we pass a timeout (or, as found in Go's context, a deadline, which is an absolute timestamp) in the context and expect the function to respect it. In function-server (in the base-images), we can watch whether the timeouts are being respected and set the pod's health status accordingly. Unhealthy instances will get replaced.
Let's use timeouts in our CLI and API, and deadlines, internally, in the function context. See dispatchframework/java-base-image#14 (comment)
This work has been completed in all of the base images and within Dispatch itself. Timeouts can now be set at function creation.
Detailed Description
Sometimes a function run might hang, so it would be nice to be able to set a timeout in the function definition and have any run that exceeds it terminated.