document architectural options for Invoker deployment #110

dgrove-oss · 2017-12-05T21:38:34Z

This issue is to document options for deploying the invoker subsystem for OpenWhisk. The topic has been discussed in various venues before, most recently in a review of #107 by @stigsb.

The key choice to make when deploying invokers is which implementation of the ContainerFactoryProvider SPI to use. There are currently two approaches being used by downstream consumers of this project:

DockerContainerFactory

In this approach, the Kubernetes scheduler is used only to deploy the OpenWhisk "control plane". All of the user action containers are created, managed, and destroyed by the invoker using docker on the Kubernetes worker node. For this approach to work well, it is essential that there is exactly one invoker pod on each worker node that is intended for user function execution. Using a DaemonSet for the invokers is a natural fit, since the nodes intended for the invoker to use will be fairly static and can be labeled accordingly. Capacity is added to or removed from the system by adding or removing worker nodes in the cluster and/or adding or removing the invoker label on the worker nodes.
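The DaemonSet arrangement described above can be sketched as a manifest built in Python. This is a minimal illustration, not the project's actual deployment file: the label `openwhisk-role: invoker`, the pod label `name: invoker`, and the image name are assumptions chosen for the example, and everything a real invoker pod needs beyond the node selector (such as access to the node's docker daemon) is omitted.

```python
import json

# Hypothetical node label; the issue only says the nodes "can be labeled accordingly".
INVOKER_NODE_LABEL = {"openwhisk-role": "invoker"}


def invoker_daemonset(image, node_label=INVOKER_NODE_LABEL):
    """Build a DaemonSet manifest that runs one invoker pod per labeled node."""
    pod_labels = {"name": "invoker"}  # hypothetical pod label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "invoker"},
        "spec": {
            "selector": {"matchLabels": pod_labels},
            "template": {
                "metadata": {"labels": pod_labels},
                "spec": {
                    # Only nodes carrying the label receive an invoker pod.
                    "nodeSelector": dict(node_label),
                    "containers": [{"name": "invoker", "image": image}],
                },
            },
        },
    }


def invoker_count(nodes, node_label=INVOKER_NODE_LABEL):
    """Count the nodes that match the selector, i.e. the invoker capacity."""
    return sum(
        1
        for labels in nodes.values()
        if all(labels.get(key) == value for key, value in node_label.items())
    )


if __name__ == "__main__":
    # The image name is a placeholder, not a published image.
    print(json.dumps(invoker_daemonset("example/invoker:latest"), indent=2))
```

With this shape, `invoker_count` changes only when a node gains or loses the label or joins or leaves the cluster, which is the capacity model the paragraph describes.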

This approach has the advantage of supporting low latency suspend/resume operations, but gives up some of the advantages of running on Kubernetes because it keeps the Kubernetes scheduler in the dark and forces a relatively static allocation of worker nodes to OpenWhisk invokers.

KubernetesContainerFactory

In this approach, the Kubernetes scheduler is used for all container operations: both control plane and user containers are created, managed, and destroyed by Kubernetes. The number of invoker pods will most likely be much smaller than the number of worker nodes in the cluster. Furthermore, some form of autoscaling could likely be applied to dynamically vary the number of invokers to match system load (although #84 is needed to really make autoscaling work well).

This approach allows better sharing of compute resources between OpenWhisk and other uses of the Kubernetes cluster. However, the current KubernetesContainer (https://github.com/projectodd/incubator-openwhisk/blob/d2eb77aac212fb9970f3c9f914bf5863dcbefe50/core/invoker/src/main/scala/whisk/core/containerpool/kubernetes/KubernetesContainer.scala#L105 and https://github.com/projectodd/incubator-openwhisk/blob/d2eb77aac212fb9970f3c9f914bf5863dcbefe50/core/invoker/src/main/scala/whisk/core/containerpool/kubernetes/KubernetesContainer.scala#L108) does not actually implement the suspend/resume actions, so it cannot be used if suspension of warm containers is a deployment requirement.
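The trade-off between the two factories can be summarized as a small decision table. The sketch below only encodes the properties stated in this issue at the time of writing; the type `FactoryTraits`, its field names, and the function `candidate_factories` are hypothetical and do not exist in OpenWhisk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactoryTraits:
    """Properties of a container factory as described in this issue."""
    name: str
    supports_suspend_resume: bool
    scheduler_sees_user_containers: bool


# Values taken from the two descriptions above, not from the code base.
FACTORIES = [
    FactoryTraits("DockerContainerFactory", True, False),
    FactoryTraits("KubernetesContainerFactory", False, True),
]


def candidate_factories(require_suspend, require_shared_scheduling):
    """Return the names of the factories that satisfy both requirements."""
    return [
        f.name
        for f in FACTORIES
        if (f.supports_suspend_resume or not require_suspend)
        and (f.scheduler_sees_user_containers or not require_shared_scheduling)
    ]


if __name__ == "__main__":
    print(candidate_factories(require_suspend=True, require_shared_scheduling=False))
    print(candidate_factories(require_suspend=True, require_shared_scheduling=True))
```

Requiring both suspend/resume and scheduler-visible user containers yields an empty list, which is the gap this issue documents.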

timboldt · 2017-12-06T21:25:17Z

DockerContainerFactory has the additional weakness that it complicates the security configuration, e.g. consider how you would implement a Calico policy to prevent lambda containers from accessing the control plane.

timboldt · 2017-12-06T21:40:01Z

As for suspend/resume, it seems unlikely that it will ever be implemented in K8S or Mesos, for reasonably good reasons. Perhaps there is a better way? For example, an expensive container like Java could be pre-warmed in a generic state. Or, if there are artifacts to compile, do it beforehand and save the intermediate results? Or, crazy idea, maybe support process hibernation like https://criu.org/. (I guess suspend/resume needs its own topic.)

tysonnorris · 2017-12-07T00:51:08Z

@timboldt It may be possible to use a custom executor with the mesos framework, such that the framework supports a pause/resume message being sent to the executor for a particular task. In our case, we haven't really considered this yet, so it may also be in the crazy idea category.

One other aspect of ContainerFactory that we anticipate (in the future) is support for heterogeneous clusters, where actions that require particular types of resources are scheduled only to hosts that meet those requirements, e.g. GPU.
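As a rough illustration of that idea, host selection for a heterogeneous cluster could be a filter over advertised resources. This is a sketch under stated assumptions: the function `eligible_hosts`, the resource names `memory_mb` and `gpu`, and the host names are all made up for the example and are not part of any ContainerFactory.

```python
def eligible_hosts(hosts, required):
    """Return the sorted names of hosts whose resources cover every requirement.

    hosts:    mapping of host name -> mapping of resource name -> amount available
    required: mapping of resource name -> amount the action needs
    """
    return sorted(
        name
        for name, available in hosts.items()
        # A resource the host does not advertise counts as zero.
        if all(available.get(res, 0) >= amount for res, amount in required.items())
    )


if __name__ == "__main__":
    hosts = {
        "cpu-1": {"memory_mb": 4096},
        "gpu-1": {"memory_mb": 8192, "gpu": 1},
    }
    print(eligible_hosts(hosts, {"gpu": 1}))
    print(eligible_hosts(hosts, {"memory_mb": 256}))
```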

dgrove-oss changed the title ~~architectural options for Invoker deployment~~ document architectural options for Invoker deployment Jan 4, 2018

dgrove-oss self-assigned this Feb 9, 2018

sss0350 mentioned this issue Feb 12, 2018

Does a function container running within the same POD of iron-function engine on Kubernetes? iron-io/functions#670

Open

dgrove-oss mentioned this issue Mar 16, 2018

config files to use kubernetes container pool and invoker-agent #155

Merged

rabbah closed this as completed in #155 Mar 28, 2018
