Skip to content

Commit

Permalink
sidecar: Add clarifications from Istio devs
Browse files Browse the repository at this point in the history
In the previous PR, istio devs commented that some things were not
accurate. This commit just updates the text to (hopefully) correctly
reflect it now.

Removed the paragraph about this removing the need for an initContainer
due to comment here:
	kubernetes#1913 (comment)

I thought it was an okay to insert the iptables rules within the sidecar
proxy container, but it is not okay as that requires more permissions
(capabilities) on the sidecar proxy container which is not considered
accetable by Istio devs.

Signed-off-by: Rodrigo Campos <[email protected]>
  • Loading branch information
rata committed Sep 10, 2020
1 parent 58590b7 commit 0c5b731
Showing 1 changed file with 21 additions and 32 deletions.
53 changes: 21 additions & 32 deletions keps/sig-node/0753-sidecarcontainers.md
Original file line number Diff line number Diff line change
Expand Up @@ -275,23 +275,18 @@ Linkerd and Istio), need to do several hacks to have the basic
functionality. These are explained in detail in the
[alternatives](#alternatives) section. Nonetheless, here is a quick highlight
of some of the things some service mesh currently need to do:

* Recommend users to delay starting their apps by using a script to wait for
the service mesh to be ready. The goal of a service mesh to augment the app
functionality without modifying it, that goal is lost in this case.
* To guarantee that traffic goes via the services mesh, an initContainer is
added to blackhole traffic until the service mesh containers are up. This way,
other containers that might be started before the service mesh container can't
use the network until the service mesh container is started and ready. A side
effect is that traffic is blackholed until the service mesh is up and in a
ready state.
* If they don't delay starting their application, the network connection they
try to establish are blacklisted until the service mesh container is up.
* Use preStop hooks with a "sleep infinity" to make sure the service mesh
doesn't terminate before other containers that might be serving requests.

The auto-inject of initContainer [has caused bugs][linkerd-bug-init-last], as it
competes with other tools auto-injecting a container to be run last too.

[linkerd-bug-init-last]: https://github.com/linkerd/linkerd2/issues/4758#issuecomment-658457737
This KEP adds guarantees to startup/shutdown behavior, so _those_ problems will
be solved for service mesh. However, service mesh do have other problems that
are out of scope for this KEP, e.g. enable service mesh before initContainers
are started.

### Problems: Coupling infrastructure with applications

Expand All @@ -308,9 +303,6 @@ used to completely remove the need for such an initContainer. But this
alternative requires that nodes have the CNI plugin installed, effectively
coupling the service mesh app with the infrastructure.

This KEP removes the need for a service mesh to use either an initContainer or a
CNI plugin: just guarantee that the sidecar container can be started first.

While in this specific example the CNI plugin has some benefits too (removes the
need for some capabilities in the pod) and might be worth pursuing, it is used
here as an example to show thee possibility of coupling apps with
Expand Down Expand Up @@ -779,11 +771,10 @@ startup, as explained in the next section.

#### Service mesh or metrics sidecars

Let app container be the main app that just has the service mesh extra container
in the pod.

Service mesh, today, have to do the following workarounds due to lack of startup
ordering:
Service mesh, today, have to do the following workarounds due to lack of
startup/shutdown ordering:
* Recommend users to delay starting their apps by using a script to wait for
the service mesh to be ready.
* Blackhole all the traffic until service mesh container is up (usually using
an initContainer for this)
* Some trickery (sleep preStop hooks or some alternative) to not be killed
Expand All @@ -792,28 +783,26 @@ ordering:

This means that if the app container is started before the service mesh is
started and ready, all traffic will be blackholed and the app needs to retry.
Once the service mesh container is ready, traffic will be allowed.
Once the service mesh container is ready, traffic will be allowed. A similar
problem happens for shutdown: if the service mesh container is killed first, the
network is down for the rest of the containers in the pod.

This has another major disadvantage: several apps crash if traffic is blackholed
during startup (common in some rails middleware, for example) and have to resort
to some kind of workaround, like [this one][linkerd-wait] to wait. This makes
also service mesh miss their goal of augmenting containers functionality without
modifying the main application.

Istio has an alternative to the initContainer hack. Istio [has an
option][istio-cni-opt] to integrate with CNI and inject the blackhole from there
instead of using the initContainer. In that case, it will do (just c&p from the
link, in case it breaks in the future):

> By default Istio injects an initContainer, istio-init, in pods deployed in the mesh. The istio-init container sets up the pod network traffic redirection to/from the Istio sidecar proxy. This requires the user or service-account deploying pods to the mesh to have sufficient Kubernetes RBAC permissions to deploy containers with the NET_ADMIN and NET_RAW capabilities. Requiring Istio users to have elevated Kubernetes RBAC permissions is problematic for some organizations’ security compliance
> ...
> The Istio CNI plugin performs the Istio mesh pod traffic redirection in the Kubernetes pod lifecycle’s network setup phase, thereby removing the requirement for the NET_ADMIN and NET_RAW capabilities for users deploying pods into the Istio mesh. The Istio CNI plugin replaces the functionality provided by the istio-init container.
This KEP addresses these 3 problems just listed when initContainer are not used
by the application. If initContainers are used, the first and the last problem
are solved only. In other words, traffic might still be blackholed for
initContainers that run after the service mesh iptables rules are inserted.

In other words, Istio has an alternative to configure the traffic blockhole
without an initContainer. But the other problems and hacks mentioned remain,
though.
Such rules are usually inserted as an initContainer (trying to run last, to
avoid blackholing traffic to other initContainers) or alternatively, in the case
of Istio, using a [CNI plugin][istio-cni-opt]. When using the CNI plugin all
traffic from initContainers will be blackholed.

[linkerd-last-container]: https://github.com/linkerd/linkerd2/issues/4758#issuecomment-658457737
[istio-cni-opt]: https://istio.io/latest/docs/setup/additional-setup/cni/
[linkerd-wait]: https://github.com/olix0r/linkerd-await

Expand Down

0 comments on commit 0c5b731

Please sign in to comment.