-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explore how to enable Istio sidecar on charm pods #360
Comments
Thank you for reporting us your feedback! The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5166.
|
@kimwnasptd suggested doing something similar to observability does here with resource limits. iiuc, in that case each charm that wanted an istio sidecar would modify its own Juju-created StatefulSet to add the istio sidecar annotation, which would trigger a redeploy of the underlying charm(/workload) pod. Some outstanding questions on this are:
edit: by annotations, I think we mean labels |
What about other possible solutions? Maybe:
|
Tried out option (1) (the "charm adds the istio label to its own statefulset" solution) using this charm: charm.py#!/usr/bin/env python3
import logging
import time
from typing import List, Optional
import yaml
from lightkube import Client
from lightkube.resources.apps_v1 import StatefulSet
from ops.charm import CharmBase
from ops.main import main
from ops.model import ActiveStatus
logger = logging.getLogger(__name__)
class IstioSidecarInjectorLabelTest(CharmBase):
"""Charm for testing whether a charm can self-inject an istio sidecar by changing its labels."""
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
logger.info("IstioSidecarInjectorLabelTest.__init__")
self.framework.observe(self.on.config_changed, self._echo_before_event)
self.framework.observe(self.on.config_changed, self._add_istio_injection_label_to_statefulset)
self.framework.observe(self.on.config_changed, self._echo_after_event)
def _echo_before_event(self, event):
"""Echoes before the event is handled."""
logger.info("IstioSidecarInjectorLabelTest._echo_before_event")
logger.info(f"Event: {event}")
def _echo_after_event(self, event):
"""Echoes when an event is handled."""
logger.info("IstioSidecarInjectorLabelTest._echo_after_event")
logger.info(f"Event: {event}")
def _add_istio_injection_label_to_statefulset(self, event):
"""Adds the istio-injection=true label to the statefulset for this charm."""
logger.info("IstioSidecarInjectorLabelTest._add_istio_injection_label_to_statefulset")
sleeptime = 10
logger.info(f"Sleeping {sleeptime} seconds")
time.sleep(sleeptime)
logger.info("Sleep complete")
lightkube_client = Client()
patch = {
"spec": {
"template": {
"metadata": {"labels": {"sidecar.istio.io/inject": "true"}}
}
}
}
logger.info("patching statefulset")
lightkube_client.patch(
StatefulSet, name=self.model.app.name, namespace=self.model.name, obj=patch
)
logger.info("patched statefulset")
sleeptime = 10
logger.info(f"Sleeping {sleeptime} seconds")
time.sleep(sleeptime)
logger.info("Sleep complete")
if __name__ == "__main__":
main(IstioSidecarInjectorLabelTest) On
Deploying it (
So this does appear to work |
Is there a race condition with the above solution? Since doing A hacky solution could be that the istio-label code above also checks to see if the new pod has an istio sidecar, but that feels ugly. And even if we spot the sidecar is missing, what do we do? At best we can defer and hope some other event triggers us again soon |
Yes, after testing this the race condition does exist - if the label-injection occurs before istio has created its webhooks, the charm pods will be restarted but with the istio label but will not receive the istio sidecar |
Testing out option (4) (manual sidecar injection): Taking this sleep.yamlapiVersion: v1
kind: ServiceAccount
metadata:
name: sleep
---
apiVersion: v1
kind: Service
metadata:
name: sleep
labels:
app: sleep
spec:
ports:
- port: 80
name: http
selector:
app: sleep
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: sleep
spec:
replicas: 1
selector:
matchLabels:
app: sleep
serviceName: sleep
template:
metadata:
labels:
app: sleep
spec:
terminationGracePeriodSeconds: 0
serviceAccountName: sleep
containers:
- command:
- /bin/sleep
- infinity
image: curlimages/curl
imagePullPolicy: IfNotPresent
name: sleep
volumeMounts:
- mountPath: /etc/sleep/tls
name: secret-volume
volumes:
- name: secret-volume
secret:
secretName: sleep-secret
optional: true and running that through sleep_with_istio_injected.yamlapiVersion: v1
kind: ServiceAccount
metadata:
name: sleep
---
apiVersion: v1
kind: Service
metadata:
name: sleep
labels:
app: sleep
spec:
ports:
- port: 80
name: http
selector:
app: sleep
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
creationTimestamp: null
name: sleep
spec:
replicas: 1
selector:
matchLabels:
app: sleep
serviceName: sleep
template:
metadata:
annotations:
kubectl.kubernetes.io/default-container: sleep
kubectl.kubernetes.io/default-logs-container: sleep
prometheus.io/path: /stats/prometheus
prometheus.io/port: "15020"
prometheus.io/scrape: "true"
sidecar.istio.io/status: '{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-envoy","istio-data","istio-podinfo","istio-token","istiod-ca-cert"],"imagePullSecrets":null,"revision":"default"}'
creationTimestamp: null
labels:
app: sleep
security.istio.io/tlsMode: istio
service.istio.io/canonical-name: sleep
service.istio.io/canonical-revision: latest
spec:
containers:
- command:
- /bin/sleep
- infinity
image: curlimages/curl
imagePullPolicy: IfNotPresent
name: sleep
resources: {}
volumeMounts:
- mountPath: /etc/sleep/tls
name: secret-volume
- args:
- proxy
- sidecar
- --domain
- $(POD_NAMESPACE).svc.cluster.local
- --proxyLogLevel=warning
- --proxyComponentLogLevel=misc:error
- --log_output_level=default:info
- --concurrency
- "2"
env:
- name: JWT_POLICY
value: third-party-jwt
- name: PILOT_CERT_PROVIDER
value: istiod
- name: CA_ADDR
value: istiod.kubeflow.svc:15012
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: INSTANCE_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.hostIP
- name: PROXY_CONFIG
value: |
{"discoveryAddress":"istiod.kubeflow.svc:15012","tracing":{"zipkin":{"address":"zipkin.kubeflow:9411"}}}
- name: ISTIO_META_POD_PORTS
value: |-
[
]
- name: ISTIO_META_APP_CONTAINERS
value: sleep
- name: ISTIO_META_CLUSTER_ID
value: Kubernetes
- name: ISTIO_META_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: ISTIO_META_INTERCEPTION_MODE
value: REDIRECT
- name: ISTIO_META_MESH_ID
value: cluster.local
- name: TRUST_DOMAIN
value: cluster.local
image: docker.io/istio/proxyv2:1.17.3
name: istio-proxy
ports:
- containerPort: 15090
name: http-envoy-prom
protocol: TCP
readinessProbe:
failureThreshold: 30
httpGet:
path: /healthz/ready
port: 15021
initialDelaySeconds: 1
periodSeconds: 2
timeoutSeconds: 3
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
privileged: false
readOnlyRootFilesystem: true
runAsGroup: 1337
runAsNonRoot: true
runAsUser: 1337
volumeMounts:
- mountPath: /var/run/secrets/workload-spiffe-uds
name: workload-socket
- mountPath: /var/run/secrets/credential-uds
name: credential-socket
- mountPath: /var/run/secrets/workload-spiffe-credentials
name: workload-certs
- mountPath: /var/run/secrets/istio
name: istiod-ca-cert
- mountPath: /var/lib/istio/data
name: istio-data
- mountPath: /etc/istio/proxy
name: istio-envoy
- mountPath: /var/run/secrets/tokens
name: istio-token
- mountPath: /etc/istio/pod
name: istio-podinfo
initContainers:
- args:
- istio-iptables
- -p
- "15001"
- -z
- "15006"
- -u
- "1337"
- -m
- REDIRECT
- -i
- '*'
- -x
- ""
- -b
- '*'
- -d
- 15090,15021,15020
- --log_output_level=default:info
image: docker.io/istio/proxyv2:1.17.3
name: istio-init
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_ADMIN
- NET_RAW
drop:
- ALL
privileged: false
readOnlyRootFilesystem: false
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
serviceAccountName: sleep
terminationGracePeriodSeconds: 0
volumes:
- name: workload-socket
- name: credential-socket
- name: workload-certs
- emptyDir:
medium: Memory
name: istio-envoy
- emptyDir: {}
name: istio-data
- downwardAPI:
items:
- fieldRef:
fieldPath: metadata.labels
path: labels
- fieldRef:
fieldPath: metadata.annotations
path: annotations
name: istio-podinfo
- name: istio-token
projected:
sources:
- serviceAccountToken:
audience: istio-ca
expirationSeconds: 43200
path: istio-token
- configMap:
name: istio-ca-root-cert
name: istiod-ca-cert
- name: secret-volume
secret:
optional: true
secretName: sleep-secret
updateStrategy: {}
status:
replicas: 0 where the differences are essentially:
So to manually inject the sidecar as proposed, we'd need to replicate all of the above differences inside the charm. Addressing each separately:
Overall, this might be doable, but it does sound complicated. And this has a similar race condition as solution (1). If we haven't deployed istio yet, how do we get the configurations for the istio sidecar? Maybe they're deterministic and we can specify them statically, but then do we bake all that configuration into the code? If we do know the configurations, we'd still have to test what happens when a sidecar starts but the istiod it depends on is not available - does it gracefully handle this until istio is alive? |
Looking at option (3) (use namespace-level injections by labelling the namespace with Implementation of this option is much easier for most charms. By annotating the kubeflow namespace for istio injection, we no longer need to change any of the charms themselves. We do, however, face a similar race condition to option (1). If we do: juju add-model kubeflow
kubectl label namespace kubeflow istio-injection=enabled --overwrite
juju deploy kubeflow the kubeflow namespace will have the correct istio labels, but until the istio-pilot charm has deployed istio's webhooks the pods created will not get istio sidecars. So if you have the charms deploy in this order, you might see:
So we still have the issue where we need istio deployed before everything else. Another issue with this solution is that we could accidentally give juju's model operator an istio sidecar. iiuc this by itself shouldn't hurt the model operator alone, but if we also had a |
Given the race conditions affecting (1) and (3), I wonder if it would make sense to unbundle istio from the rest of Kubeflow. This is a large change, but would have benefits. Maybe we have:
Then the kubeflow bundle would include juju add-model istio-system
juju deploy istio # (the daemon)
# wait
juju add-model kubeflow
juju deploy kubeflow # (includes istio-integrator) |
Notes from an internal meeting here Summary is that we have some design decisions left to make - will report back here when we decide them |
This is also affected by whether we are using the istio CNI plugin or not. For example, looking at the output for |
Context
To support improved network isolation of Charmed Kubeflow components, investigate how we can attach an istio sidecar to charm pods.
What needs to get done
Definition of Done
The text was updated successfully, but these errors were encountered: