diff --git a/content/en/docs/ops/ambient/usage/index.md b/content/en/docs/ops/ambient/usage/_index.md
similarity index 60%
rename from content/en/docs/ops/ambient/usage/index.md
rename to content/en/docs/ops/ambient/usage/_index.md
index 7d8212a8d4c4b..5515475f0b10c 100644
--- a/content/en/docs/ops/ambient/usage/index.md
+++ b/content/en/docs/ops/ambient/usage/_index.md
@@ -1,9 +1,8 @@
 ---
-title: Ambient Mesh Usage Guide
+title: Ambient Mesh User Guides
 description: How to use ambient mesh.
 weight: 2
 owner: istio/wg-networking-maintainers
 test: n/a
 ---
-
-This page is under construction.
+This page is under construction.
diff --git a/content/en/docs/ops/ambient/usage/waypoint/index.md b/content/en/docs/ops/ambient/usage/waypoint/index.md
new file mode 100644
index 0000000000000..043e526b07bbd
--- /dev/null
+++ b/content/en/docs/ops/ambient/usage/waypoint/index.md
@@ -0,0 +1,8 @@
+---
+title: Waypoint and L7 networking
+description: User guide for Istio Ambient L7 services using Waypoint proxy.
+weight: 2
+owner: istio/wg-networking-maintainers
+test: n/a
+---
+
diff --git a/content/en/docs/ops/ambient/usage/ztunnel/hbone-packet.png b/content/en/docs/ops/ambient/usage/ztunnel/hbone-packet.png
new file mode 100644
index 0000000000000..866a0f74508d5
Binary files /dev/null and b/content/en/docs/ops/ambient/usage/ztunnel/hbone-packet.png differ
diff --git a/content/en/docs/ops/ambient/usage/ztunnel/index.md b/content/en/docs/ops/ambient/usage/ztunnel/index.md
new file mode 100644
index 0000000000000..48eb9633ff85c
--- /dev/null
+++ b/content/en/docs/ops/ambient/usage/ztunnel/index.md
@@ -0,0 +1,352 @@
+---
+title: Ztunnel and L4 Networking
+description: User guide for Istio Ambient L4 networking and mTLS using ztunnel proxy.
+weight: 2
+owner: istio/wg-networking-maintainers
+test: n/a
+---
+
+{{< warning >}}
+Ambient is currently in [alpha status](/docs/releases/feature-stages/#feature-phase-definitions).
+
+Please **do not run ambient in production** and be sure to thoroughly review the [feature phase definitions](/docs/releases/feature-stages/#feature-phase-definitions) before use.
+In particular, there are known performance, stability, and security issues in the `alpha` release.
+There are also functional caveats, some of which are listed in the [Caveats section](#caveats) of this guide.
+There are also planned breaking changes, including some that will prevent upgrades.
+These are all limitations that will be addressed before graduation to `beta`.
+The current version of this guide is meant to assist early deployments and testing of the alpha version of `ambient`. The guide will be updated as `ambient` itself evolves from alpha to beta status and beyond.
+{{< /warning >}}
+
+## Introduction
+
+This guide describes the functionality and usage of the ztunnel proxy and Layer-4 networking functions in Istio ambient mesh, using a sample user journey to illustrate these functions.
+The ztunnel (Zero Trust Tunnel) component is a purpose-built per-node proxy for Istio `ambient` mesh. Since workload pods no longer require sidecar proxies in order to participate in the mesh, Istio in `ambient` mode is also informally referred to as "sidecarless" mesh.
+
+{{< tip >}}
+Pods and workloads using sidecar proxies can co-exist and interoperate, within the same mesh, with pods that operate in `ambient` mode and rely on the node-level ztunnel proxies. The term `ambient` mesh refers to an Istio mesh that has a superset of the capabilities and hence can support mesh pods that use either type of proxying.
+{{< /tip >}}
+
+The ztunnel node proxy is responsible for securely connecting and authenticating workloads within the `ambient` mesh.
 The ztunnel proxy is written in Rust and is intentionally scoped to handle L3 and L4 functions in the `ambient` mesh, such as mTLS, authentication, L4 authorization and telemetry. Ztunnel does not terminate workload HTTP traffic or parse workload HTTP headers. The ztunnel ensures L3 and L4 traffic is efficiently and securely transported to the waypoint proxies, where the full suite of Istio's L7 functionality, such as HTTP telemetry and load balancing, is implemented. The term "Secure Overlay Networking" is also used informally to collectively describe the set of L4 networking functions implemented in an ambient mesh via the ztunnel proxy.
+
+It is expected that some production use cases of Istio in ambient mode may be addressed solely via the L4 secure overlay networking features, whereas other use cases will additionally need advanced traffic management and L7 networking features, for which additional waypoint proxies will need to be deployed. This is summarized in the following table. This guide focuses on functionality related to the baseline L4 and mTLS networking using ztunnel proxies, and refers to L7 only where needed to describe an L4 ztunnel function. Other guides are dedicated to the advanced L7 networking functions and the use of waypoint proxies.
+
+| Application Deployment Use Case | Istio Ambient Mesh Configuration |
+| ------------- | ------------- |
+| Zero Trust networking via mutual-TLS, encrypted and tunneled data transport of client application traffic, L4 authorization, L4 telemetry | Baseline ambient mesh with ztunnel proxy networking |
+| Application requires L4 mutual-TLS plus advanced Istio traffic management features (incl. VirtualService, L7 telemetry, L7 authorization) | Full Istio ambient mesh configuration with both ztunnel proxy and waypoint proxy based networking |
+
+## Current Caveats {#caveats}
+
+Ztunnel proxies are automatically installed when one of the supported installation methods is used to install Istio `ambient` mesh. The minimum Istio version required for Istio `ambient` mode is 1.18.0. In general, Istio in ambient mode supports the existing Istio APIs that are supported in sidecar proxy mode. Since the ambient functionality is currently at an alpha release level, the following is a list of feature restrictions or caveats in the current release of Istio's ambient functionality (as of the 1.19.0 release). These restrictions are expected to be addressed or removed in future software releases as ambient graduates to beta and eventually GA level of maturity.
+
+* Kubernetes (K8s) only: Istio in ambient mode is currently only supported for deployment on Kubernetes clusters. Deployment on non-Kubernetes endpoints such as virtual machines is not currently supported.
+
+* No Istio multi-cluster support: Only single cluster deployments are currently supported for Istio ambient mode.
+
+* K8s CNI restrictions: Istio in ambient mode does not currently work alongside every Kubernetes CNI plugin implementation. Additionally, with some plugins, certain CNI functions (in particular Kubernetes NetworkPolicy and Kubernetes service load balancing features) may get transparently bypassed in the presence of Istio ambient mode.
The exact set of supported CNI plugins, as well as any CNI feature caveats, are currently under test and will be formally documented as Istio's ambient mode approaches the beta release.
+
+* TCP/ IPv4 only: In the current release, TCP over IPv4 is the only protocol supported for transport on an Istio HBONE tunnel. This includes protocols such as HTTP that run between application layer endpoints on top of TCP/ IPv4, independent of the outer HTTP used for the HBONE transport between ztunnel proxies.
+
+* No dynamic switching to ambient mode: Ambient mode can only be enabled on a new Istio mesh control plane that is deployed using the ambient profile or helm configuration. An existing Istio mesh deployed using a pre-ambient profile, for instance, cannot be dynamically switched to also enable ambient mode operation.
+
+* Restrictions with the Istio PeerAuthentication resource API: As of the time of writing, the PeerAuthentication resource is not supported by all components (specifically, waypoint proxies) in Istio ambient mode. Hence it is recommended to only use the STRICT mTLS mode currently, or simply use default Istio settings which automatically enable mTLS on all endpoints. Like many of the other alpha stage caveats, this will be addressed as the feature moves toward beta status.
+
+* There may be some minor functional gaps, for instance in istioctl CLI output, when it comes to displaying or monitoring Istio's ambient mode related information. These will be addressed as the feature matures.
+
+### Environment used for this guide
+
+For the examples in this guide, we used a deployment of Istio version 1.19.0 on a `kind` cluster of version 0.20.0 running Kubernetes version 1.27.3. However, these should also work on any Kubernetes cluster at version 1.24.0 or later with Istio version 1.18.0 or later. It is recommended to have a cluster with more than one worker node in order to fully exercise the examples described in this guide.
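For reference, a multi-node `kind` cluster similar to the one used here can be created with a configuration like the following (an illustrative sketch; the file name, cluster name and node counts are assumptions, not requirements of ambient mode):

{{< text yaml >}}
# kind-ambient.yaml: one control-plane node plus two workers,
# so that cross-node ztunnel datapaths can be exercised.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
{{< /text >}}

The cluster can then be created with `kind create cluster --name ambient-demo --config kind-ambient.yaml`.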
Refer to the [installation user guide](/docs/ops/ambient/usage/install/) or the [getting started guide](/docs/ops/ambient/getting-started/) for information on installing Istio in ambient mode on a Kubernetes cluster.
+
+## Functional Overview
+
+### Control plane overview
+
+The figure shows an architecture summary of the ztunnel proxy function, focusing on the interaction with the `istiod` control plane.
+
+{{< image width="100%"
+    link="ztunnel-architecture.png"
+    caption="Ztunnel architecture"
+    >}}
+
+A detailed architecture description of the ztunnel proxy is out of scope for this guide. For now, we mainly note that each instance of the ztunnel proxy uses the Envoy xDS APIs to receive certificates, discovery and configuration information from the Istio control plane (`istiod`) on behalf of all pods and endpoints associated with it. In particular, the ztunnel proxy obtains mTLS certificates for the service accounts of all pods that are associated with it. Since a single ztunnel proxy performs both data plane and control plane operations across multiple service accounts, it is a multi-tenant component of the mesh infrastructure, in contrast with Istio sidecar proxies, which handle control plane and data plane operations on a per application endpoint or pod basis.
+
+It is also worth noting that in ambient mode, a simplified set of resources is used in the xDS APIs for ztunnel proxy configuration. This results in improved performance (a much smaller set of information has to be transmitted and processed between `istiod` and the ztunnel proxies) and improved troubleshooting. Additional information on these xDS API resources is described in the ztunnel architecture guide.
+
+### Data plane overview
+
+Having briefly described the control plane architecture, we now briefly summarize the data plane architecture. This is depicted in the following figure.
+
+{{< image width="100%"
+    link="ztunnel-datapath-1.png"
+    caption="Basic ztunnel L4-only datapath"
+    >}}
+
+The figure depicts ambient pod workloads running on two nodes W1 and W2 of a Kubernetes cluster. There is a single instance of the ztunnel proxy on each node. In this scenario, application client pods C1, C2 and C3 need to access a service provided by pod S1, and there is no requirement for advanced L7 services such as L7 traffic routing or L7 traffic management, hence no waypoint proxy is needed in the datapath.
+
+The figure shows that pods C1 and C2 running on node W1 connect with pod S1 running on node W2, and their TCP traffic is tunneled through a single shared HBONE tunnel instance that has been created between the ztunnel proxy pods of each node. Mutual TLS (mTLS) is used for encryption as well as mutual authentication of the tunneled traffic. SPIFFE identities are used as the workload identities for each side, as in sidecar based Istio. The term `HBONE` (HTTP Based Overlay Network Encapsulation) is used in Istio ambient mode to refer to a technique for transparently and securely tunneling TCP packets encapsulated within HTTPS packets. Some brief additional notes on HBONE are provided in a following subsection.
+
+Note that traffic from pod C3, destined to pod S1 on worker node W2, also traverses the local ztunnel proxy instance, so that L4 traffic management functions such as L4 authorization and L4 telemetry are enforced identically on traffic whether or not it crosses a node boundary.
+
+The next figure illustrates the data path for a different use case, where traffic must traverse an interim waypoint proxy. This is the case when the application service requires advanced L7 traffic routing, management or policy handling.
Here, ztunnel L4 networking is used to tunnel traffic towards a waypoint proxy for L7 processing, and the traffic is then mapped via a second HBONE tunnel towards the ztunnel on the node hosting the selected service destination pod. In general, the waypoint proxy may or may not be located on the same node as the source or destination pods.
+
+{{< image width="100%"
+    link="ztunnel-waypoint-datapath.png"
+    caption="Basic ztunnel L4 + L7 datapath via an interim waypoint proxy"
+    >}}
+
+Finally, we illustrate the concept of hairpinning in the data plane via the next figure. We noted earlier that traffic is always sent to a destination pod by first sending it to the ztunnel proxy on the same node as the destination pod. But what if the sender is completely outside the Istio ambient mesh and hence does not initiate HBONE tunnels to the destination ztunnel first? What if the sender is a malicious entity trying to send traffic directly to an ambient pod destination? In these cases, the ztunnel traffic redirection (implemented via iptables, eBPF, or another option) also performs traffic hairpinning: it intercepts such traffic and requires it to go through the ztunnel proxy on the node, so that any Istio L4 authorization policy or telemetry functions can be enforced before the traffic is accepted for forwarding to the destination pod. The ztunnel redirection logic also hairpins such traffic to a waypoint proxy if it detects that the destination service requires waypoint processing but the traffic has not been sent by one of the waypoint proxies associated with that service.
+
+{{< image width="100%"
+    link="ztunnel-hairpin.png"
+    caption="Ztunnel traffic hairpinning"
+    >}}
+
+### Note on HBONE
+
+HBONE (HTTP Based Overlay Network Encapsulation) is an Istio- and ambient-specific term.
It refers to the use of standard HTTP tunneling via the [HTTP CONNECT](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/CONNECT) method to transparently tunnel application packets and byte streams. In its current implementation within Istio, HBONE transports TCP packets only, tunneling them transparently using the HTTP CONNECT method over [HTTP/2](https://httpwg.org/specs/rfc7540.html), with encryption and mutual authentication provided by [mutual TLS](https://www.cloudflare.com/learning/access-management/what-is-mutual-tls/). The HBONE tunnel itself runs on TCP port 15008. The overall HBONE packet format from the IP layer onwards is depicted in the following figure.
+
+{{< image width="100%"
+    link="hbone-packet.png"
+    caption="HBONE L3 packet format"
+    >}}
+
+In the future, Istio ambient mode may also support [HTTP/3 (QUIC)](https://datatracker.ietf.org/doc/html/rfc9114) based transport, which could be used to transport all types of L3 and L4 packets, including native IPv4, IPv6 and UDP, by leveraging new standards such as CONNECT-UDP and CONNECT-IP being developed in the [IETF MASQUE](https://ietf-wg-masque.github.io/) working group. Such additional use cases of HBONE and HTTP tunneling in Istio's ambient mode are currently for further investigation.
+
+## Deploying an Application
+
+Normally, a user with Istio admin privileges will deploy the Istio mesh infrastructure. Once Istio is successfully deployed in `ambient` mode, it will be transparently available to applications deployed by all users in namespaces that have been labeled to use Istio `ambient`, as illustrated in the examples below.
+
+### Basic application deployment without Ambient
+
+Let's first deploy a simple HTTP client-server application without making it part of the Istio ambient mesh. We can pick from the apps in the samples folder of the Istio repository.
Run the following examples from the top of a local Istio repository, or from an istio folder created by downloading the istioctl client as described in the Istio guides.
+
+{{< text bash >}}
+$ kubectl create ns ambient-demo
+$ kubectl apply -f samples/httpbin/httpbin.yaml -n ambient-demo
+$ kubectl apply -f samples/sleep/sleep.yaml -n ambient-demo
+$ kubectl apply -f samples/sleep/notsleep.yaml -n ambient-demo
+$ kubectl scale deployment sleep --replicas=2 -n ambient-demo
+{{< /text >}}
+
+These manifests deploy the sleep and notsleep pods, which we shall use as clients for the httpbin service pod (for simplicity, the CLI outputs have been omitted in the code samples above). We also create multiple replicas of the client deployment in order to exercise various scenarios.
+
+{{< text bash >}}
+$ kubectl get pods -n ambient-demo
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+NAME                       READY   STATUS    RESTARTS   AGE
+httpbin-648cd984f8-7vg8w   1/1     Running   0          31m
+notsleep-bb6696574-2tbzn   1/1     Running   0          31m
+sleep-69cfb4968f-mhccl     1/1     Running   0          31m
+sleep-69cfb4968f-rhhhp     1/1     Running   0          31m
+{{< /text >}}
+
+{{< text bash >}}
+$ kubectl get svc httpbin -n ambient-demo
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+NAME      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
+httpbin   ClusterIP   10.110.145.219   <none>        8000/TCP   28m
+{{< /text >}}
+
+Note that each application pod has just one container running in it (the "1/1" indicator) and that `httpbin` is an HTTP service listening on `ClusterIP` service port 8000. We should now be able to `curl` this service from either client pod and confirm it returns the `httpbin` web page, as shown below. At this point there is no `TLS` of any form being used.
+
+{{< text bash >}}
+$ kubectl exec -it deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+    <title>httpbin.org</title>
+{{< /text >}}
+
+### Enabling Ambient for an application
+
+We now enable `ambient` for the application deployed in the prior subsection by simply adding the label `istio.io/dataplane-mode=ambient` to the application's namespace, as shown below. A namespace should not be enabled for both sidecar mode and ambient mode at the same time. For now, we focus on using `ambient` mode only; in a following subsection we describe how conflicts are resolved in hybrid scenarios that mix sidecar mode and ambient mode within the same mesh.
+
+{{< text bash >}}
+$ kubectl label namespace ambient-demo istio.io/dataplane-mode=ambient
+$ kubectl get pods -n ambient-demo
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+NAME                       READY   STATUS    RESTARTS   AGE
+httpbin-648cd984f8-7vg8w   1/1     Running   0          78m
+notsleep-bb6696574-2tbzn   1/1     Running   0          77m
+sleep-69cfb4968f-mhccl     1/1     Running   0          78m
+sleep-69cfb4968f-rhhhp     1/1     Running   0          78m
+{{< /text >}}
+
+Note that after this, we still see only one container per application pod, and the uptime of these pods indicates they were not restarted in order to enable `ambient` mode (unlike `sidecar` mode, which does restart application pods when the sidecar proxies are injected). This results in better user experience and operational performance, since `ambient` mode can be enabled (or disabled) completely transparently as far as the application pods are concerned. We can initiate a `curl` request again from one of the client pods to the service and verify that this still works while in `ambient` mode.
+
+{{< text bash >}}
+$ kubectl exec -it deploy/sleep -n ambient-demo -- curl httpbin:8000 -s | grep title -m 1
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+    <title>httpbin.org</title>
+{{< /text >}}
+
+This indicates the traffic path is working.
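The mode-selection rules that this label triggers can be sketched as a simple predicate. This is an illustrative sketch only, not Istio source code; it encodes the namespace label and pod annotations discussed in this guide:

```python
# Illustrative sketch (not Istio source code) of whether a pod is opted
# into ambient mode, based on the labels/annotations used in this guide.
def uses_ambient(namespace_labels: dict, pod_annotations: dict) -> bool:
    # The namespace must carry the ambient dataplane-mode label...
    if namespace_labels.get("istio.io/dataplane-mode") != "ambient":
        return False
    # ...the pod must not already have a sidecar injected...
    if "sidecar.istio.io/status" in pod_annotations:
        return False
    # ...and ambient redirection must not be explicitly disabled.
    if pod_annotations.get("ambient.istio.io/redirection") == "disabled":
        return False
    return True

assert uses_ambient({"istio.io/dataplane-mode": "ambient"}, {})
assert not uses_ambient({"istio-injection": "enabled"}, {})
```

A pod with a sidecar already injected, or with redirection disabled, stays out of ambient mode even in a labeled namespace.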
In the next section, we look at how to monitor the configuration and data plane of the ztunnel proxy to confirm that traffic is correctly using the ztunnel proxy.
+
+## Monitoring the ztunnel proxy & L4 networking
+
+In this section, we describe some options for monitoring the ztunnel proxy configuration and data path. This information can also help with high-level troubleshooting, and with identifying information that would be useful to collect and provide in a bug report if there are any problems. Additional advanced monitoring of ztunnel internals and advanced troubleshooting is out of scope for this guide.
+
+### Viewing ztunnel proxy state
+
+As indicated previously, the `ztunnel` proxy on each node gets configuration and discovery information from the `istiod` component via the xDS APIs. Use the `istioctl proxy-config` command as shown below to view the ambient workloads discovered by a ztunnel proxy, as well as the secrets holding the TLS certificates that the ztunnel proxy has received from the istiod control plane, for use in mTLS signaling on behalf of the local workloads.
+
+In the first example, we see all the workloads and control plane components that the specific ztunnel pod `ztunnel-gkldc` is tracking, including information about the IP address and protocol to use when connecting to that component, and whether there is a waypoint proxy associated with that workload.
+ +{{< text bash >}} +$ istioctl proxy-config workloads ztunnel-gkldc.istio-system +{{< /text >}} +{{< text syntax=plain snip_id=none >}} +NAME NAMESPACE IP NODE WAYPOINT PROTOCOL +coredns-6d4b75cb6d-ptbhb kube-system 10.240.0.2 amb1-control-plane None TCP +coredns-6d4b75cb6d-tv5nz kube-system 10.240.0.3 amb1-control-plane None TCP +httpbin-648cd984f8-2q9bn ambient-demo 10.240.1.5 amb1-worker None HBONE +httpbin-648cd984f8-7dglb ambient-demo 10.240.2.3 amb1-worker2 None HBONE +istiod-5c7f79574c-pqzgc istio-system 10.240.1.2 amb1-worker None TCP +local-path-provisioner-9cd9bd544-x7lq2 local-path-storage 10.240.0.4 amb1-control-plane None TCP +notsleep-bb6696574-r4xjl ambient-demo 10.240.2.5 amb1-worker2 None HBONE +sleep-69cfb4968f-mwglt ambient-demo 10.240.1.4 amb1-worker None HBONE +sleep-69cfb4968f-qjmfs ambient-demo 10.240.2.4 amb1-worker2 None HBONE +ztunnel-5jfj2 istio-system 10.240.0.5 amb1-control-plane None TCP +ztunnel-gkldc istio-system 10.240.1.3 amb1-worker None TCP +ztunnel-xxbgj istio-system 10.240.2.2 amb1-worker2 None TCP +{{< /text >}} + +In the second example, we see the list of TLS certificates that this ztunnel proxy instance has received from istiod to use in TLS signaling. 
+
+{{< text bash >}}
+$ istioctl proxy-config secrets ztunnel-gkldc.istio-system
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+NAME                                                 TYPE         STATUS      VALID CERT   SERIAL NUMBER                      NOT AFTER              NOT BEFORE
+spiffe://cluster.local/ns/ambient-demo/sa/httpbin    CA           Available   true         edf7f040f4b4d0b75a1c9a97a9b13545   2023-09-20T19:02:00Z   2023-09-19T19:00:00Z
+spiffe://cluster.local/ns/ambient-demo/sa/httpbin    Cert Chain   Available   true         ec30e0e1b7105e3dce4425b5255287c6   2033-09-16T18:26:19Z   2023-09-19T18:26:19Z
+spiffe://cluster.local/ns/ambient-demo/sa/sleep      CA           Available   true         3b9dbea3b0b63e56786a5ea170995f48   2023-09-20T19:00:44Z   2023-09-19T18:58:44Z
+spiffe://cluster.local/ns/ambient-demo/sa/sleep      Cert Chain   Available   true         ec30e0e1b7105e3dce4425b5255287c6   2033-09-16T18:26:19Z   2023-09-19T18:26:19Z
+spiffe://cluster.local/ns/istio-system/sa/istiod     CA           Available   true         885ee63c08ef9f1afd258973a45c8255   2023-09-20T18:26:34Z   2023-09-19T18:24:34Z
+spiffe://cluster.local/ns/istio-system/sa/istiod     Cert Chain   Available   true         ec30e0e1b7105e3dce4425b5255287c6   2033-09-16T18:26:19Z   2023-09-19T18:26:19Z
+spiffe://cluster.local/ns/istio-system/sa/ztunnel    CA           Available   true         221b4cdc4487b60d08e94dc30a0451c6   2023-09-20T18:26:35Z   2023-09-19T18:24:35Z
+spiffe://cluster.local/ns/istio-system/sa/ztunnel    Cert Chain   Available   true         ec30e0e1b7105e3dce4425b5255287c6   2033-09-16T18:26:19Z   2023-09-19T18:26:19Z
+{{< /text >}}
+
+Using these CLI commands, a user can check that ztunnel proxies are configured with all the expected workloads and TLS certificates; any missing information can be used in troubleshooting to explain observed networking errors. A user may also use the `all` option to view all parts of the proxy-config with a single CLI command, together with the JSON output formatter as shown in the example below, to display the complete set of available state information.
+
+{{< text bash >}}
+$ istioctl proxy-config all ztunnel-gkldc.istio-system -o json | jq
+{{< /text >}}
+
+Note that when used with a ztunnel proxy instance, not all options of the `istioctl proxy-config` CLI are supported, since some apply only to sidecar proxies.
+
+A more advanced user may also choose to view the "raw" configuration dump of a ztunnel proxy by sending a curl request to an endpoint inside the ztunnel proxy pod, as shown in the following example.
+
+{{< text bash >}}
+$ kubectl exec -it ds/ztunnel -n istio-system -- curl http://localhost:15000/config_dump | jq .
+{{< /text >}}
+
+### Viewing Istiod state for ztunnel xDS resources
+
+Sometimes an advanced user may want to view the state of ztunnel proxy config resources as maintained in the istiod control plane, in the format of the xDS API resources defined specially for ztunnel proxies. This can be done by running a command inside the istiod pod to obtain this information from port 15014 for a given ztunnel proxy, as shown in the example below. This output can then also be saved and viewed with a JSON pretty-print formatter utility for easier browsing (not shown in the example).
+
+{{< text bash >}}
+$ kubectl exec -it -n istio-system deploy/istiod -- curl localhost:15014/debug/config_dump?proxyID=ztunnel-25fnd.istio-system
+{{< /text >}}
+
+### Verifying ztunnel traffic logs
+
+Let us send some traffic from a client `sleep` pod to the `httpbin` service.
+
+{{< text bash >}}
+$ kubectl -n ambient-demo exec deploy/sleep -- sh -c 'for i in $(seq 1 10); do curl -s -I http://httpbin:8000/; done'
+{{< /text >}}
+
+The response displayed confirms the client pod receives responses from the service.
+
+{{< text syntax=plain snip_id=none >}}
+HTTP/1.1 200 OK
+Server: gunicorn/19.9.0
+--snip--
+{{< /text >}}
+
+Now let's check the logs of the `ztunnel` pods to confirm the traffic was sent over the HBONE tunnel.
+
+{{< text bash >}}
+$ kubectl -n istio-system logs -l app=ztunnel | egrep "inbound|outbound"
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+2023-08-14T09:15:46.542651Z  INFO outbound{id=7d344076d398339f1e51a74803d6c854}: ztunnel::proxy::outbound: proxying to 10.240.2.10:80 using node local fast path
+2023-08-14T09:15:46.542882Z  INFO outbound{id=7d344076d398339f1e51a74803d6c854}: ztunnel::proxy::outbound: complete dur=269.272µs
+--snip--
+{{< /text >}}
+
+These log messages confirm the traffic indeed used the `ztunnel` proxy in the datapath. Additional fine-grained monitoring can be done by checking logs on the specific `ztunnel` proxy instances that are on the same nodes as the source and destination pods of the traffic. If these logs are not seen, a possibility is that traffic redirection may not be working correctly. A detailed description of monitoring and troubleshooting of the traffic redirection logic is out of scope for this guide. Note that, as mentioned previously, with `ambient` mode traffic always traverses the ztunnel pod, even when the source and destination of the traffic are on the same compute node.
+
+### Verifying ztunnel load balancing
+
+The ztunnel proxy automatically performs client-side load balancing if the destination is a service with multiple endpoints. No additional configuration is needed. The ztunnel load balancing algorithm is an internally fixed L4 round robin algorithm that distributes traffic based on L4 connection state and is not user configurable.
+
+{{< tip >}}
+If the destination is a service with multiple instances or pods and there is no waypoint associated with the destination service, then the source ztunnel proxy performs L4 load balancing directly across these instances or service backends, and then sends traffic via the remote ztunnel proxies associated with those backends.
If the destination service does have a waypoint deployment (with one or more backend instances of the waypoint proxy) associated with it, then the source ztunnel proxy performs load balancing by distributing traffic across these waypoint proxies, and sends traffic via the remote ztunnel proxies associated with the waypoint proxy instances.
+{{< /tip >}}
+
+Let's repeat the previous example with multiple replicas of the service pod and verify that client traffic is load balanced across the service replicas.
+
+{{< text bash >}}
+$ kubectl -n ambient-demo scale deployment httpbin --replicas=2
+{{< /text >}}
+
+{{< text bash >}}
+$ kubectl -n ambient-demo exec deploy/sleep -- sh -c 'for i in $(seq 1 10); do curl -s -I http://httpbin:8000/; done'
+{{< /text >}}
+
+{{< text bash >}}
+$ kubectl -n istio-system logs -l app=ztunnel | egrep "inbound|outbound"
+{{< /text >}}
+
+{{< text syntax=plain snip_id=none >}}
+2023-08-14T09:33:24.969996Z  INFO inbound{id=ec177a563e4899869359422b5cdd1df4 peer_ip=10.240.2.16 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/sleep}: ztunnel::proxy::inbound: got CONNECT request to 10.240.1.11:80
+2023-08-14T09:33:25.028601Z  INFO inbound{id=1ebc3c7384ee68942bbb7c7ed866b3d9 peer_ip=10.240.2.16 peer_id=spiffe://cluster.local/ns/ambient-demo/sa/sleep}: ztunnel::proxy::inbound: got CONNECT request to 10.240.1.11:80
+--snip--
+2023-08-14T09:33:25.226403Z  INFO outbound{id=9d99723a61c9496532d34acec5c77126}: ztunnel::proxy::outbound: proxy to 10.240.1.11:80 using HBONE via 10.240.1.11:15008 type Direct
+2023-08-14T09:33:25.273268Z  INFO outbound{id=9d99723a61c9496532d34acec5c77126}: ztunnel::proxy::outbound: complete dur=46.9099ms
+2023-08-14T09:33:25.276519Z  INFO outbound{id=cc87b4de5ec2ccced642e22422ca6207}: ztunnel::proxy::outbound: proxying to 10.240.2.10:80 using node local fast path
+2023-08-14T09:33:25.276716Z  INFO outbound{id=cc87b4de5ec2ccced642e22422ca6207}: ztunnel::proxy::outbound: complete dur=231.892µs
+--snip--
+{{< /text >}}
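Conceptually, the fixed per-connection round robin selection described in the tip above can be sketched as follows. This is an illustrative sketch only, not ztunnel's actual Rust implementation, and the endpoint addresses are hypothetical:

```python
# Illustrative sketch of L4 round robin endpoint selection: each new
# connection is pinned to the next backend in turn. This is NOT ztunnel's
# actual implementation (which is written in Rust).
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, endpoints):
        self._next = cycle(endpoints)

    def pick_for_new_connection(self):
        # Selection happens once per L4 connection, not per packet.
        return next(self._next)

lb = RoundRobinBalancer(["10.240.1.11:80", "10.240.2.10:80"])
picks = [lb.pick_for_new_connection() for _ in range(4)]
print(picks)  # alternates between the two backends
```

Because the choice is made per connection at L4, all packets of one TCP connection go to the same backend, while successive connections rotate across backends.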
+Here we note that the logs from the ztunnel proxies first indicate the HTTP CONNECT request towards the new destination pod (10.240.1.11), which sets up the HBONE tunnel to the node hosting the additional destination service pod. This is followed by logs indicating the client traffic being sent to both 10.240.1.11 and 10.240.2.10, the two destination pods providing the service. Also note that the data path is performing client-side load balancing in this case, and is not depending on Kubernetes service load balancing.
+
+This round robin load balancing algorithm is separate from, and independent of, any load balancing algorithm that may be configured within a VirtualService's TrafficPolicy field since, as discussed previously, all aspects of VirtualService API objects are instantiated on the waypoint proxies and not the ztunnel proxies.
+
+### Pod selection logic for Ambient and Sidecar modes
+
+Istio with sidecar proxies can co-exist with `ambient` node-level proxies within the same compute cluster. It is important to ensure that the same pod or namespace does not get configured to use both a sidecar proxy and an ambient node-level proxy. However, if this does occur, currently sidecar injection takes precedence for such a pod or namespace.
+
+Note that two pods within the same namespace could in theory be set to use different modes, by labeling individual pods separately from the namespace label, however this is not recommended. For most common use cases it is recommended that a single mode be used for all pods within a single namespace.
+
+The exact logic to determine whether a pod is set up to use ambient mode is as follows.
+
+1. The `istio-cni` plugin configuration exclude list configured in `cni.values.excludeNamespaces` is used to skip namespaces in the exclude list.
+2. 
`ambient` mode is used for a pod if:
+    * The namespace has the label `istio.io/dataplane-mode=ambient`
+    * The annotation `sidecar.istio.io/status` is not present on the pod
+    * `ambient.istio.io/redirection` is not set to `disabled`
+
+The simplest option to avoid a configuration conflict is for a user to ensure that each namespace has either the label for sidecar injection (`istio-injection=enabled`) or the label for ambient data plane mode (`istio.io/dataplane-mode=ambient`), but never both.
+
+## L4 Authorization Policy
+
+As mentioned previously, the ztunnel proxy performs authorization policy enforcement when the policy requires only L4 traffic processing in the data plane and there are no waypoints involved. The actual enforcement point is the receiving (server side) ztunnel proxy in the path of a connection.
+
+In the following example, we add a simple L4 authorization policy to the same application we deployed earlier and confirm the policy enforcement.
+
+## Monitoring and Telemetry with ztunnel
+
+TODO
+
+## Co-existence of Ambient ztunnels with sidecar proxies
+
+TODO
+
diff --git a/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-architecture.png b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-architecture.png
new file mode 100644
index 0000000000000..0500f12b7a958
Binary files /dev/null and b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-architecture.png differ
diff --git a/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-datapath-1.png b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-datapath-1.png
new file mode 100644
index 0000000000000..6a6017ae1381e
Binary files /dev/null and b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-datapath-1.png differ
diff --git a/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-hairpin.png b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-hairpin.png
new file mode 100644
index 0000000000000..eb772868a6d73
Binary files /dev/null and 
b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-hairpin.png differ diff --git a/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-waypoint-datapath.png b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-waypoint-datapath.png new file mode 100644 index 0000000000000..79b5b2333fd60 Binary files /dev/null and b/content/en/docs/ops/ambient/usage/ztunnel/ztunnel-waypoint-datapath.png differ