Documentation on multicluster + identity + authorization policies #7235

Closed
jroper opened this issue Nov 8, 2021 · 14 comments


@jroper
Contributor

jroper commented Nov 8, 2021

As far as I can see, there's no documentation on how multi-cluster works in conjunction with Linkerd's identity abstraction and authorization policies. For example, questions that I need answered but can't seem to find answers to in the docs include:

  • If I make a request from service A running in east to service B running in west, what identity will that request have when service B receives it? Will service B be able to recognise it as a request from service A coming from east, distinct from other services running in east, and distinct from any services running in west?
  • Can I create an authorization policy that matches requests from other clusters?
  • Can I create an authorization policy that matches requests from a particular namespace in another cluster? From a particular identity in another cluster?
@luukkemp

luukkemp commented Nov 10, 2021

I would like to know this as well.
I've been looking into this, and I don't think it's possible at this time. I came to this conclusion by tapping the request coming from the external cluster, which has the local (destination) gateway as its source (which seems logical).

When tapping the request we do have some information available in the HTTP headers:

  • l5d-client-id (this references the SA in the source cluster)
  • forwarded (only references stuff from the destination cluster)
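
For reference, this is roughly how the request can be tapped to see those headers (the deployment and namespace names here are placeholders for our setup):

linkerd viz tap deploy/my-app -n my-namespace -o json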

It seems that it's not possible to reference the external SA seen in l5d-client-id in my serverauthorization.

I'm not sure if my findings are correct, since we just started testing Linkerd in our organization.

Thank you guys for the great work so far, and I'll have my eye on this topic :)

@adleong
Member

adleong commented Nov 10, 2021

Hey @jroper!

You're right that this stuff is definitely under-documented. I'll try to answer your questions as best I can here. Then we can use this issue to track the work of documenting that information somewhere more permanent and discoverable.

During a multi-cluster request, the request transits these hops: source pod -> target cluster gateway -> target pod. This means that, for the purposes of identity and authorization, the target pod will see the gateway as the client identity.
In other words, B will not be able to determine the identity of the original client. However, you can create policies that restrict which identities in the west cluster (or even which clusters) can call the gateway in east, and create policies which restrict whether a service in the east cluster can be called from the gateway.
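
For example, a ServerAuthorization could restrict which client identities are allowed to call the gateway. A rough sketch, with illustrative names, assuming a Server resource called linkerd-gateway covers the gateway's proxy port:

apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  name: gateway-allowed-clients
  namespace: linkerd-multicluster
spec:
  server:
    name: linkerd-gateway
  client:
    meshTLS:
      identities:
        # Only this identity from the linked cluster may call the gateway.
        - "my-sa.my-ns.serviceaccount.identity.linkerd.cluster.local"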

Taking this a step further, you could even create multiple different gateways with different policies, to get fine-grained control over the multi-cluster traffic. However, creating multiple gateways goes beyond the default configuration in the linkerd-multicluster chart and would require some manual setup.

Hopefully that clears things up. If this all makes sense, would you be interested in helping get this information into the docs on the Linkerd website?

@jroper
Contributor Author

jroper commented Nov 10, 2021

Right now we actually use network policies (i.e. firewalls) to enforce namespace isolation. We're looking into multi-cluster linking, but the obvious problem is that, at least with Linkerd's default setup, in order to allow communication from the linked cluster the network policy would have to allow all communication from the gateway. This would undermine the network policies, because it would mean every namespace in the remote cluster that was able to use the link could access the namespace in the local cluster. That's why I wanted to look at Linkerd's authorization policies, to see if they could be used instead.

Regarding multiple gateways though: if we deployed a gateway in each namespace, with the appropriate authorization policy set on inbound traffic, would our network policies then continue to be effective at achieving namespace isolation? The problem with that approach is that we'd need an external IP address per namespace, which we don't want. Perhaps that could be worked around by having an unmeshed gateway that supports TLS passthrough with SNI (as I understand it, HAProxy, NGINX and a few others support this). It would receive the connection from the source pod, see the TLS SNI header (as I understand it, SNI is sent in the TLS ClientHello with no encryption, so a TLS passthrough proxy can see it without having to get involved in the TLS handshake), open a connection to the appropriate Linkerd gateway, and then pass on the TLS connection.
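
To illustrate, an unmeshed NGINX doing SNI-based passthrough might look something like the following sketch. I haven't tested this, and the hostnames, upstream addresses and gateway port are all hypothetical:

stream {
  # Route on the SNI name from the ClientHello, without terminating TLS.
  map $ssl_preread_server_name $backend {
    gateway.ns-a.example.com 10.0.1.10:4143;
    gateway.ns-b.example.com 10.0.2.10:4143;
  }

  server {
    listen 443;
    ssl_preread on;        # parse the ClientHello to populate $ssl_preread_server_name
    proxy_pass $backend;   # relay the raw TLS bytes to the chosen gateway
  }
}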

That said, it would be a nice feature to have if linkerd supported carrying remote identities and using them in authorization policies. Has any thought been given to supporting this in future?

@adleong
Member

adleong commented Nov 18, 2021

This sounds reasonable, though it's not on our immediate roadmap and would require some design. Please reach out if this is something you're interested in scoping out or working on.

@jroper
Contributor Author

jroper commented Nov 18, 2021

Probably too big a task for me... but I have done some thinking. I must admit my understanding of the exact details of the current multi-cluster setup is a little rough... but I think we'd need something like the following, at a very high level:

  • Gateways would need to be able to identify what cluster an incoming request came from. I'm not 100% sure how CA sharing works in Linkerd: is it the same CA in every cluster? Or does each cluster have its own CA, with gateways configured to trust the CAs of the other clusters? If the latter, the cluster can be identified by looking at the CA that signed the client certificate.
  • Gateways would need a mechanism to communicate the identity and cluster of the connection they are proxying. Istio communicates metadata across TLS TCP connections by defining its own very lightweight tunnelling protocol, whose use is negotiated via ALPN; using ALPN means things still work if a non-Istio client that doesn't speak the protocol connects. I'm not completely across it, but it appears that after negotiating the protocol, a small protobuf header containing the metadata, as well as the next ALPN protocol to switch to, is sent/read by the proxy before it starts relaying the main connection (see the ALPN sketch after this list). Another approach could be to define a custom TLS extension, though as I understand it TLS libraries don't commonly expose a way to plug in arbitrary extensions; the support usually has to exist in the TLS library itself. A final approach could be HTTP headers, but that would mean multi-cluster authorization policies could only be applied to HTTP connections, and only after HTTP decoding. I don't know the details of how Linkerd's authorization is currently implemented, but it wouldn't surprise me if that wasn't compatible with the existing approach.
  • The gateway's identity would need to be recognised as one that is allowed to impersonate other identities using this protocol. This could be configured as part of an authorization policy, or perhaps as cluster-wide configuration.
  • We'd have to think about the remote cluster identities themselves: would it be the same service account string as in the remote cluster, with an additional cluster name alongside it? Or would we modify the identity name to include the remote cluster information? E.g., identities currently look like <serviceaccount>.<namespace>.serviceaccount.identity.linkerd.cluster.local; do we leave that as-is and pass the cluster name along as an additional metadata attribute? Or does the gateway map it, perhaps to something like <serviceaccount>.<namespace>.serviceaccount.identity.linkerd.cluster.east if it came from the east cluster?
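
To make the ALPN idea concrete, here's a minimal sketch in Go of how a client could offer a hypothetical tunnelling protocol and fall back gracefully. This is not Linkerd's (or Istio's) actual implementation; the protocol name and address are made up:

package main

import (
	"crypto/tls"
	"log"
)

func main() {
	conf := &tls.Config{
		// Offer the hypothetical tunnel protocol first; peers that don't
		// speak it fall back to a standard protocol.
		NextProtos: []string{"example-tunnel/1", "h2", "http/1.1"},
	}
	conn, err := tls.Dial("tcp", "gateway.example.com:4143", conf)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	if conn.ConnectionState().NegotiatedProtocol == "example-tunnel/1" {
		// Both sides speak the tunnel protocol: exchange the small metadata
		// header (e.g. a length-prefixed protobuf carrying identity and
		// cluster) here, before relaying the main connection.
	}
}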

@jandersen-plaid

Hey @adleong, just wanted to check in on this issue (specifically, forwarding source identities through gateways to other clusters). Has this found a place on the roadmap, or does it sit squarely as a nice-to-have that isn't being actively pursued for Linkerd? Additionally, is the only solid workaround creating a gateway per service or namespace to control access?

As some context for the comment, I am currently evaluating Linkerd for providing multicluster service-to-service authentication and splitting traffic across active-active services deployed in multiple clusters (for reliability purposes). There are some workarounds we have in our back pocket:

  • Using an external IP per service (and a gateway per service) -- instead of a load balancer -- to make gateways cheaper to run
  • Making each cluster a duplicate system with weighted DNS to each cluster and disallowing cross cluster traffic (that is, abandoning multicluster entirely).
  • Setting up a heavily customized NGINX Ingress and proxying all TCP requests back to service gateways (very theoretical)

Each comes with its own issues (scaling, when it comes to the external IP per gateway/service; some system-specific issues with disallowing traffic; and the NGINX customization is still only a theory), but forwarding the identity through the gateway would remove any need for these workarounds.

@whiskeysierra

whiskeysierra commented Feb 8, 2023

> When tapping the request we do have some information available in the HTTP headers:
>
>   • l5d-client-id (this references the SA in the source cluster)
>   • forwarded (only references stuff from the destination cluster)
>
> It seems that it's not possible to reference the external SA seen in l5d-client-id in my serverauthorization.

When we read that, a colleague of mine had the idea to use an AuthorizationPolicy in conjunction with an HTTPRoute and an HTTPHeaderMatch on the l5d-client-id header.
But when we tested that, we saw that the header contains the gateway identity, which makes sense.

@whiskeysierra

> but perhaps that could be worked around by having an unmeshed gateway that supported TLS passthrough with SNI (as I understand it, haproxy, nginx and few others support this), so it would receive the connection from the source pod, see the TLS SNI header (as I understand it, SNI headers are sent in the TLS client hello message with no encryption, so a TLS passthrough proxy can see it without having to get involved in the TLS handshake), open a connection to the appropriate linkerd gateway, and then pass on the TLS connection.

When I started reading the Linkerd docs (before actually trying/using it), I was hoping this would be the behaviour of the linkerd-gateway today: a transparent proxy which leaves the TLS traffic untouched and just looks at the SNI header to figure out which service to route to. That way the mTLS connection between the source and target service would remain unbroken, and the correct identity would be available at the destination for authorization.

@whiskeysierra

As it turns out, the gateway does set a very helpful header (HTTP only, obviously):

Forwarded: by=linkerd-gateway.linkerd-multicluster.serviceaccount.identity.linkerd.cluster.local;for=some-svc.some-app.serviceaccount.identity.linkerd.cluster.local;host=my-svc-alpha.my-app.svc.cluster.local:8080;proto=https

The value is essentially a set of key-value pairs separated by semicolons, and it contains:

  • by = the gateway identity
  • for = the downstream service identity
  • host = the FQDN and port of the gateway resolved in the local cluster
  • proto = the protocol

You can use that knowledge to create a dedicated policy for gateway traffic:

apiVersion: policy.linkerd.io/v1beta1
kind: HTTPRoute
metadata:
  name: "my-svc-gateway"
spec:
  parentRefs:
    - name: "my-svc-http"
      kind: Server
      group: policy.linkerd.io
  rules:
    - matches:
        - headers:
            - name: Forwarded
              type: RegularExpression
              value: "^(.+;)?for=some-svc\\.some-app\\.serviceaccount\\.identity\\.linkerd\\.cluster\\.local(;.+)?$"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: "my-svc-gateway"
spec:
  identities:
    - "linkerd-gateway.linkerd-multicluster.serviceaccount.identity.linkerd.cluster.local"

@whiskeysierra

@adleong Can you comment on the validity of my approach ☝️, and maybe also on whether the Forwarded header is going to stay? I couldn't find any documentation about it (not even a mention in an issue or Slack thread).

@adleong
Member

adleong commented Apr 19, 2023

@whiskeysierra nice! I hadn't thought of using the Forwarded header like that, but I'd say it's valid! We currently don't have any plans to remove the Forwarded header.

@jroper
Contributor Author

jroper commented Sep 26, 2023

I think the Forwarded header will also meet our needs. We're close to using multi-cluster.

@adleong
Member

adleong commented Oct 4, 2023

As of Linkerd 2.14.0, we now support multicluster direct pod-to-pod communication, which bypasses the gateway entirely. This allows you to create authorization policies just as you would in the single-cluster case. This is documented here: https://linkerd.io/2.14/tasks/pod-to-pod-multicluster/#step-6-authorization-policy
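
For example, with pod-to-pod mode the client in the other cluster presents its own identity, so you can authorize it directly. A sketch with illustrative names (see the linked docs for the canonical version):

apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: client-from-west
  namespace: my-ns
spec:
  identities:
    # The client workload's identity, as minted in its own cluster.
    - "client-sa.client-ns.serviceaccount.identity.linkerd.cluster.local"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: allow-client-from-west
  namespace: my-ns
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: my-svc
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: MeshTLSAuthentication
      name: client-from-west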

If these docs are sufficient, I think we can close this issue. If not, I'd love to hear where we can improve them (or, even better, get your help improving them!)

@hawkw
Contributor

hawkw commented Oct 19, 2023

We're going to go ahead and close this issue as it should be solved by pod-to-pod multicluster, as discussed in #7235 (comment). Please feel free to comment if that doesn't solve your problem, and we can re-open this, or open a new issue as appropriate. Thank you!

@hawkw hawkw closed this as completed Oct 19, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 19, 2023