Mutual TLS with customer provided certificates #34
+1 |
Potentially implemented by #68 |
Hi guys, I've noticed that you have moved this issue from "Researching" to "We're Working On It". Do you now have a delivery date? Even a quarter indication would be awesome! Thx, |
We are targeting to deliver by EOY 2019, but this is subject to change and we'll keep this issue updated as we know more. |
Hi there! Any update on the work on this request? |
@juandiegopalomino While you're here, a couple of questions to help us validate our design :)
|
@bcelenza This is an important feature for Open Banking customers in Australia. The pilot is in progress with the first phase of the rollout on July 1, 2020. The regs stipulate mTLS for certain endpoints. In this scenario:
|
I've got one use case for you - transparent authentication of services inside an App Mesh mesh to AWS MSK. |
@kamil-rogon-dragon Could you elaborate? I'm unfamiliar with Kafka/MSK, just taking a cursory look over their docs; it looks like you can source your own certs via ACMPCA. Are you looking to represent MSK within the mesh? Right now with #39 you can configure your services to retrieve a cert and a validation context for the PCA you've configured for your brokers. Assuming you can represent your cluster as a virtual service using existing primitives (looks like you get a list of IPs for the brokers?), at a high level that could be sufficient to configure Envoy correctly; though of course you're only getting CA validation on MSK's side (are you looking for stricter trust?). |
Hi guys, current setup:
As far as I know, in order to connect these two services together, we would need (#97): My question: would this feature support mTLS between the two meshes? How would I have to secure the link between the two meshes? |
@isaac-mj I believe you're right on your cross-region routing steps. As @dastbe alluded to there, having multi-region services natively represented is the ideal. For encryption, the mTLS case of cross-region communication shouldn't be any different than it is with the encryption support we have today. I need to validate this (and have recorded your comment accordingly), but I think the only issue is retrieving secrets across regions - functionality shouldn't be impeded. You'll have to ensure that the root CA certificate that you're using in a validation context is retrievable/present for a given virtual node in a region; for file system-based and SDS-based certs, that's on you to provide to the proxies. But if you're using ACM, I don't think we can currently relax the same-region requirement, so the entity and CA certs need to be in the same region as the virtual node that's referencing them. e.g. the root CA cert for the ingress gateway in |
Hello everyone, I have a follow-up question. We are launching App Mesh together with ACM to set up TLS between our mesh nodes. However, we are wondering about the rotation frequency that you will implement for mTLS. For example, looking at the frequency of Consul (https://www.consul.io/docs/connect/ca/aws), they rotate certificates for their nodes every 54 hours. We did a brief calculation of the costs of that type of rotation in our current platform, assuming we use ACM, and the costs would skyrocket. Have you defined a rotation policy for the mTLS feature? Thx, |
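To put rough numbers on the rotation concern above, here is a minimal sketch that counts how many certificates a mesh would issue per year at a given rotation interval. The function names and the per-certificate price are hypothetical placeholders, not ACM pricing; plug in your own figures.

```python
import math

HOURS_PER_YEAR = 365 * 24  # 8760


def issuances_per_year(node_count: int, rotation_hours: float) -> int:
    """Certificates issued per year if every node gets a new cert each rotation."""
    per_node = math.ceil(HOURS_PER_YEAR / rotation_hours)
    return node_count * per_node


def yearly_cost(node_count: int, rotation_hours: float, price_per_cert: float) -> float:
    """Yearly spend for a given (assumed) per-certificate price."""
    return issuances_per_year(node_count, rotation_hours) * price_per_cert


# 100 nodes rotating every 54 hours versus once a year.
frequent = issuances_per_year(100, 54)              # 16300
yearly = issuances_per_year(100, HOURS_PER_YEAR)    # 100
print(frequent, yearly, frequent / yearly)
```

The ratio between the two counts, rather than any absolute price, is what drives the cost difference the comment describes.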
@isaac-mj Thanks for launching with us! Our integration with ACM (currently) has you bringing your certs to App Mesh (as you're aware). So rotation is entirely within your control and likely will not dramatically change with mTLS. ACM's cert validity period is currently fixed, so we're looking into revocation (#172) and how we can make CRLs work and scale. I believe ACM is working on improving the validity story, though I can't speak to their designs or roadmap. That doesn't directly answer your question, so let me know if it doesn't help. |
I have a question re: whether/how mTLS support will enable a service to authorise an invocation from a downstream service. For example, as part of the virtual node listener spec, will it be possible to specify a list of acceptable downstream services, or a list of downstream services to block, based on the domain names associated with their certs? And would it be possible to specify these as part of a virtual service spec in addition to a virtual node spec? |
@rizblie Thanks for your questions. The API is still in flux but we'll have a draft here once we're working towards a preview version of the feature. If using the Subject Alternative Names on certs is sufficient as a coarse authorization mechanism, this is certainly something we can support. We are also researching external authorizers (#140) for more control; feel free to chime in there if that's of interest. Do you have requirements on the structure/format/count of your SANs specified on an upstream node?
This is a great question. We've focused on providing primitives via virtual nodes, but we recognize there's a lot of configuration there, some of which may make sense to specify in a higher level construct, e.g. a virtual service or something else. Would you prefer to specify this "SAN validation" on a virtual service? What about CAs (e.g. TLS client policies on backends)? Do you find yourself specifying the same configuration on several virtual nodes behind a virtual service? |
Thanks @efe-selcuk . While an external authorizer is certainly desirable for maximum flexibility, I believe there is also room for a simpler built-in mechanism based on the cert's SAN to address common, basic scenarios e.g. "I want to allow my service to be called by any downstream service in *.payments.local", or "I want to explicitly block downstream services in *.search.local". A simple approach would be to offer options to specify either an ALLOW list or a DENY list. Both options would be useful, but if forced to choose I think an ALLOW list would be more valuable - as it gives the owning team full control over who is allowed to access their service. These ALLOW/DENY lists could be specified either at the virtual service level or the virtual node level. I think it makes more sense to do this at the service level, as the owning team is likely to want to apply the same policy to all virtual nodes that the service routes to. For example, if using weighted routing to two different versions of the service (canary deployment), then it would not make sense for one version to accept requests from a downstream service, while the other was rejecting them. RE: CAs, yes I think it also makes sense to specify config at the service level, as in most cases all virtual nodes under the same virtual service parent would require the exact same configuration. For situations where additional flexibility is required at the virtual node level, perhaps you could have a two-tier system - where virtual nodes inherit the policy/config from their parent virtual service, but can optionally override them at the virtual node level? |
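The ALLOW/DENY proposal above can be sketched as follows. This is only an illustration of the suggested semantics, not App Mesh behavior: the function name and parameters are hypothetical, and `fnmatchcase` is used as a stand-in matcher whose `*` also matches across dots (so `*.payments.local` matches `a.b.payments.local`), which is looser than typical TLS wildcard rules.

```python
from fnmatch import fnmatchcase


def is_caller_allowed(peer_sans, allow=None, deny=None):
    """Decide whether a caller is accepted based on its certificate SANs.

    deny patterns win over allow patterns; with no allow list, anything
    that is not denied is accepted.
    """
    sans = [s.lower() for s in peer_sans]
    if deny and any(fnmatchcase(s, p.lower()) for s in sans for p in deny):
        return False
    if allow is not None:
        return any(fnmatchcase(s, p.lower()) for s in sans for p in allow)
    return True


# An ALLOW list for the payments domain accepts only matching callers.
print(is_caller_allowed(["cart.payments.local"], allow=["*.payments.local"]))
print(is_caller_allowed(["indexer.search.local"], allow=["*.payments.local"]))
```

An empty allow list (`allow=[]`) rejects every caller, which is different from passing no allow list at all.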
@rizblie The extra feedback is much appreciated. I can at least say we're acutely aware of the problems you're trying to solve. At the risk of getting too into the weeds here... I'm hesitant to use SANs for anything outside of their specific domain; for example, the dns name for a virtual node or service doesn't necessarily map 1:1 to the SAN on the cert (like a SPIFFE SVID). It wouldn't exist in terms of a generic allow/deny list when modeling your services in app mesh, but I call it a "coarse authorization mechanism" exactly because it coincidentally ends up acting as an allow list when using TLS. That being said, there is absolutely a need for better authorization controls in a way that maps better into the mesh (even simpler controls as you've described), and that's where the more focused authorization discussions come into play. In terms of the rules around accepting SANs, there are several concerns, including what we have available in envoy's api (vs balancing against new contributions of course). For example, the mechanism to validate those SANs we have in one version versus a newer version. Wildcards are also sensitive just because of context. As for specifying TLS configuration at a virtual service level, it's tricky both from an API confusion perspective as well as a "do the right thing with override behavior" perspective. For example, in TLS client policies on virtual node backends, we allow defaults with overrides, but we explicitly do not merge the fields. |
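The "defaults with overrides, but no merging" behavior described above can be sketched like this. The function and the dictionaries are hypothetical illustrations, not the App Mesh implementation: a backend-level client policy, when present, replaces the default wholesale instead of being combined with it field by field.

```python
def effective_client_policy(backend_default, backend_override):
    """Return the override as-is when one is set, otherwise the default.

    Fields from the default are never copied into the override.
    """
    return backend_override if backend_override is not None else backend_default


# Hypothetical policies for illustration only.
default_policy = {"tls": {"enforce": True, "ports": [443], "trust": "/certs/ca.pem"}}
override_policy = {"tls": {"trust": "/certs/other_ca.pem"}}

resolved = effective_client_policy(default_policy, override_policy)
# "ports" from the default does not carry over into the override.
print("ports" in resolved["tls"])
```

The same trade-off would apply to any service-level policy that virtual nodes could override: a whole-object replacement is easier to reason about than a field-level merge, at the cost of repeating shared settings.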
@efe-selcuk I see your point. Thinking about it a bit more, a better mesh solution might be to employ a similar approach as for backends i.e. just like a virtual node can reference a backend using a virtual service name or ARN, a virtual service could specify a frontend service to specify which downstream services are allowed to call it. The same dilemma arises re: whether to apply at service or node level. I would argue that service level is simpler and more useful. As soon as you introduce this at node level, you have to deal with the problem that the downstream service may be routing to a mix of virtual nodes, some of which are valid frontends for the target service, while others are not. This will lead to authZ errors if the routing does not take this into account. |
@rizblie A few questions:
We have to balance these against any security implications of course. |
Hello everyone. Today, you can encrypt communication between your Envoy proxies with TLS by providing a certificate from the listeners on your upstream/server virtual nodes and gateways, and specifying validation criteria (trusted Certificate Authority) on your clients/downstreams. App Mesh will be introducing support for authentication with Mutual TLS. In broad terms, this will allow you to mutually authenticate communication between your virtual nodes and virtual gateways (and external services), by also providing a client certificate on your downstream Envoys, and specifying validation criteria on upstream Envoys. Additionally, you’ll be able to optionally specify the Subject Alternative Names which must appear on the peer certificate as part of validation. Mechanically, this will involve new APIs for:
We will also be introducing support for external Secret Discovery Services (SDS) via Unix domain socket. We are investigating support for SPIRE as an SDS provider in #68 and are generally looking into the experience across our platforms (i.e., ECS, Fargate, EKS). The initial release of this feature will support file-based and SDS-based certificates and certificate authorities. We will explore the options for supporting ACM PCA after the initial release (#258). This feature is an extension of our existing TLS support. For more information, please see this documentation: https://docs.aws.amazon.com/app-mesh/latest/userguide/tls.html
Supporting Secret Discovery Service over Unix Socket
An emerging pattern for TLS certificate binding in service mesh is through the use of Envoy's Secret Discovery Service API. This option adds the ability for the proxy to connect to a local process (i.e. sidecar) which is hosting an SDS endpoint via a Unix Domain Socket (UDS). When using a technology like SPIRE, this would be the SPIRE Agent running on your infrastructure.
API Shapes
The models we’re adding to listeners and backends are very similar to the existing TLS shapes for listener certificates and TLS client policies on backends.
Listeners
Backends
Note on Virtual Gateways: The tls structure here will also be present within the clientPolicy structure for Virtual Gateways.
Example
For brevity, most of the configuration is omitted. This simple example shows a downstream virtual node, using file-based certificates, backed by an upstream virtual node using SDS. While this illustrates mixed sources, within a mesh you would be likely to use a single strategy (e.g. using SPIRE to vend all TLS materials via SDS to all proxies).
Upstream (server)
Downstream (client)
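A minimal sketch of what the two specs described above could look like, written as Python dictionaries. The field names follow the App Mesh API as generally released, to the best of my understanding, and should be checked against the API reference; every secret name, path, port, and service name below is a hypothetical placeholder.

```python
# Upstream (server): presents an SDS-provided certificate and validates
# client certificates against an SDS-provided trust bundle.
upstream_spec = {
    "listeners": [{
        "portMapping": {"port": 8080, "protocol": "http"},
        "tls": {
            "mode": "STRICT",
            "certificate": {"sds": {"secretName": "spiffe://example.org/upstream"}},
            "validation": {
                "trust": {"sds": {"secretName": "spiffe://example.org"}},
                "subjectAlternativeNames": {
                    "match": {"exact": ["spiffe://example.org/downstream"]}
                },
            },
        },
    }],
}

# Downstream (client): presents a file-based client certificate and
# validates the server certificate against a file-based CA bundle.
downstream_spec = {
    "backends": [{
        "virtualService": {
            "virtualServiceName": "upstream.example.local",
            "clientPolicy": {
                "tls": {
                    "certificate": {"file": {
                        "certificateChain": "/certs/downstream_chain.pem",
                        "privateKey": "/certs/downstream_key.pem",
                    }},
                    "validation": {
                        "trust": {"file": {"certificateChain": "/certs/ca.pem"}},
                        "subjectAlternativeNames": {
                            "match": {"exact": ["spiffe://example.org/upstream"]}
                        },
                    },
                },
            },
        },
    }],
}


def presents_and_validates(tls: dict) -> bool:
    """True when a TLS block both offers a certificate and sets a trust source."""
    return bool(tls.get("certificate")) and bool(tls.get("validation", {}).get("trust"))


server_tls = upstream_spec["listeners"][0]["tls"]
client_tls = downstream_spec["backends"][0]["virtualService"]["clientPolicy"]["tls"]
print(presents_and_validates(server_tls) and presents_and_validates(client_tls))
```

The helper only checks the property that makes the configuration mutual: each side both presents a certificate and validates its peer.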
Summary
We hope this will enable your use-cases for mutual authentication within your meshes. We’ll update this issue once the feature is enabled in our Preview Channel. As always, we’d love to get your feedback on the feature. A few questions to get started:
Thanks! |
RE: 1, limit is fine, but wildcards would make life easier. |
Hello everyone. The Mutual TLS feature is now available in Preview. You can find documentation about the feature here: https://docs.aws.amazon.com/app-mesh/latest/userguide/mutual-tls.html And two walkthroughs:
As always, feel free to leave any feedback. Thanks! |
Please advise: will it work with applications running on EC2? |
Hey @rberkovi Are you referring to running App Mesh with ECS/EKS on EC2 with file-based certificates? That is certainly supported alongside our other features. SPIRE on ECS (EC2 or Fargate) is not supported. |
@efe-selcuk Our setup is applications running on EC2 (Tomcat, IIS), no containers. We have an ALB in front of them with path-based routing to target groups. |
@rberkovi Ah, are you on-boarding with App Mesh for the first time? You can use App Mesh on EC2. We don't yet have the same resources we provide for containerized workloads (e.g. #161), so you would have to configure the iptables rules on your instances and the bootstrap configuration (see #264) for Envoy. Check out the iptables script at the bottom of this guide. The rest of our features (including TLS/mTLS) are not restricted by platform. If you have more general questions, we'd be happy to chat. |
Hello folks - Do you have an idea of when this feature will come out of preview and be released into production? |
@isaac-mj We can't share dates or timelines, but the feature is currently in active development. |
Hey everyone. Mutual TLS is now Generally Available! https://aws.amazon.com/about-aws/whats-new/2021/02/aws-app-mesh-supports-mutual-tls-authentication/ We're super excited to get this into your hands. The features are available in the AWS SDK and AWS Console. CloudFormation support is currently slated for next week. We'll post an update here once it's live. |
CloudFormation support is available in all regions. The new fields are not yet available in the CloudFormation documentation; they should be published in about a week. We'll get the walkthrough updated to use CloudFormation so you have something to reference. However, the field names are all in line with the API. |
CloudFormation docs have been published. Resolving this issue. Please cut us a new one if you have any findings/feedback. Thanks! |