Proposal: Add support for AWS Cloud Map as service-discovery #52

Closed
kiranmeduri opened this issue Aug 1, 2019 · 8 comments
@kiranmeduri
Collaborator

Summary

Add support for AWS Cloud Map service-discovery with App Mesh virtual-nodes when using aws-app-mesh-controller-for-k8s.

Motivation

AWS Cloud Map is a cloud resource discovery service. With Cloud Map, you can define custom names for your application resources, and it maintains the updated location of these dynamically changing resources. This increases your application availability because your web service always discovers the most up-to-date locations of its resources. AWS App Mesh is a service mesh that provides application-level networking to make it easy for your services to communicate with each other across multiple types of compute infrastructure.

AWS App Mesh recently announced support for AWS Cloud Map as a service-discovery option for virtual-nodes. In App Mesh, client virtual-nodes add backends to define their service dependencies. These backends point to a virtual-service that is backed by a virtual-router with routes to each of the corresponding service virtual-nodes. For a client to communicate with a service, it uses the DNS name specified on the virtual-nodes, so prior to Cloud Map users needed to add DNS records for each of the virtual-nodes. With Cloud Map, users can create one DNS service for the virtual-service and use attributes to subset the endpoints behind that DNS name to target specific virtual-nodes. Additionally, App Mesh provides Envoy EDS, which Envoy can use to discover endpoints for an upstream cluster with minimal propagation latency.
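
The attribute-based subsetting described above can be sketched in a few lines of Go. This is only an illustration of the idea, not the App Mesh or Cloud Map implementation: the Instance type, the subset and matchAll helpers, the blue virtual-node name, and the second IP address are all hypothetical. It assumes that an instance is selected only when every requested key/value pair is present in its attributes.

```go
package main

import "fmt"

// Instance is a hypothetical, simplified stand-in for a Cloud Map instance.
type Instance struct {
	ID         string
	Attributes map[string]string
}

// matchAll reports whether attrs contains every key/value pair in query.
func matchAll(attrs, query map[string]string) bool {
	for k, want := range query {
		if got, ok := attrs[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// subset returns the instances whose attributes satisfy the whole query,
// preserving the input order.
func subset(instances []Instance, query map[string]string) []Instance {
	var out []Instance
	for _, in := range instances {
		if matchAll(in.Attributes, query) {
			out = append(out, in)
		}
	}
	return out
}

func main() {
	// Two endpoints behind the same Cloud Map service "colorteller".
	// The blue entry is made up for the example.
	instances := []Instance{
		{ID: "192.168.122.110", Attributes: map[string]string{"appmesh.k8s.aws/virtualNode": "colorteller-red-color"}},
		{ID: "192.168.64.21", Attributes: map[string]string{"appmesh.k8s.aws/virtualNode": "colorteller-blue-color"}},
	}
	red := subset(instances, map[string]string{"appmesh.k8s.aws/virtualNode": "colorteller-red-color"})
	for _, in := range red {
		fmt.Println(in.ID)
	}
}
```

One service name can therefore front several virtual-nodes, with the attribute query deciding which endpoints a given route actually targets.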

Currently, the way to integrate a Kubernetes cluster with Cloud Map is to use external-dns. However, external-dns lacks the functionality to propagate endpoint attributes (i.e. pod labels) to Cloud Map. To support App Mesh with Cloud Map, aws-app-mesh-controller needs to provide native support for Cloud Map.

User Stories

  1. User should be able to use AWS Cloud Map as service-discovery for pods associated with a virtual-node.
  2. User should be able to connect applications in a Kubernetes cluster in AWS with services running outside of the Kubernetes cluster.

Design

UX

Below we discuss how customers will use AWS Cloud Map in the context of aws-app-mesh-controller-for-k8s.

  • Let's assume the customer has the following K8s namespace
        apiVersion: v1
        kind: Namespace
        metadata:
          labels:
            appmesh.k8s.aws/sidecarInjectorWebhook: enabled
          name: color
  • Customer creates mesh using the following spec
        apiVersion: appmesh.k8s.aws/v1beta1
        kind: Mesh
        metadata:
          name: eks-mesh
  • Customer creates a virtual-node using the following spec
        apiVersion: appmesh.k8s.aws/v1beta1
        kind: VirtualNode
        metadata:
          name: colorteller-red
          namespace: color
        spec:
          meshName: eks-mesh
          listeners:
            - portMapping:
                port: 9080
                protocol: http
          serviceDiscovery:
            cloudMap:
              serviceName: colorteller
              namespaceName: prod.svc.aws.local
  • Customer creates a virtual-service using the following spec (No Change)
        apiVersion: appmesh.k8s.aws/v1beta1
        kind: VirtualService
        metadata:
          name: colorteller.prod.svc.aws.local
          namespace: color
        spec:
          meshName: eks-mesh
          virtualRouter:
            name: colorteller-router
            listeners:
              - portMapping:
                  port: 9080
                  protocol: http
          routes:
            - name: color-route
              http:
                match:
                  prefix: /
                action:
                  weightedTargets:
                    - virtualNodeName: colorteller-red
                      weight: 1       
  • Customer then can create K8s deployment (No change)
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: colorteller-red
          namespace: color
        spec:
          replicas: 1
          selector:
            matchLabels:
              app: colorteller
              version: red
          template:
            metadata:
              annotations:
                appmesh.k8s.aws/mesh: eks-mesh
              labels:
                app: colorteller
                version: red
            spec:
              containers:
                - name: colorteller
                  image: ${COLOR_TELLER_IMAGE}
                  ports:
                    - containerPort: 9080
                  env:
                    - name: "SERVER_PORT"
                      value: "9080"
                    - name: "COLOR"
                      value: "red"

Controller

Behind the scenes, aws-app-mesh-controller will perform the following actions:

  • Watches for Mesh CRD
    • Creates mesh using aws.appmesh.CreateMesh
  • Watches for VirtualNode CRD
    • Creates virtual-node using aws.appmesh.CreateVirtualNode
    • Creates the Cloud Map service using aws.servicediscovery.CreateService with the service-name and cloudmap-namespace-name.
  • Watches for pods (corresponding to K8s deployment)
    • Determine the virtual-node and mesh corresponding to the pod.
    • (Pod Added) If the virtual-node uses CloudMap, then register the pod endpoint with the corresponding cloudmap service using the aws.servicediscovery.RegisterInstance API, with attributes "appmesh.k8s.aws/virtualNode" and "appmesh.k8s.aws/mesh" set appropriately.
    • (Pod Deleted) If the virtual-node uses CloudMap, then deregister the pod endpoint by calling aws.servicediscovery.DeregisterInstance.
  • Reconciles K8s pods and CloudMap instances periodically as updates to virtual-node take place.
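
The periodic reconcile step above can be sketched as a set difference between the pods backing a virtual-node and the instances currently registered in Cloud Map. This is a minimal sketch and not the controller's actual code: the Pod type and diffInstances function are hypothetical, and it assumes the pod IP is used as the Cloud Map instance ID. The resulting lists would then be fed to the register and deregister calls.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, trimmed-down view of a Kubernetes pod.
type Pod struct {
	Name string
	IP   string
}

// diffInstances compares the pods selected by a virtual-node with the
// instance IDs currently registered in Cloud Map. It assumes the pod IP
// doubles as the instance ID and returns sorted lists of IDs to register
// and to deregister.
func diffInstances(pods []Pod, registered []string) (toRegister, toDeregister []string) {
	desired := make(map[string]bool, len(pods))
	for _, p := range pods {
		if p.IP != "" { // pods without an IP cannot be registered yet
			desired[p.IP] = true
		}
	}
	current := make(map[string]bool, len(registered))
	for _, id := range registered {
		current[id] = true
		if !desired[id] {
			toDeregister = append(toDeregister, id)
		}
	}
	for ip := range desired {
		if !current[ip] {
			toRegister = append(toRegister, ip)
		}
	}
	sort.Strings(toRegister)
	sort.Strings(toDeregister)
	return toRegister, toDeregister
}

func main() {
	pods := []Pod{
		{Name: "colorteller-red-8c745484-sphc6", IP: "192.168.122.110"},
		{Name: "still-pending", IP: ""}, // hypothetical pod with no IP assigned
	}
	// 192.168.99.7 is a made-up stale instance left over from a deleted pod.
	reg, dereg := diffInstances(pods, []string{"192.168.99.7"})
	fmt.Println("register:", reg)
	fmt.Println("deregister:", dereg)
}
```

Computing the full difference on each pass, rather than relying only on pod add/delete events, lets a missed event be corrected on the next reconcile.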

The following App Mesh resources are created for the above flow:

  • MESH
$ aws appmesh describe-mesh \
    --mesh-name eks-mesh
{
    "mesh": {
        "meshName": "eks-mesh",
        "metadata": {
            "arn": "arn:aws:appmesh:us-west-2:1234567890:mesh/eks-mesh",
            "createdAt": 1560962735.095,
            "lastUpdatedAt": 1560962735.095,
            "uid": "7a31dd56-ef1e-44d5-b7f3-db98d109315c",
            "version": 1
        },
        "spec": {},
        "status": {
            "status": "ACTIVE"
        }
    }
}
  • VIRTUAL NODE
$ aws appmesh describe-virtual-node \
    --mesh eks-mesh \
    --virtual-node colorteller-red-color
{
    "virtualNode": {
        "meshName": "eks-mesh",
        "metadata": {
            "arn": "arn:aws:appmesh:us-west-2:1234567890:mesh/eks-mesh/virtualNode/colorteller-red-color",
            "createdAt": 1560962757.57,
            "lastUpdatedAt": 1564612052.233,
            "uid": "4e1e5288-3585-4abe-809b-5c9d9018cdf9",
            "version": 284
        },
        "spec": {
            "backends": [],
            "listeners": [
                {
                    "portMapping": {
                        "port": 9080,
                        "protocol": "http"
                    }
                }
            ],
            "serviceDiscovery": {
                "awsCloudMap": {
                    "attributes": [
                        {
                            "key": "appmesh.k8s.aws/mesh",
                            "value": "eks-mesh"
                        },
                        {
                            "key": "appmesh.k8s.aws/virtualNode",
                            "value": "colorteller-red-color"
                        }
                    ],
                    "namespaceName": "prod.svc.aws.local",
                    "roleArn": "arn:aws:iam::1234567890:role/aws-service-role/appmesh.amazonaws.com/AWSServiceRoleForAppMesh",
                    "serviceName": "colorteller"
                }
            }
        },
        "status": {
            "status": "ACTIVE"
        }
    }
}
  • VIRTUAL_ROUTER
$ aws appmesh describe-virtual-router \
    --mesh eks-mesh \
    --virtual-router colorteller-router-color
{
    "virtualRouter": {
        "meshName": "eks-mesh",
        "metadata": {
            "arn": "arn:aws:appmesh:us-west-2:1234567890:mesh/eks-mesh/virtualRouter/colorteller-router-color",
            "createdAt": 1561148137.763,
            "lastUpdatedAt": 1561148137.763,
            "uid": "5c34c86a-2dac-4b45-a99c-18389f2ea994",
            "version": 1
        },
        "spec": {
            "listeners": [
                {
                    "portMapping": {
                        "port": 9080,
                        "protocol": "http"
                    }
                }
            ]
        },
        "status": {
            "status": "ACTIVE"
        },
        "virtualRouterName": "colorteller-router-color"
    }
}
  • ROUTE
$ aws appmesh describe-route \
    --mesh eks-mesh \
    --virtual-router colorteller-router-color \
    --route color-route
{
    "route": {
        "meshName": "eks-mesh",
        "metadata": {
            "arn": "arn:aws:appmesh:us-west-2:1234567890:mesh/eks-mesh/virtualRouter/colorteller-router-color/route/color-route",
            "createdAt": 1561148137.814,
            "lastUpdatedAt": 1564613583.153,
            "uid": "a6d43813-793f-4096-900b-e5b2fb717788",
            "version": 14
        },
        "routeName": "color-route",
        "spec": {
            "httpRoute": {
                "action": {
                    "weightedTargets": [
                        {
                            "virtualNode": "colorteller-red-color",
                            "weight": 1
                        }
                    ]
                },
                "match": {
                    "prefix": "/"
                }
            }
        },
        "status": {
            "status": "ACTIVE"
        },
        "virtualRouterName": "colorteller-router-color"
    }
}
  • VIRTUAL_SERVICE
$ aws appmesh describe-virtual-service \
    --mesh eks-mesh \
    --virtual-service colorteller.prod.svc.aws.local
{
    "virtualService": {
        "meshName": "eks-mesh",
        "metadata": {
            "arn": "arn:aws:appmesh:us-west-2:1234567890:mesh/eks-mesh/virtualService/colorteller.prod.svc.aws.local",
            "createdAt": 1561148137.624,
            "lastUpdatedAt": 1561148137.911,
            "uid": "f63e9694-6b0d-4e96-80bf-996064aca29f",
            "version": 2
        },
        "spec": {
            "provider": {
                "virtualRouter": {
                    "virtualRouterName": "colorteller-router-color"
                }
            }
        },
        "status": {
            "status": "ACTIVE"
        },
        "virtualServiceName": "colorteller.prod.svc.aws.local"
    }
}

The following Cloud Map resources will be created by the controller:

  • SERVICE
$ aws servicediscovery get-service \
    --id srv-sxvjmn5we2fz6nqm
{
    "Service": {
        "Id": "srv-sxvjmn5we2fz6nqm",
        "Arn": "arn:aws:servicediscovery:us-west-2:1234567890:service/srv-sxvjmn5we2fz6nqm",
        "Name": "colorteller",
        "NamespaceId": "ns-omcos67xvs7tat4z",
        "DnsConfig": {
            "NamespaceId": "ns-omcos67xvs7tat4z",
            "RoutingPolicy": "MULTIVALUE",
            "DnsRecords": [
                {
                    "Type": "A",
                    "TTL": 300
                }
            ]
        },
        "CreateDate": 1564671683.11,
        "CreatorRequestId": "app-mesh-controller"
    }
}
  • INSTANCE(S)
$ aws servicediscovery discover-instances \
    --namespace prod.svc.aws.local \
    --service colorteller
{
    "Instances": [        
        {
            "InstanceId": "192.168.122.110",
            "NamespaceName": "prod.svc.aws.local",
            "ServiceName": "colorteller",
            "HealthStatus": "UNKNOWN",
            "Attributes": {
                "AWS_INSTANCE_IPV4": "192.168.122.110",
                "app": "colorteller",
                "appmesh.k8s.aws/mesh": "eks-mesh",
                "appmesh.k8s.aws/virtualNode": "colorteller-red-color",
                "k8s.io/namespace": "color",
                "k8s.io/pod": "colorteller-red-8c745484-sphc6",
                "pod-template-hash": "8c745484",
                "version": "red"
            }
        }
    ]
}
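
The attribute set shown on the instance above can be read as pod labels merged with a handful of controller-owned keys. The following Go sketch shows one way to build such a map. It is an assumption drawn from the sample output rather than the controller's actual code: the PodInfo type and the instanceAttributes function are hypothetical, and writing the controller-owned keys last (so a pod label cannot override them) is a design choice of this sketch.

```go
package main

import "fmt"

// PodInfo is a hypothetical, trimmed-down view of the pod fields needed here.
type PodInfo struct {
	Name, Namespace, IP string
	Labels              map[string]string
}

// instanceAttributes merges pod labels with the controller-owned keys seen
// in the sample discover-instances output. Controller keys are written last
// so that a pod label with the same name cannot override them.
func instanceAttributes(pod PodInfo, meshName, virtualNodeName string) map[string]string {
	attrs := make(map[string]string, len(pod.Labels)+5)
	for k, v := range pod.Labels {
		attrs[k] = v
	}
	attrs["AWS_INSTANCE_IPV4"] = pod.IP
	attrs["appmesh.k8s.aws/mesh"] = meshName
	attrs["appmesh.k8s.aws/virtualNode"] = virtualNodeName
	attrs["k8s.io/namespace"] = pod.Namespace
	attrs["k8s.io/pod"] = pod.Name
	return attrs
}

func main() {
	pod := PodInfo{
		Name:      "colorteller-red-8c745484-sphc6",
		Namespace: "color",
		IP:        "192.168.122.110",
		Labels: map[string]string{
			"app":               "colorteller",
			"version":           "red",
			"pod-template-hash": "8c745484",
		},
	}
	attrs := instanceAttributes(pod, "eks-mesh", "colorteller-red-color")
	fmt.Println(attrs["appmesh.k8s.aws/virtualNode"], attrs["version"])
}
```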

CRD Changes

  • CloudMapServiceDiscovery
        type CloudMapServiceDiscovery struct {
        -       CloudMapServiceName string `json:"cloudMapServiceName"`
        +       ServiceName   string            `json:"serviceName"`
        +       NamespaceName string            `json:"namespaceName"`
        +       Attributes    map[string]string `json:"attributes,omitempty"`
         }
  • VirtualNodeStatus
        type VirtualNodeStatus struct {
                // VirtualNodeArn is the AppMesh VirtualNode object's Amazon Resource Name
                // +optional
                VirtualNodeArn *string                `json:"virtualNodeArn,omitempty"`
        -       // CloudMapServiceArn is a CloudMap Service object's Amazon Resource Name
        +       Conditions     []VirtualNodeCondition `json:"conditions"`
        +       // CloudMapService is AWS CloudMap Service object's info
                // +optional
        -       CloudMapServiceArn *string `json:"cloudMapServiceArn,omitempty"`
        +       CloudMapService *CloudMapServiceStatus `json:"cloudmapService,omitempty"`
        +}
        +
        +// CloudMapServiceStatus is AWS CloudMap Service object's info
        +type CloudMapServiceStatus struct {
        +       // ServiceID is AWS CloudMap Service object's Id
                // +optional
        -       QueryParameters map[string]string      `json:"queryParameters,omitempty"`
        -       Conditions      []VirtualNodeCondition `json:"conditions"`
        +       ServiceID *string `json:"serviceId,omitempty"`
        +       // NamespaceID is AWS CloudMap Service object's namespace Id
        +       // +optional
        +       NamespaceID *string `json:"namespaceId,omitempty"`
         }
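
To see how the reworked CloudMapServiceDiscovery type maps onto the serviceDiscovery.cloudMap block from the UX section, here is a small, hedged Go sketch that serializes it to JSON. The struct is copied from the diff above; the encode helper is hypothetical and exists only for the example. Because attributes is tagged omitempty, a virtual-node that sets only serviceName and namespaceName serializes without an attributes field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CloudMapServiceDiscovery mirrors the proposed type from the diff above.
type CloudMapServiceDiscovery struct {
	ServiceName   string            `json:"serviceName"`
	NamespaceName string            `json:"namespaceName"`
	Attributes    map[string]string `json:"attributes,omitempty"`
}

// encode is a hypothetical helper that renders the type as a JSON string.
func encode(sd CloudMapServiceDiscovery) (string, error) {
	b, err := json.Marshal(sd)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(CloudMapServiceDiscovery{
		ServiceName:   "colorteller",
		NamespaceName: "prod.svc.aws.local",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"serviceName":"colorteller","namespaceName":"prod.svc.aws.local"}
}
```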

Future Work

  1. Automatic cleanup of Cloud Map services
  2. Automatic creation and management of Cloud Map namespaces
  3. Automatic registration of pod endpoints outside the scope of App Mesh
@kiranmeduri
Collaborator Author

I also noticed that weaveworks/flagger vends CRD types: https://github.com/weaveworks/flagger/blob/master/pkg/apis/appmesh/v1beta1/types.go. Those need to be updated before we sign off on this, @stefanprodan.

@stefanprodan
Collaborator

@kiranmeduri Flagger uses Kubernetes DNS when creating the virtual nodes see https://github.com/weaveworks/flagger/blob/master/pkg/router/appmesh.go#L82. Flagger can't know what Cloud Map service and namespace to use, unless the name and namespace match Kubernetes?

I think exposing this in the Canary CRD would allow users to opt in to Cloud Map and set those values:

apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
  name: frontend
  namespace: test
spec:
  service:
    # App Mesh reference
    meshName: global
    # App Mesh egress (optional) 
    backends:
      - backend.test
    # App Mesh Cloud Map discovery (optional) 
    cloudMap:
      serviceName: frontend
      serviceNamespace: prod.svc.aws.local

When using Cloud Map how would Kubernetes DNS know about it? Is there a CloudMap CoreDNS plugin?

@kiranmeduri
Collaborator Author

Flagger can continue to use native K8s DNS; however, I am worried that the CRD definitions will slowly diverge from reality. Flagger should probably depend on aws-app-mesh-controller-for-k8s to get the CRD. Beyond that, the current types in Flagger are not compatible with the App Mesh GA APIs and the AWS SDK. For example, VirtualRouter now requires a listener to be defined, and aws-sdk checks for it. So my concern about Flagger is more about conformance with App Mesh than about picking Cloud Map as a solution, though that would be a bonus and requires further design on how to use it.

CoreDNS automatically delegates resolution to the VPC resolver in AWS, and in my testing it works fine. However, users who are sure that they only want Cloud Map based service-discovery can set dnsPolicy and dnsConfig on the pod spec, e.g.

dnsConfig:
   nameservers:
   - 169.254.169.253  # default VPC dns endpoint        
dnsPolicy: None

@stefanprodan
Collaborator

For e.g. VirtualRouter now requires a listener to be defined and aws-sdk checks for it.

@kiranmeduri can you point me to the controller change log please? I can't find anywhere in this repo the listener reference as a breaking change.

@nckturner
Contributor

@kiranmeduri I like the flow for your design. I prefer this to, say, creating a new CRD per AWS resource we are creating. My reading of the proposal is that the controller will create the CloudMap service for the virtual node, but expects the namespace to exist (or perhaps there is a CRD for namespace). I do like this approach, and it's similar to what we do for virtual routers, but I think now is the time for us to think through deletion and sharing behavior for those resources. Currently, routers can be shared between virtual services, and an attempt to delete them is made when a virtual service custom resource is deleted. In this proposal, can virtual nodes share CloudMap services? It seems that it should be possible, as the instance ID (IP) within a cluster should be unique. I'm not sure if there are benefits to allowing them to be shared, or if there are any downsides or obstacles with the CloudMap service deletion workflow. For example, can we rely on a deletion to fail if another virtual node has registered instances in the service? (I think so.)

@kiranmeduri
Collaborator Author

For e.g. VirtualRouter now requires a listener to be defined and aws-sdk checks for it.

@kiranmeduri can you point me to the controller change log please? I can't find anywhere in this repo the listener reference as a breaking change.

#44 , #48

@kiranmeduri
Collaborator Author

@kiranmeduri I like the flow for your design. I prefer this to, say, creating a new CRD per AWS resource we are creating. My reading of the proposal is that the controller will create the CloudMap service for the virtual node, but expects the namespace to exist (or perhaps there is a CRD for namespace). I do like this approach, and it's similar to what we do for virtual routers, but I think now is the time for us to think through deletion and sharing behavior for those resources. Currently, routers can be shared between virtual services, and an attempt to delete them is made when a virtual service custom resource is deleted. In this proposal, can virtual nodes share CloudMap services? It seems that it should be possible, as the instance ID (IP) within a cluster should be unique. I'm not sure if there are benefits to allowing them to be shared, or if there are any downsides or obstacles with the CloudMap service deletion workflow. For example, can we rely on a deletion to fail if another virtual node has registered instances in the service? (I think so.)

Yes, virtual-nodes can and will share cloudmap services. For example, in the color-app, the blue and green tellers will share the same service, say colorteller.svc.aws.local.

DeleteService in CloudMap will fail if there are any registered instances, so that should be safe. However, after a successful delete, a create will revive the service with the same ID, and this may be undesirable. The reason is that the previous delete may not have been observed, and a subsequent delete will clobber the create that interleaved.

So in theory a reconcile loop will eventually take care of fixing the services, but we have to dive into some edge cases before we enable delete.

@nckturner
Contributor

Initial Cloud Map support has been merged.
