-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for AWS Cloud Map as service-discovery #53
Conversation
1027e64
to
444d6e3
Compare
- Updated virtual-node controller to populate CloudMap service-discovery and create service in CloudMap - New pod controller to register/deregister endpoints behind cloudmap service
AttrK8sPod = "k8s.io/pod" | ||
//AttrK8sNamespace is a custom attribute injected by app-mesh controller | ||
AttrK8sNamespace = "k8s.io/namespace" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some valuable instance (cloudmap) instance metadata would include:
- Region & AZ
- Subnet
- VPC
- Cluster Name
- EC2 instance ID
- Owners? I.e. following owner refs to get deployment/etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-discovery.html
👍 for REGION and AVAILABILITY_ZONE. This will be helpful for AZ-aware routing: aws/aws-app-mesh-roadmap#94
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they are useful. Should we look into adding them as labels to pods and let register instance take over from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see those labels being useful for pods--how does the cluster admin decide which pods get them. All pods that are injected for App Mesh? We could do this at inject time, which involves adding API calls into the pod startup path (but is probably fine), or we could do it late by the controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now we can lookup the node corresponding to pod and get that information. Below is the information from node. However, controller needs permissions to get node info. We can follow up this work in upcoming versions.
labels:
alpha.eksctl.io/cluster-name: cluster-20190821
alpha.eksctl.io/instance-id: i-08dd903c3cbe5db66
alpha.eksctl.io/nodegroup-name: m5-l-1
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/instance-type: m5.large
beta.kubernetes.io/os: linux
failure-domain.beta.kubernetes.io/region: us-west-2
failure-domain.beta.kubernetes.io/zone: us-west-2a
kubernetes.io/hostname: ip-192-168-118-179.us-west-2.compute.internal
role: workers
var meshName string | ||
var virtualNodeName string | ||
|
||
//TODO: Remove this hack and always expect proper annotations to be injected into the pod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this comment as a reminder for one of us to create an issue in inject.
c.handleServiceDiscovery(ctx, vnode, copyForUpdate) | ||
} | ||
|
||
//TODO: delete unused cloudmap services |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving comment as a reminder to discuss this/create issue.
|
||
//It is okay to call Create multiple times for same service-name. | ||
//It is also cheaper than calling get and then figuring out to create. | ||
cloudmapService, err := c.cloud.CloudMapCreateService(ctx, cloudmapConfig, c.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to handle the case where the cloudmap namespace doesn't exist? I guess its probably evident with the error returned from the cloudmap create service call, which is already logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably should but creating namespace on the fly seems bad. CloudMap namespace should be created by admin, same can be said of mesh though. For now I will add a TODO to improve the experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a situation where things don't reconcile properly, or we are ignoring my namespace for some reason. I had example items created before updating to a controller built from this branch. I updated to the new controller, modified a virtual node to specify service discovery == cloudmap. The controller prints cloudmap.go:273] No namespace found with name color-mesh
, and then even after creating the cloudmap namespace, and redeploying the controller, I'm seeing the same error.
Update: this was just me not having the correct cloudmap permissions on my nodes, which I realized after far too much debugging.
DnsConfig: &servicediscovery.DnsConfig{ | ||
NamespaceId: awssdk.String(namespaceSummary.NamespaceID), | ||
RoutingPolicy: awssdk.String(servicediscovery.RoutingPolicyMultivalue), | ||
DnsRecords: []*servicediscovery.DnsRecord{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our meeting with Cloud Map, it was recommended that if we are creating namespaces for users we should default to HTTP only and allow the user to request DNS. Although we aren't creating namespaces for users, if the namespace that the user provides is not DNS enabled, is this field ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to add this conditionally, I am getting this error:
E0903 15:44:25.848882 1 controller.go:443] error syncing 'appmesh-demo/colorteller-red': Error handling cloudmap service discovery for virtual node colorteller-red-appmesh-demo: InvalidInput: When you create a service using a namespace that has a type of HTTP, you can't include a DnsConfig element.
status code: 400, request id: 8c76f6af-1c79-4aa7-a24a-a2b2c3de0e5b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTTP namespace will not work with AppMesh because those service-names cannot be resolved by VPC DNS. In the current setup, application container still resolves the name using DNS resolver which gives an endpoint that is ignored by Envoy and further resolved. envoyproxy/envoy#6748 will help in avoiding DNS altogether in future. Until then we should be using PrivateDnsNamespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok we can call this out as its a little confusing. Arguably, though, as long as the application gets some DNS response, then does everything work fine? Whether its kube dns, route53, or some custom DNS solution the customer is running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as long as the application gets some DNS response, then does everything work fine
Yeah that is case right now. We're also looking into adding some work-around we can give customers since some people may not be allowed to create a PrivateDnsNamespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated PR with better error message when provided namespace is non DNS_PRIVATE. Marked it with a TODO to support HTTP namespaces. That should unblock this PR.
E0905 19:34:36.794398 1 controller.go:443] error syncing 'howto-k8s-cloudmap/colorapp-red': Error handling cloudmap service discovery for virtual node colorapp-red-howto-k8s-cloudmap: Cannot create service under namespace howto-k8s-cloudmap.pvt.aws.local with type HTTP, only namespaces with type DNS_PRIVATE are supported
I will fix the error handling. We should also update documentation so that users do not trip on this. Also worth to look into https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html |
Issue #, if available:
#52
Description of changes:
Add support for AWS Cloud Map as service-discovery
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.