Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Headless hostPort services #324

Merged
merged 5 commits into from
Nov 20, 2017

Conversation

Arttii
Copy link
Contributor

@Arttii Arttii commented Aug 29, 2017

A simple implementation for #315 and based on #278.

The idea is, to assign a DNS address for every Pod running under a Headless service. This is a fairly important usecase in my mind, for exposing Stateful services, which have some kind of inbuilt discovery mechanism, like Apache Kafka. As these services can only really properly work with a HostPort(I did not find any other way), it makes sense to export their HostIP as a stable DNS record.

I thought about a number of approaches to parametrize this, but the simple approach was to add a subdomain annotation. Conceptually we would want for each Pod appear with the hostname assigned to it by the StatefulSet, so we only need to concerned with the subdomain where the Pods are registered.

So I added a new annotation external-dns.alpha.kubernetes.io/headlessDomain, which is basically the subdomain, i.e .example.org. I was unsure about adding the . in code or in the annotation.

It does not address generating a SRV record and simply generates A records for every running Pod, with the HostIP.

This is my first time coding against the Kubernetes API or in GO, so any comments about structure/style and other mistakes are welcome. I am also not 100% sure the approach I took is perfect, so comments are welcome.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 29, 2017
@linki
Copy link
Member

linki commented Aug 30, 2017

@Arttii Thanks for your contribution. We'll have a look.

@jrnt30
Copy link
Contributor

jrnt30 commented Aug 30, 2017

@Arttii I was looking to tackle something similar, so thanks!

I was thinking that perhaps the existing annotation could be used and try and model the implementation closer to the K8 handling of Headless Services. With Headless services right now with K8 DNS we get a record for the service itself and also DNS records per instance of the matched pods as subdomain entries of the service's domain.

Ex. A Headless service deployed as kafka-broker with a 3-replica StatefulSet named kafka provides DNS lookups of <statefulset-name>-<offset>.<service name>:

  • kafka-0.kafka-broker.default.svc.cluster.local
  • kafka-2.kafka-broker.default.svc.cluster.local
  • kafka-3.kafka-broker.default.svc.cluster.local

(I believe this does require the serviceName to be set on the StatefulSet's spec.serviceName)

Dig for single pod (note: subdomain entry of service's fqdn)

18:03 $ dig kafka-1.kafka-broker.default.svc.cluster.local

...

;; ANSWER SECTION:
kafka-1.kafka-broker.default.svc.cluster.local. 30 IN A 10.46.0.3

...

Future:

Right now the multi-value A records wouldn't be supported by the Endpoint model, so I would recommend still skipping that. But what we could then do with the same External DNS anntoation would be to have the Multi-entry A record refelect the values that it would in K8 DNS iteslef.

Dig for the service

18:03 $ dig kafka-broker.default.svc.cluster.local

...

;; ANSWER SECTION:
kafka-broker.default.svc.cluster.local. 30 IN A 10.44.0.5
kafka-broker.default.svc.cluster.local. 30 IN A 10.46.0.3
kafka-broker.default.svc.cluster.local. 30 IN A 10.46.128.6

...

Proposed Changes:

  • Instead of using new annotation, we could reuse existing annotation to express the FQDN of the service itself and keep that logic the same. (In the future this would be a multi-A record, for now it is not created at all).
  • Use your existing logic and append the individual pods, but instead reuse the existing annotation's value

Kafka Itself...unrelated

On a completely unrelated note, we tackled the Kafka static IP problem a bit differently/hackily using a ClusterIP service per member of the statefulset. The pods themselves have an init container that introspects the hostname add adds label to itself that allows the kafka-<N> service to differentiate which of the N kafka nodes it should have a label selector for. This was our approach for "Static IPs" in K8 and has worked well for Zookeeper and Kafka clients that have notorious problems with IP caching. Email me or something if you want that info. I'll try and write a blog post about it soon-ish.

@Arttii
Copy link
Contributor Author

Arttii commented Aug 31, 2017

Hi,

Thanks for taking a look.

I'm going to rephrase to check my understanding. Do you mean we basically append the POD ids to the FQDN specified in the existing Annotation and then later also add all the POD's to the mutli-value record?

I guess we could detect the headless service based on the ClusterIP being None in that case?

EDIT. I thought about it, and I like your proposal more. I implemented it. Please take a look.

@jrnt30
Copy link
Contributor

jrnt30 commented Aug 31, 2017

Yep. You hit it directly on the head with what I was thinking.

log.Debugf("Pod %s is not in running phase", v.Spec.Hostname)
}
// Get all the Pods
if pods, err := sc.client.CoreV1().Pods(svc.Namespace).List(metav1.ListOptions{LabelSelector: labels.Set(svc.Spec.Selector).AsSelectorPreValidated().String()}); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spirit of the original request around convention and suggestion. I personally have not see this style of expression used much, where the else is referencing the result of the assignment in the guarded if statement and it took me a few seconds to figure out where that was even coming from.

Personally I find it more clear to simply make the assignment and have a small err != nil check that short circuits which eventually you get used to reading all over the place.

That's a completely arbitrary assessment though and may not be true to those who have worked and looked at larger code bases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I will take a look shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean something like this?

pods, err := sc.client.CoreV1().Pods(svc.Namespace).List(metav1.ListOptions{LabelSelector: labels.Set(svc.Spec.Selector).AsSelectorPreValidated().String()})
	// Get all the Pods
	if err != nil {
		log.Errorf("List Pods of service[%s] error:%v", svc.GetName(), err)
	} else {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have an empty return to short circuit: https://play.golang.org/p/jpRxJ19fl5

Ex.

pods, err := sc.client.CoreV1().Pods(svc.Namespace).List(metav1.ListOptions{LabelSelector: labels.Set(svc.Spec.Selector).AsSelectorPreValidated().String()}); 

if err != nil {
 	log.Errorf("List Pods of service[%s] error:%v", svc.GetName(), err)
 	return 
} 

for _, v := range pods.Items {
	headlessDomain := v.Spec.Hostname + "." + hostname
	log.Debugf("Generating matching endpoint %s with HostIP %s", headlessDomain, v.Status.HostIP)
	// To reduce traffice on the DNS API only add record for running Pods. Good Idea?
	if v.Status.Phase == v1.PodRunning {
		endpoints = append(endpoints, endpoint.NewEndpoint(headlessDomain, v.Status.HostIP, endpoint.RecordTypeA))
	} else {
		log.Debugf("Pod %s is not in running phase", v.Spec.Hostname)
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the function should return the Endpoints no? Or should we redesign that part?

@tamalsaha
Copy link
Contributor

Hi folks, any way we can help it get merged? We need this feature.

@szuecs
Copy link
Contributor

szuecs commented Nov 14, 2017

👍 @linki @ideahitme please merge

@hjacobs hjacobs merged commit cbc539f into kubernetes-sigs:master Nov 20, 2017
ffledgling pushed a commit to ffledgling/external-dns that referenced this pull request Jan 18, 2018
* Added initial support for Headless services

* service.go: Fixed some formatting

* forgot to run gometalinter

* service: implemented alternative proposal, to reuse existing annotation

* source: some refactoring
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants