
Add Support for an ExternalDNS Operator #1730

Closed
danehans opened this issue Aug 18, 2020 · 21 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@danehans

danehans commented Aug 18, 2020

What would you like to be added:
I propose adding an operator to manage ExternalDNS.

Why is this needed:
Currently, ExternalDNS is managed through a manual process (docs, manifests, etc.) or external tooling (e.g., Helm). An operator would simplify the user experience by providing declarative management of ExternalDNS. Several operators (e.g., addons, seccomp) reside in the kubernetes-sigs org, and many others (e.g., etcd-operator, prometheus-operator) exist in the same org as the applications they manage. The externaldns-operator and ExternalDNS would both benefit from residing in the same kubernetes-sigs org.

@danehans danehans added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 18, 2020
@danehans
Author

Here are a few use cases that an operator can address:

  1. As a cluster admin, I need the ability to install ExternalDNS in my Kubernetes cluster to manage Route53 DNS records. An operator can a) verify that the necessary IAM policies/roles exist and create them if needed, b) verify that the hosted zone(s) exist and create the zone(s) if needed, and c) deploy ExternalDNS (RBAC, deployment, etc.). The reverse can also be done to uninstall ExternalDNS. (A sketch of what such a declarative API could look like follows this list.)
  2. As a cluster admin, I need the ability to enforce an "approved" ExternalDNS configuration. An operator implements the Kubernetes controller design pattern to ensure the current state matches the desired state, and it can surface status conditions when the two states differ.
  3. As a cluster admin, I need to ensure that ExternalDNS is working properly. An operator can programmatically create a test environment (e.g., client/server pods, a DNS record, curl from the client to the server FQDN) on a periodic basis to validate e2e functionality.
  4. As a cluster admin, I need the ability to perform zero-downtime upgrades of ExternalDNS. An operator can implement best practices to ensure successful zero-downtime ExternalDNS upgrades in production environments.
  5. As a cluster admin, I need to reduce the potential for breaking changes. An operator can a) expose an API to reduce configuration complexity and b) be programmed not to implement a breaking change (e.g., changing an arg).
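
For illustration only, here is a minimal sketch of the kind of custom resource such an operator could reconcile. The API group, kind, and every field below are hypothetical and do not exist in ExternalDNS or any operator today:

```yaml
# Hypothetical custom resource for an externaldns-operator.
# The apiVersion, kind, and all fields are illustrative only.
apiVersion: externaldns.example.io/v1alpha1
kind: ExternalDNS
metadata:
  name: default
spec:
  provider: aws              # operator would verify/create the IAM policy and roles (use case 1a)
  zones:
    - example.org            # hosted zone(s) to manage; created if missing (use case 1b)
  sources:
    - service
    - ingress
  version: v0.7.3            # operator would handle zero-downtime upgrades (use case 4)
```

The operator would render and apply the RBAC, Deployment, and provider credentials for ExternalDNS from this single object, and surface status conditions when the observed state drifts from it (use case 2).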

@danehans
Author

The addon operator KEP and the KubeCon video provide additional background on the motivation to support an ExternalDNS operator.

@szuecs
Contributor

szuecs commented Aug 18, 2020

@danehans in which repository should the „install operator“ live?
To me this sounds like the Puppet module forge in Kubernetes, and I don't believe in „you don't have to understand what you run“.
For usability, I agree that adding/removing RBAC and cloud IAM roles can be an enhancement, but then who will configure the roles of the „install operator“?

@Raffo
Contributor

Raffo commented Aug 19, 2020

As I briefly mentioned on the external-dns slack, I disagree on the need for an ExternalDNS operator. Let me explain why.

ExternalDNS was designed from the very beginning to be simple and for upgrades to never be a concern. Since DNS is very clearly eventually consistent, any setup can tolerate brief outages of the ExternalDNS pod without problems. Also, we essentially store no state other than what is in Kubernetes already, which makes rollouts of new versions a non-issue.

I understand that rolling out different versions could bring challenges, like having to deal with possibly incompatible flags, but I think:

  1. we have done a decent job so far of making sure we don't deprecate widely used flags.
  2. the operator would have to deal with this anyway.

And here comes the question... who maintains and configures the operator then? I believe an operator would essentially just move the problem to another tool instead of solving the whole configuration problem.

I've watched the video presentation from KubeCon and read the KEP, but I think we are facing a different problem. ExternalDNS is not strictly bundled with a version of Kubernetes, and the latest versions are compatible with at least the last 3 releases of Kubernetes, which is what all cloud providers support. It's even more backward compatible than that, but this is what we make sure we can guarantee.

Moreover, I would add that, more often than operators, the problem that really needs to be solved is config management and its versioning. When ExternalDNS was started, we added it to our clusters using git as the source of truth. AFAIK it is still maintained like this and it still works well (@szuecs can contradict me if I'm wrong).
Config management being something that every company needs to solve (for Terraform, CloudFormation, Puppet, or pretty much anything else that deals with configuration), I think it implicitly solves the problems of rolling out ExternalDNS.

Last but not least, there is already a Helm chart available for this project, and we recently added kustomize support, which can help anyone get started.

Those ☝️ are the reasons why I think we don't need an operator for this project. I think it would introduce complexity rather than simplify things.
That said, I am not opposed to someone writing that code (just not in this repo, for maintainability reasons): it could prove me completely wrong, it could turn out to be useful in some cases, it might serve some companies' interests. And this is why open source is great: we can have different opinions and different implementations, and learn from them.

I hope this clarifies 😃

@danehans
Author

For usability, I agree that adding/removing RBAC and cloud IAM roles can be an enhancement, but then who will configure the roles of the „install operator“?

@szuecs yes, an admin will need to kubectl apply -f /operator/config, so the RBAC setup is not much different between the two. However, the operator can still provide value with the other pieces of use case 1.
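
For a rough sense of scale, the one-time bootstrap that such an /operator/config could contain might be as small as a ServiceAccount plus a ClusterRoleBinding; the names and referenced ClusterRole below are hypothetical placeholders:

```yaml
# Hypothetical bootstrap manifests for an externaldns-operator; all names are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: externaldns-operator
  namespace: externaldns-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: externaldns-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: externaldns-operator   # a ClusterRole granting rights to manage ExternalDNS RBAC, Deployments, etc.
subjects:
  - kind: ServiceAccount
    name: externaldns-operator
    namespace: externaldns-operator
```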

@Raffo thanks for the detailed explanation of your views. The management space will continue to have different tools (Helm, kustomize, operators, etc.) that compete with or complement one another, each with its own pros and cons. I don't see how ExternalDNS by itself addresses each of the use cases I describe above. The other tools that you reference may be able to support some of them, but I think the space will keep room for multiple approaches. I'll start the project in my repo and provide an update to the community when it achieves the above use cases.

@szuecs
Contributor

szuecs commented Aug 20, 2020

@Raffo yes for us it just works like that.

I generally completely agree with @Raffo.

@danehans the operator could provide additional value if you split responsibility within a cluster across different external-dns instances, perhaps targeting different providers. I don't know what other users have, but from Slack, many people try to configure AWS cross-account setups or use a 3rd-party provider. So I do see some cases that could fit into your operator and would provide additional value.

@danehans
Author

split responsibility within a cluster across different external-dns instances

@szuecs would an example be separate edns instances for managing public and private zones?

@szuecs
Contributor

szuecs commented Aug 22, 2020

Yes, for example, but you could also bind a zone to a namespace, IIRC.
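
For reference, this kind of split is already expressible with existing ExternalDNS flags; below is a sketch of the container args for two instances, one per zone type, where the domains, owner IDs, and namespace are placeholders. An operator could own the lifecycle of several such instances from one API, which is where the additional value described above would come from.

```yaml
# Container args for two ExternalDNS instances splitting responsibility by zone type.
# Domains, owner IDs, and the namespace are placeholders.
# Instance 1: public hosted zone only
args:
  - --source=service
  - --source=ingress
  - --provider=aws
  - --aws-zone-type=public
  - --domain-filter=example.org
  - --txt-owner-id=external-dns-public
---
# Instance 2: private hosted zone, limited to a single namespace
args:
  - --source=service
  - --provider=aws
  - --aws-zone-type=private
  - --domain-filter=internal.example.org
  - --namespace=team-a
  - --txt-owner-id=external-dns-private
```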

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 20, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 20, 2020
@raelga
Member

raelga commented Jan 7, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 7, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2021
@raelga
Member

raelga commented Apr 8, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2021
@sgreene570
Contributor

The OpenShift Network Edge team is proceeding with a preliminary ExternalDNS operator design outlined via an OpenShift enhancement, for anyone who may be interested. We hope to prove that an ExternalDNS operator would be worthwhile, and would ultimately like to gain some community buy-in.

@sgreene570
Contributor

Also, it's worth mentioning #1961, which describes how ExternalDNS could be turned into an "operator" in itself.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2021
@raelga
Member

raelga commented Sep 23, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 21, 2022
@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 21, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
