
Upgrading to CoreDNS does not remove kube-dns deployments #6318

Closed · loshz opened this issue Jan 9, 2019 · 12 comments
loshz commented Jan 9, 2019

1. What kops version are you running? The command kops version will display this information.

Version 1.11.0 (git-2c2042465)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.11.6

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

kops edit cluster

Add CoreDNS provider:

kubeDNS:
  provider: CoreDNS

Update cluster:

kops update cluster --yes

Rolling update:

kops rolling-update cluster --yes

5. What happened after the commands executed?

Everything ran successfully. However, I still see kube-dns deployments:

$ kubectl get deployments -n=kube-system
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
coredns                2         2         2            2           1d
dns-controller         1         1         1            1           1y
kube-dns               2         2         2            2           1y
kube-dns-autoscaler    1         1         1            1           1y
...

6. What did you expect to happen?

I expected kube-dns to be replaced by CoreDNS.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2017-12-19T15:09:49Z
  name: [REDACTED]
spec:
  additionalPolicies:
    node: |
      [{"Effect": "Allow","Action": ["route53:ChangeResourceRecordSets"],"Resource": ["arn:aws:route53:::hostedzone/*"]},{"Effect": "Allow","Action": ["route53:ListHostedZones","route53:ListResourceRecordSets"],"Resource": ["*"]}]
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: [REDACTED]
  docker:
    version: 18.06.1
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-eu-central-1b
      name: b
    name: main
  - etcdMembers:
    - instanceGroup: master-eu-central-1b
      name: b
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    oidcClientID: [REDACTED]
    oidcIssuerURL: https://accounts.google.com
    oidcUsernameClaim: email
  kubeDNS:
    provider: CoreDNS
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - [REDACTED]
  kubernetesVersion: 1.11.6
  masterInternalName: [REDACTED]
  masterPublicName: [REDACTED]
  networkCIDR: [REDACTED]
  networking:
    weave:
      mtu: 8912
  nonMasqueradeCIDR: [REDACTED]
  sshAccess:
  - [REDACTED]
  subnets:
  - cidr: [REDACTED]
    name: eu-central-1b
    type: Private
    zone: eu-central-1b
  - cidr: [REDACTED]
    name: utility-eu-central-1b
    type: Utility
    zone: eu-central-1b
  - cidr: [REDACTED]
    name: public-eu-central-1b
    type: Public
    zone: eu-central-1b
  topology:
    bastion:
      bastionPublicName: [REDACTED]
    dns:
      type: Public
    masters: private
    nodes: private

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

Am I correct in thinking the kube-dns deployments should have been deleted?
Is it safe to manually delete the deployments?

joshbranham (Contributor) commented Jan 12, 2019

From what I can gather, you can remove the deployments. The kube-dns service used by the cluster has the label selector k8s-app: kube-dns, which matches the label on the CoreDNS pods, so you are effectively running both DNS implementations behind the same service at the same time, which is fine. If you are migrating and testing, you could scale the kube-dns and kube-dns-autoscaler deployments to 0 until you are ready to delete them, as sketched below.
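
For reference, a minimal sketch of that approach (the resource names assume the default kube-system layout shown above, and the selector output format varies by kubectl version):

$ kubectl -n kube-system get service kube-dns -o jsonpath='{.spec.selector}'
$ kubectl -n kube-system get pods -l k8s-app=kube-dns

$ kubectl -n kube-system scale deployment kube-dns-autoscaler --replicas=0
$ kubectl -n kube-system scale deployment kube-dns --replicas=0

Scaling the autoscaler down first matters; otherwise it will immediately scale kube-dns back up.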

With that said, I am unsure whether deleting that deployment after switching to CoreDNS is within the scope of this project. My guess is not, but I'm curious what other people think.

loshz (Author) commented Jan 15, 2019

I can confirm that DNS still works as expected after scaling down kube-dns and kube-dns-autoscaler.
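
For anyone else testing this, one quick check from inside the cluster (a sketch; the pod name is arbitrary, and busybox:1.28 is chosen because nslookup is known to be broken in some newer busybox images):

$ kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default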

I am also unsure if it's within the scope of this project to delete the deployments, so I'll leave this open for now and see what others think.

joshbranham (Contributor) commented:

I see two options to make this easier:

  1. Update the documentation for switching to CoreDNS to state the above findings.
  2. Update kops to print a note to the user when the kubeDNS provider field is changed.

I think option 1 is better, rather than adding lots of conditional printing for various cases in the code.

integrii (Contributor) commented:

I agree with @joshbranham: the upgrade process for CoreDNS is largely undocumented. This is fine when simply deploying a new cluster, but people running production systems would appreciate more information on how the CoreDNS upgrade behaves.

It sounds like this is as simple as adding something to the docs that says:

If you are upgrading to CoreDNS, the kube-dns deployments will be left in place and must be removed manually.

jmthvt (Contributor) commented Jan 17, 2019

I read everywhere that we should scale kube-dns-autoscaler to 0 or remove it. Doesn't that imply that CoreDNS can't autoscale? Shouldn't we configure the autoscaler to scale coredns instead?

cristian-marin-hs commented:

@jmthvt Yes, you can point the kube-dns-autoscaler at the coredns deployment. You just need to edit the autoscaler container command to use --target=Deployment/coredns instead of --target=Deployment/kube-dns; see the sketch below.
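
A sketch of that edit (the surrounding flags follow the stock cluster-proportional-autoscaler manifest and may differ in your cluster):

$ kubectl -n kube-system edit deployment kube-dns-autoscaler

Then change the target in the container command:

command:
- /cluster-proportional-autoscaler
- --namespace=kube-system
- --configmap=kube-dns-autoscaler
- --target=Deployment/coredns
- --logtostderr=true
- --v=2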

loshz (Author) commented Jan 17, 2019

The coredns/deployment repo also has a good document on scaling: https://github.com/coredns/deployment/blob/master/kubernetes/Scaling_CoreDNS.md

leoskyrocker commented Feb 27, 2020

Shouldn't we configure the dns-autoscaler to scale coredns instead?

@jmthvt After editing our kops cluster to use CoreDNS, I observed that it already created an autoscaler deployment called coredns-autoscaler, in addition to the pre-existing kube-dns-autoscaler. So there appears to be no need to modify the existing autoscaler; it can simply be removed.
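
For example (a sketch; verify the deployment names in your cluster first, and confirm coredns-autoscaler is present before removing the old one):

$ kubectl -n kube-system get deployments
$ kubectl -n kube-system delete deployment kube-dns-autoscaler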

demisx commented Apr 30, 2021

What about the kube-dns service and the rest? Should these be manually removed as well when upgrading to kops v1.20?

kube-system       service/kube-dns                                         ClusterIP      100.64.0.10      <none>     53/UDP,53/TCP,9153/TCP         353d
kube-system       configmap/kube-dns-autoscaler                              1      353d
kube-system       secret/kube-dns-autoscaler-token-8tlgs               kubernetes.io/service-account-token   3      353d
kube-system       secret/kube-dns-token-rcs8z                          kubernetes.io/service-account-token   3      353d
kube-system       serviceaccount/kube-dns                                    1         353d
kube-system       serviceaccount/kube-dns-autoscaler                         1         353d

dmcnaught (Contributor) commented May 1, 2021

See the note at the bottom of this section of the docs (saying the service should be left in place): https://github.com/kubernetes/kops/blob/master/docs/cluster_spec.md#kubedns
I found that to be true when doing this upgrade: removing the service caused DNS to fail.
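
One way to convince yourself the service can stay (a sketch; the names match the output shown earlier in this thread) is to check that its endpoints now point at the coredns pods:

$ kubectl -n kube-system get endpoints kube-dns
$ kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide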

demisx commented May 1, 2021

@dmcnaught Thank you for pointing that out. I'll leave the service in place, then. I also see the doc still says:

If you would like to continue autoscaling, update the kube-dns-autoscaler Deployment container command for --target=Deployment/kube-dns to be --target=Deployment/coredns.

I've deleted the kube-dns deployments altogether with:

kubectl delete deploy/kube-dns -n kube-system
kubectl delete deploy/kube-dns-autoscaler -n kube-system

I did not update kube-dns-autoscaler since there is already a coredns-autoscaler running; I assumed one replaces the other.

$ kubectl get deploy -n kube-system
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
calico-kube-controllers     1/1     1            1           464d
coredns                     2/2     2            2           152m
coredns-autoscaler          1/1     1            1           152m
...

Is this OK? So far, I haven't noticed any issues with the cluster.

joshbranham (Contributor) commented:

Yeah, back when these docs were written CoreDNS did not have a standalone autoscaler deployment. We should likely update them to reflect the current reality of just deleting the kube-dns deployments.
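
In other words, on clusters where kops has already created a coredns-autoscaler, the documented cleanup would likely reduce to something like this (leaving service/kube-dns in place, per the note above):

$ kubectl -n kube-system delete deployment kube-dns kube-dns-autoscaler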
