
Use kubenet instead of weavenet for GCP provider #107

Merged: 11 commits into kubernetes-sigs:master from kcoronado:kubenet, Apr 30, 2018

Conversation

@kcoronado (Contributor) commented Apr 23, 2018:

What this PR does / why we need it:
This PR modifies the GCP provider to create clusters that use kubenet instead of weave. Clusters were not being created successfully (the create command would complete and spin up a master, but no nodes), so this change makes a working master and node(s) come up.

Specific changes:

  • Add the default service account to new master VM instances; the Kubernetes GCE cloud provider code needs it to call the compute API.
  • Allow IP forwarding to enable pod-to-pod communication (a rough gcloud equivalent of these instance settings is sketched after this list).
  • Add a function to the startup scripts that prevents Docker from starting on installation, so we can configure DOCKER_OPTS and then start it manually. This is needed to keep Docker from modifying the iptables rules.
  • Create and set bridge-nf-call-iptables to 1 to pass the kubeadm preflight check. Everything came up fine when the warnings were ignored, but this workaround makes sure we don't accidentally ignore important warnings/errors in the future.
  • Remove the --pod-cidr flag from KUBELET_NETWORK_ARGS to fix an issue with conflicting IPs being assigned to pods.
  • Update machines.yaml.template to use the kubenet configs in machine_setup_configs.yaml.
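For illustration only, here is a rough gcloud equivalent of the instance settings the provider now applies through the compute API; the instance names, machine type, and scopes below are placeholders, not values taken from this PR:

```bash
# Master: default compute service account attached, IP forwarding enabled.
gcloud compute instances create "${CLUSTER_NAME}-master" \
  --machine-type=n1-standard-2 \
  --can-ip-forward \
  --tags=https-server \
  --service-account="${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
  --scopes=cloud-platform

# Worker: IP forwarding and the cluster-specific worker tag, no service account.
gcloud compute instances create "${CLUSTER_NAME}-node-1" \
  --machine-type=n1-standard-2 \
  --can-ip-forward \
  --tags=https-server,"${CLUSTER_NAME}-worker" \
  --no-service-account --no-scopes
```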

Which issue(s) this PR fixes (will close the issue(s) when the PR is merged):
#81
Fixes #83
Fixes #115

Special notes for your reviewer:

Release note:

NONE

@kubernetes/kube-deploy-reviewers

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 23, 2018
@k8s-ci-robot (Contributor):

Hi @kcoronado. Thanks for your PR.

I'm waiting for a kubernetes or kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 23, 2018
@kcoronado (Contributor, Author):

/assign @krousey @jessicaochen

@kcoronado (Contributor, Author):

/cc @maisem

@k8s-ci-robot (Contributor):

@kcoronado: GitHub didn't allow me to request PR reviews from the following users: maisem.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @maisem

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jessicaochen (Contributor) left a comment:

Mostly questions to understand the what and why of this change.

If the create cluster command now fails, does that mean the config map PR broke cluster creation? If so, I would want to bump an issue tracking having a presubmit that actually creates a cluster.

@@ -65,6 +65,10 @@ func (gce *GCEClient) CreateMachineControllerServiceAccount(cluster *clusterv1.C
```go
if err != nil {
	return fmt.Errorf("couldn't grant permissions to service account: %v", err)
}
err = run("gcloud", "projects", "add-iam-policy-binding", project, "--member=serviceAccount:"+email, "--role=roles/iam.serviceAccountActor")
```
Contributor:

I need a bit of context. Why does the machine controller service account need to impersonate other accounts?

Contributor (Author):

Once we got the master set up, there was an error message in the machine controller saying it needed access to the default service account when it was trying to create the node, so I added it here.

Contributor:

I am also confused by this. Why does the machine controller service account need to be able to impersonate other service accounts? Shouldn't it just need the appropriate scopes for what it does, and that's it? Or is this the permission that lets it create instances with service accounts?

Also, iam.serviceAccountActor has been deprecated: https://cloud.google.com/iam/docs/service-accounts#the_service_account_actor_role

Contributor (Author):

Yeah, it looks like this is the permission to create instances with service accounts. Since I removed the service account from new worker nodes, I can get rid of this role. This assumes we aren't going to have the machine controller spin up new masters, though.
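For reference, Google's deprecation notice points to roles/iam.serviceAccountUser as the successor role for attaching service accounts to resources; if the binding were kept, a non-deprecated version would look roughly like this, with $PROJECT and $EMAIL standing in for the values computed in the hunk above:

```bash
# Hypothetical replacement for the deprecated serviceAccountActor binding.
gcloud projects add-iam-policy-binding "$PROJECT" \
  --member="serviceAccount:${EMAIL}" \
  --role="roles/iam.serviceAccountUser"
```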

```go
	},
	Labels: labels,
	ServiceAccounts: []*compute.ServiceAccount{
```
Contributor:

Need some context: why do we need a specified service account now?

Contributor (Author):

There was some error in the API server when spinning up the cluster saying it couldn't fetch the initial token and couldn't find the default service account. I don't remember the exact message, but it was preventing the API server from starting. I think @mkjelland was working on replacing this default service account with something more specific, though.

Contributor:

I don't think every node needs this account, just the masters, since they are the ones that will be making GCE API calls.

Contributor:

I think we need the service account because the controller manager will be using the GCE API to handle load balancers and pod CIDR routing.

Contributor (Author):

I added a check so only masters get the service account added.

@@ -256,10 +256,15 @@ func (gce *GCEClient) Create(cluster *clusterv1.Cluster, machine *clusterv1.Mach
```go
if gce.machineClient == nil {
	labels[BootstrapLabelKey] = "true"
}
tags := []string{"https-server"}
if !util.IsMaster(machine) {
	tags = append(tags, fmt.Sprintf("%s-worker", cluster.Name))
```
Contributor:

What uses this %s-worker tag?

Contributor (Author):

The nodes need the %s-worker tag because the firewall rules that GCP creates for LoadBalancers are applied to the node tag (which is specified in the /etc/kubernetes/cloud-config file).

Contributor:

How come we apply this tag to nodes via GCE but to masters via the startup script? Why can't nodes and masters use the same method of getting the tag applied?

Contributor (Author):

Both nodes and masters get tags applied via GCE. They both get the https-server tag, but only nodes get %s-worker. The master startup script creates a cloud-config file where it specifies node-tags as %s-worker, but that doesn't add the tag to the master itself. I think the cloud-config is just how GCP knows how to create the firewall rules and maybe do some other networking things; I'm not 100% sure.
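To make the node-tags reference concrete, here is a hypothetical sketch of the cloud-config the master startup script writes; the key names follow the GCE cloud provider's gcfg format, and the values are placeholders, not lines from this PR:

```bash
# Hypothetical: write the GCE cloud provider config that the kubelet and
# controller manager read via --cloud-config (values are placeholders).
cat > /etc/kubernetes/cloud-config <<EOF
[global]
project-id = ${PROJECT}
network-name = default
node-tags = ${CLUSTER_NAME}-worker
EOF
```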

Contributor:

If nodeTags represents the machines that should have the firewall opened for load balancers, the master machine should be part of that set. In open source, masters are schedulable nodes.

https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/gce/gce.go#L128
```go
nodeTags []string // List of tags to use on firewall rules for load balancers
```

https://github.com/kubernetes/kubernetes/blob/f49f799dbda6ce47d5d5709b73bede68d3ccde0f/pkg/cloudprovider/providers/gce/gce_loadbalancer_external.go#L952
```go
TargetTags: hostTags,
Allowed: []*compute.FirewallAllowed{
	{
		// TODO: Make this more generic. Currently this method is only
		// used to create firewall rules for loadbalancers, which have
		// exactly one protocol, so we can never end up with a list of
		// mixed TCP and UDP ports. It should be possible to use a
		// single firewall rule for both a TCP and UDP lb.
		IPProtocol: strings.ToLower(string(ports[0].Protocol)),
		Ports:      allowedPorts,
	},
```

Contributor (Author):

I'll add the tag to the master. This change will be moved to a new PR, though.

@kcoronado (Contributor, Author) left a comment:

To answer your question: I think the config map PR broke cluster creation. It has been a while since I started working on that, so I don't totally remember, but before that PR I don't think I was ever able to spin up a node after the master was created. So the cluster create command executed successfully, but the cluster wasn't completely functional. I think it would be a good idea to have a presubmit for creating a cluster.


@pipejakob (Contributor):

To expand on @jessicaochen's questions: this PR's title and description state one objective, without much context, namely switching the network provider from weave-net to kubenet. Looking at the diff, though, the actual changes are at least:

  • Add dynamic tags to GCE worker instances
  • Enable IpForwarding on all GCE instances
  • Add a ServiceAccount to all GCE instances
  • Grant impersonation permissions to GCE machine-controller service account
  • Fix cluster creation after #104 (Handling configurable machine setup/installation) broke it
  • ... many more subtle changes?

There is a lot bundled up in this single PR! I think it would make the changes much easier to reason about if they were broken up into smaller, single-purpose diffs with clear rationales for why the changes are desirable.

It also sounds like the very first PR should probably contain just the changes needed to fix any regressions introduced by #104, which probably does not require changing the network provider at all. Does that sound reasonable?

echo "exit 101" > /usr/sbin/policy-rc.d
chmod +x /usr/sbin/policy-rc.d
trap "rm /usr/sbin/policy-rc.d" RETURN
apt install -y docker.io
Contributor:

Unless there's an alias I didn't notice, this looks broken. Shouldn't the command be apt-get, not apt?

Contributor:

Ah, @krousey schooled me. Apparently apt does work now and I am just an old person clinging to the past. It might be worthwhile to standardize on one or the other, though, because this file now uses a mixture of both.

Contributor (Author):

Good point, I changed it to apt-get.
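Putting the thread together, here is a minimal sketch of the install-then-configure pattern being discussed; the DOCKER_OPTS flags and the /etc/default/docker path are assumptions for illustration, not lines taken from this PR:

```bash
function install_configure_docker () {
    # Forbid service starts during package installation: invoke-rc.d reads
    # exit code 101 from policy-rc.d as "action forbidden".
    echo "exit 101" > /usr/sbin/policy-rc.d
    chmod +x /usr/sbin/policy-rc.d
    trap "rm /usr/sbin/policy-rc.d" RETURN
    apt-get install -y docker.io

    # With the daemon still stopped, set its flags before the first start so
    # Docker never programs its own iptables rules (flags are assumed here).
    echo 'DOCKER_OPTS="--iptables=false --ip-masq=false"' > /etc/default/docker
    systemctl start docker
}
```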


```bash
    apt-transport-https \
    cloud-utils \
    prips

function install_configure_docker () {
```
Contributor:

Wouldn't similar changes be required for the nodes?

Contributor (Author):

I added the change in for the nodes.

@kcoronado (Contributor, Author):

@pipejakob it turns out cluster creation was not as broken with weavenet as I thought. It creates the master and says it creates the node, but the node doesn't exist; the command runs to completion, though. I double-checked what was happening before #104 was merged, and this is the same behavior as before, so there weren't any regressions (I'll update the description). I must have just messed things up myself when I was testing.

In terms of all the changes, I agree there are several things going on. I can split out the dynamic tags for GCE workers, since that adds load-balancing functionality and doesn't affect whether the cluster starts successfully. I don't think it makes as much sense to break up the rest of the changes, since they are all needed to fix errors when starting a cluster with kubenet. I listed the error/reason for most of the changes in the first commit in this PR, but I could move them to the PR description for better visibility. Would that work?

- add a worker tag to worker nodes so firewall rules created for load balancers will be applied to the worker nodes.
- add the default service account to new VM instances.
- add the serviceAccountActor role to the machine controller service account so the master can spin up new nodes.
- add a function to the startup scripts to prevent docker from starting on installation, so we can configure DOCKER_OPTS and then start it manually. This is needed to prevent it from messing with the iptables rules.
- replace the flag that ignored preflight errors with a workaround that creates the missing file that triggered the error. The cluster was functioning when all preflight errors were ignored, but in case another error comes up in the future, we don't want to accidentally ignore it.

These changes will be in a separate PR.
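The preflight workaround mentioned above presumably boils down to something like the following; the module name and proc path are standard Linux, but the exact lines in the PR's startup scripts may differ:

```bash
# kubeadm's preflight check reads /proc/sys/net/bridge/bridge-nf-call-iptables,
# which only exists once the br_netfilter module is loaded.
modprobe br_netfilter
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
```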
@kcoronado (Contributor, Author) left a comment:

I addressed all the comments. I'll update the PR description with what changes were added and why.


@@ -266,10 +266,21 @@ func (gce *GCEClient) Create(cluster *clusterv1.Cluster, machine *clusterv1.Mach
```go
if gce.machineClient == nil {
	labels[BootstrapLabelKey] = "true"
}
serviceAccounts := []*compute.ServiceAccount{nil}
if util.IsMaster(machine) {
```
Contributor:

I'm assuming the service account is needed for the machine controller, and the assumption here is that the machine controller only runs on the master. Could we call out that assumption in a comment?

Contributor:

It's actually needed for Kubernetes' cloud-provider-specific code. Kubernetes needs to be able to read the compute API for faster node deletion, create advanced routes for pod CIDRs on nodes, and so on.
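That is also why only masters need the account: with kubenet there is no overlay network, so the controller manager programs pod-CIDR routes and load-balancer firewall rules through the compute API. Illustrative kube-controller-manager flags for that mode (assumed for this sketch, not quoted from the PR):

```bash
kube-controller-manager \
  --cloud-provider=gce \
  --cloud-config=/etc/kubernetes/cloud-config \
  --allocate-node-cidrs=true \
  --configure-cloud-routes=true \
  --cluster-cidr=10.244.0.0/16   # pod CIDR range; placeholder value
```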

@krousey (Contributor) commented Apr 30, 2018:

lgtm from me after you add the comment @jessicaochen wanted.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Apr 30, 2018
@jessicaochen (Contributor) left a comment:

lgtm.

/ok-to-test
/lgtm
/hold

Remove the hold after you have done a final E2E validation.

@jessicaochen (Contributor):

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 30, 2018
@kcoronado (Contributor, Author):

Rebuilt everything and created a cluster successfully.
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 30, 2018
@jessicaochen (Contributor):

/approved

@jessicaochen (Contributor):

/approve

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jessicaochen, kcoronado

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of the required OWNERS files.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2018
@k8s-ci-robot k8s-ci-robot merged commit ec53b5b into kubernetes-sigs:master Apr 30, 2018
@kcoronado kcoronado deleted the kubenet branch April 30, 2018 19:57