Skip to content
This repository has been archived by the owner on May 22, 2020. It is now read-only.

Handling configurable machine setup/installation #664

Closed
wants to merge 21 commits into from

Conversation

kcoronado
Copy link
Contributor

What this PR does / why we need it:
This PR modifies the GCE actuator to handle configurable Kubernetes versions, OS image, and startup scripts when creating clusters and starting up machines. It creates a new machine setup configmap that contains the different configurations that the GCE actuator can handle.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
#637

Special notes for your reviewer:

Release note:


@kubernetes/kube-deploy-reviewers

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 3, 2018
@kcoronado kcoronado changed the title Installation Handling configurable machine setup/installation Apr 3, 2018
@kcoronado
Copy link
Contributor Author

/assign @krousey @jessicaochen

@kcoronado
Copy link
Contributor Author

@k8s-ci-robot
Copy link
Contributor

@kcoronado: GitHub didn't allow me to request PR reviews from the following users: k4leung4.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @rsdcastro @karan @medinatiger @k4leung4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jessicaochen
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 4, 2018
@@ -104,6 +106,14 @@ func NewMachineActuator(kubeadmToken string, machineClient client.MachineInterfa
}
}

var configWatch *machinesetup.ConfigWatch
if configListPath != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check is necessary. machinesetup.NewConfigWatch should return an error if this is not a valid path to a file and this function should error-out approrpiately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create is the only command that has to pass a config list path, and then add and delete pass an empty string to the method. I put this check here to handle the empty string from add and delete, and create has a check to prevent an empty string. NewConfigWatch() checks that the path is valid. I can move the empty string check to NewConfigWatch() so it just returns nil (but no error) if it gets an empty string if that makes more sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Can we add TODOs to remove this when we switch to the new bootstrapping method. In that setup, we will be running a full-blown controller that will always need a config path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So would add/delete also need to pass the config path when we switch over (and I should add TODOs there too)?

@@ -42,6 +42,7 @@ var generateCmd = &cobra.Command{
},
}

// TODO(kcoronado): take a script file as a parameter instead of generating the preloaded script from the template.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove TODO?

@@ -153,20 +164,41 @@ func (gce *GCEClient) Create(cluster *clusterv1.Cluster, machine *clusterv1.Mach
return errors.New("invalid master configuration: missing Machine.Spec.Versions.Kubelet")
}

image, preloaded := gce.getImage(machine, config)
if gce.configWatch == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this every really happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the checks in the create command and NewMachineActuator() should prevent this. I'll get rid of it.

)

type ConfigWatch struct {
Path string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this field should be public.

Cluster: cluster,
Machine: machine,
Preloaded: preloaded,
metadataParams{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we switch to just passing arguments directly as parameters. I think it would be better for the function to be explicit about which fields of this struct it's going to need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I switched it.


// A full image path must match the regex format. If it doesn't, we'll assume it's just the image name and try to get it.
// If that doesn't work, we will fall back to a default base image.
matches := regexp.MustCompile("projects/(.+)/global/images/(family/)*(.+)").FindStringSubmatch(img)
if matches == nil {
// Only the image name was specified in config, so check if it is preloaded in the project specified in config.
// Only the image name was specified in config, so check if it exists in the project specified in config.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can say that image path needs to be fully specified in the config now. And we can drop the project parameter as it's only needed for when the full path wasn't specified.

clustercommon "k8s.io/kube-deploy/cluster-api/pkg/apis/cluster/common"
clusterv1 "k8s.io/kube-deploy/cluster-api/pkg/apis/cluster/v1alpha1"
"k8s.io/kube-deploy/cluster-api/util"
"os"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These imports should be grouped differently. See https://github.com/golang/go/wiki/CodeReviewComments#imports

// map to the given Image and Metadata.
Params []ConfigParams `json:"machineParams"`

// This can either be a full projects path to an image/family,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have this be full path.

@@ -0,0 +1,118 @@
package machinesetup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs copyright blurb

@@ -0,0 +1,243 @@
package machinesetup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs copyright blurb

@@ -0,0 +1,117 @@
/*
Copyright 2017 The Kubernetes Authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
export VERSION=v${KUBELET_VERSION}
export ARCH=amd64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two seem unnecessary. If the startup script needs them, it could be there.

set -x

(
apt-get update
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spacing nit

@krousey
Copy link
Contributor

krousey commented Apr 4, 2018

Also, @maisem, could you tell @kcoronado how to regenerate the provider config code for this change?

}
foundRoles := true
for _, role := range params.Roles {
if !util.RoleContains(role, validParams.Roles) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ensures the matched config has at least all the roles in the param. However, it still matches even if the config has additional roles the param did not have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps cover this in your unit tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a case to the test for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should be more clear. Given my understanding of the design, this is probably a bug. As per the design, the config and the params should have the exact same roles. This is a case where there are different roles and yet the match still happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally planned to allow the configs to have more roles than the params because I intended for the config to support any combination of roles that it lists there. It would consolidate the configs so if you had multiple roles that can all use the same script, you wouldn't need a config for each combination. But I thought about it some more and it's kind of hard to reason about this situation since we only have a master and node role. I switched it to ensure it's an exact match to avoid confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this work is validParams.Roles was Role1, Role2 and params.Roles was Role1, Role1? There should be validation against this at higher levels, but possibly also guard against this here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the check to account for this and added another case to the test.

params.PodCIDR = getSubnet(params.Cluster.Spec.ClusterNetwork.Pods)
params.ServiceCIDR = getSubnet(params.Cluster.Spec.ClusterNetwork.Services)
params.MasterEndpoint = getEndpoint(&params.Cluster.Status.APIEndpoints[0])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)
Perhaps the getEndpoint method should take a clusterv1.APIEndpoint instead of a *clusterv1.APIEndpoint?


return d.CreateCluster(cluster, machines)
}
func init() {
createCmd.Flags().StringVarP(&co.Cluster, "cluster", "c", "", "cluster yaml file")
createCmd.Flags().StringVarP(&co.Machine, "machines", "m", "", "machine yaml file")
createCmd.Flags().StringVarP(&co.MachineSetup, "setup", "s", "", "machine setup configs yaml file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)
I would keep it as machinesetup or machinessetupconfig. setup by itself is a bit confusing.


return d.CreateCluster(cluster, machines)
}
func init() {
createCmd.Flags().StringVarP(&co.Cluster, "cluster", "c", "", "cluster yaml file")
createCmd.Flags().StringVarP(&co.Machine, "machines", "m", "", "machine yaml file")
createCmd.Flags().StringVarP(&co.MachineSetup, "setup", "s", "", "machine setup configs yaml file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we default to machine_setup_configs.yaml ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but there's no default for --machines or --cluster. Is that ok?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kcoronado
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: jessicaochen

Assign the PR to them by writing /assign @jessicaochen in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor Author

@kcoronado kcoronado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I addressed all the comments.

@@ -153,20 +164,41 @@ func (gce *GCEClient) Create(cluster *clusterv1.Cluster, machine *clusterv1.Mach
return errors.New("invalid master configuration: missing Machine.Spec.Versions.Kubelet")
}

image, preloaded := gce.getImage(machine, config)
if gce.configWatch == nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the checks in the create command and NewMachineActuator() should prevent this. I'll get rid of it.

Cluster: cluster,
Machine: machine,
Preloaded: preloaded,
metadataParams{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I switched it.

}
foundRoles := true
for _, role := range params.Roles {
if !util.RoleContains(role, validParams.Roles) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a case to the test for it.


return d.CreateCluster(cluster, machines)
}
func init() {
createCmd.Flags().StringVarP(&co.Cluster, "cluster", "c", "", "cluster yaml file")
createCmd.Flags().StringVarP(&co.Machine, "machines", "m", "", "machine yaml file")
createCmd.Flags().StringVarP(&co.MachineSetup, "setup", "s", "", "machine setup configs yaml file")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but there's no default for --machines or --cluster. Is that ok?

@@ -104,6 +106,14 @@ func NewMachineActuator(kubeadmToken string, machineClient client.MachineInterfa
}
}

var configWatch *machinesetup.ConfigWatch
if configListPath != "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So would add/delete also need to pass the config path when we switch over (and I should add TODOs there too)?


KUBELET_VERSION={{ .Machine.Spec.Versions.Kubelet }}
VERSION=v${KUBELET_VERSION}
ARCH=amd64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that ARCH should be in this. This probably belongs in the install scripts that need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to the scripts.

}
foundRoles := true
for _, role := range params.Roles {
if !util.RoleContains(role, validParams.Roles) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should be more clear. Given my understanding of the design, this is probably a bug. As per the design, the config and the params should have the exact same roles. This is a case where there are different roles and yet the match still happens.

}
foundRoles := true
for _, role := range params.Roles {
if !util.RoleContains(role, validParams.Roles) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this work is validParams.Roles was Role1, Role2 and params.Roles was Role1, Role1? There should be validation against this at higher levels, but possibly also guard against this here.


func rolesToMap(roles []clustercommon.MachineRole) map[clustercommon.MachineRole]int {
rolesMap := map[clustercommon.MachineRole]int{}
for _, role := range roles {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Are we expecting multiple of the same role specified? Seems like a bug to have the same role multiple times. Perhaps we should have a check for it (not in this PR) in the apiserver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it would be a bug or user error if there are duplicate roles. I asked Kris about it yesterday and he said we can add a check for it elsewhere later on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why roles was made a slice in the first place. @krousey

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others expressed interest in having more than Node and Master roles and also having machines serve multiple roles.

if params.Versions != validParams.Versions {
continue
}
return &conf, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns the first match. What is the plan for making sure there are not multiple matches available in the config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to check through all of them and return an error if multiple matches are found? And just print out the params so the user can find where the duplicates are in their configs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds Good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the check and added a test case for it.

Copy link
Contributor

@karan karan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits among other things.

@@ -147,6 +149,7 @@ spec:
args:
- --kubeconfig=/etc/kubernetes/admin.conf
- --token={{ .Token }}
- --config=/etc/machinesetup/machine_setup_configs.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call it this something other than config please. There are way too many things named config. Be specific here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named it machinesetup to be consistent with the flag in create.go.

@@ -27,5 +27,5 @@ type GCEProviderConfig struct {
Project string `json:"project"`
Zone string `json:"zone"`
MachineType string `json:"machineType"`
Image string `json:"image"`
OS string `json:"os"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this OS image or OS name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be the name, it gets mapped to the image in the machine setup configs. I added a comment.

@@ -27,5 +27,5 @@ type GCEProviderConfig struct {
Project string `json:"project"`
Zone string `json:"zone"`
MachineType string `json:"machineType"`
Image string `json:"image"`
OS string `json:"os"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@@ -52,6 +56,8 @@ const (

UIDLabelKey = "machine-crd-uid"
BootstrapLabelKey = "boostrap"

MachineSetupConfigsFilename = "machine_setup_configs.yaml"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment about what this file contains and in what format.

@@ -653,42 +687,29 @@ func (gce *GCEClient) handleMachineError(machine *clusterv1.Machine, err *apierr
return err
}

func (gce *GCEClient) getImage(machine *clusterv1.Machine, config *gceconfig.GCEProviderConfig) (image string, isPreloaded bool) {
func (gce *GCEClient) getImagePath(img string) (imagePath string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great method for testing with unit tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but it makes a call to the GCE API to check that the image is there and I was having trouble figuring out how to mock it (otherwise I would have added the test when I refactored this method in a previous PR). Rob is working on a PR (#688) to set up a fake client so I can write a unit test for this method once that gets merged.

clusterv1 "k8s.io/kube-deploy/cluster-api/pkg/apis/cluster/v1alpha1"
)

type ConfigWatch struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document this please

path string
}

type ValidConfigs struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document this please

// map to the given Image and Metadata.
Params []ConfigParams `json:"machineParams"`

// The fully specified image path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that include https://www.googleapis.com/compute/beta/projects/ubuntu-os-cloud/global/images/ubuntu-1604-xenial along with /projects/ubuntu-os-cloud/global/images/ubuntu-1604-xenial-v20180405?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's whatever will be found by the Google cloud images.get or images.getFromFamily API calls, so I think it would have to be ubuntu-1604-xenial-v20180405. I added a comment with examples. If the image can't be found in the project, it will default to projects/ubuntu-os-cloud/global/images/family/ubuntu-1710 (but that has to be changed to ubuntu-1604-lts in a later PR).


func rolesToMap(roles []clustercommon.MachineRole) map[clustercommon.MachineRole]int {
rolesMap := map[clustercommon.MachineRole]int{}
for _, role := range roles {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why roles was made a slice in the first place. @krousey

if params.Versions != validParams.Versions {
continue
}
return &conf, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds Good.

Copy link
Contributor

@krousey krousey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the documentation additions requested, this looks good to me.


func rolesToMap(roles []clustercommon.MachineRole) map[clustercommon.MachineRole]int {
rolesMap := map[clustercommon.MachineRole]int{}
for _, role := range roles {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others expressed interest in having more than Node and Master roles and also having machines serve multiple roles.

@@ -0,0 +1,379 @@
items:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this file isn't defined as a ConfigMap but just a YAML list of things? To someone just browsing the code, it would not be obvious that these key=value pairs are consumed as a ConfigMap is the cluster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified a few things with Kris offline. So in this case, the configmap is just the mechanism for getting the yaml file mounted into the pod as a volume, and the controller consumes the yaml file, not the configmap object. We don't want to work with the actual configmap API object in case the machine controller is ever run on a non-kubernetes cluster. If we changed the yaml file to be configmap keys and values where the key is a string constructed from the defining params, each of the keys would become a separate file and we'd have to pull down all the files from the directory and key off of a filename, so it's easier to just work with the single file.

If someone did want to edit the machine setup configs yaml while a cluster is running, they would do it through the machine-setup configmap, so in that sense it's not obvious that that's how you'd update it. I can update the readme with documentation on what the yaml file is and how you'd go about updating the yaml in a running cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to work with the actual configmap API object in case the machine controller is ever run on a non-kubernetes cluster

Are we supporting that use case? When can the controller run if not in a k8s cluster (even bootstrapping will happen in a K8s cluster.

it's easier to just work with the single file.

That's fine. We can have everything in one file still. That's not my concern. My concern is that we are hiding this from the user and it's not obvious to them how to update these values. If this was using the ConfigMap object itself, they could simply kubectl apply it rather than mucking with documentation.

So to summarize - I'm not saying we need to split things, but rather the abstraction of ConfigMap be pulled up in this spec file rather than deep in code somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can the controller run if not in a k8s cluster (even bootstrapping will happen in a K8s cluster.

Right now the gcp deployer is bootstrapping outside of a k8s cluster. In the future when we have real bootstrapping we can switch over to setting up a configmap from the start, but the machinesetup library is set up to handle a yaml file. If the APIServer ever goes down, we don't want to rely on the configmap object and instead work with the volume mounted yaml file.

Changing the yaml to a configmap definition would add a lot of extra logic to the gcp deployer since it would have to serialize and write out the contents to a file so the library can consume it. The gcp deployer is also just temporary until we have actual bootstrapping set up, so I don't think it's worth adding all the extra logic when it'll be replaced anyway.

it's not obvious to them how to update these values. If this was using the ConfigMap object itself, they could simply kubectl apply it rather than mucking with documentation.

Once the cluster is created and running, they can still use kubectl apply to update the configmap. If they want to edit the values before creating the cluster, they'd edit the yaml file the same way they can edit the machines or cluster yaml. Again, we can move to transparently setting up configmap in the future with real bootstrapping and it would be more obvious how to update, but I think for now just documenting what I said above would be enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree about the extra logic concern you have. Currently the matching logic and the mounting logic are complex and opaque to the user. If anything, it will simplify the configmap a whole lot more. But I don't want to block this PR on that.

Please add your explanation as a comment and a TODO with an issue as a future improvement.

Copy link
Contributor Author

@kcoronado kcoronado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed all the comments except for the unit tests for getImagePath() which is currently blocked.

@@ -147,6 +149,7 @@ spec:
args:
- --kubeconfig=/etc/kubernetes/admin.conf
- --token={{ .Token }}
- --config=/etc/machinesetup/machine_setup_configs.yaml
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named it machinesetup to be consistent with the flag in create.go.

@@ -27,5 +27,5 @@ type GCEProviderConfig struct {
Project string `json:"project"`
Zone string `json:"zone"`
MachineType string `json:"machineType"`
Image string `json:"image"`
OS string `json:"os"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be the name, it gets mapped to the image in the machine setup configs. I added a comment.

// map to the given Image and Metadata.
Params []ConfigParams `json:"machineParams"`

// The fully specified image path.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's whatever will be found by the Google cloud images.get or images.getFromFamily API calls, so I think it would have to be ubuntu-1604-xenial-v20180405. I added a comment with examples. If the image can't be found in the project, it will default to projects/ubuntu-os-cloud/global/images/family/ubuntu-1710 (but that has to be changed to ubuntu-1604-lts in a later PR).

if params.Versions != validParams.Versions {
continue
}
return &conf, nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the check and added a test case for it.

@@ -653,42 +687,29 @@ func (gce *GCEClient) handleMachineError(machine *clusterv1.Machine, err *apierr
return err
}

func (gce *GCEClient) getImage(machine *clusterv1.Machine, config *gceconfig.GCEProviderConfig) (image string, isPreloaded bool) {
func (gce *GCEClient) getImagePath(img string) (imagePath string) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but it makes a call to the GCE API to check that the image is there and I was having trouble figuring out how to mock it (otherwise I would have added the test when I refactored this method in a previous PR). Rob is working on a PR (#688) to set up a fake client so I can write a unit test for this method once that gets merged.

@@ -0,0 +1,379 @@
items:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified a few things with Kris offline. So in this case, the configmap is just the mechanism for getting the yaml file mounted into the pod as a volume, and the controller consumes the yaml file, not the configmap object. We don't want to work with the actual configmap API object in case the machine controller is ever run on a non-kubernetes cluster. If we changed the yaml file to be configmap keys and values where the key is a string constructed from the defining params, each of the keys would become a separate file and we'd have to pull down all the files from the directory and key off of a filename, so it's easier to just work with the single file.

If someone did want to edit the machine setup configs yaml while a cluster is running, they would do it through the machine-setup configmap, so in that sense it's not obvious that that's how you'd update it. I can update the readme with documentation on what the yaml file is and how you'd go about updating the yaml in a running cluster.

Copy link
Contributor

@jessicaochen jessicaochen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jessicaochen
Copy link
Contributor

@krousey - Why does this change need migration. It seems to only touch cloud specific stuff (ie. stuff under cloud and the deployer)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-migration size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants