Add the ability to configure disk size and type for GCP clusters (Kube Deploy 638) #77

Merged
merged 1 commit into from
May 17, 2018

Conversation

spew
Contributor

@spew spew commented Apr 17, 2018

What this PR does / why we need it:
This PR enables administrators using gcp-deployer to customize the size and types of the disks in their clusters.

Special notes for your reviewer:
Please ignore the first commit, a5bcde4 (Refactor GCEClient: wrap compute.Service in an interface for mocking GCP compute); it is a prerequisite commit and has its own PR open here: #73.

Release note:

Add the ability to customize the size and type of disks of GCP compute nodes.

@kubernetes/kube-deploy-reviewers

@spew spew requested a review from krousey April 17, 2018 17:00
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 17, 2018
@k8s-ci-robot
Contributor

Hi @spew. Thanks for your PR.

I'm waiting for a kubernetes or kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 17, 2018
@krousey
Contributor

krousey commented Apr 17, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 17, 2018
@krousey
Contributor

krousey commented Apr 17, 2018

/test all

@spew spew force-pushed the kube-deploy-638 branch from 44cb101 to 5f6bd2e Compare April 17, 2018 21:31
@spew spew force-pushed the kube-deploy-638 branch from 5f6bd2e to 0b6aaf3 Compare April 30, 2018 17:30
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 30, 2018
@spew
Contributor Author

spew commented Apr 30, 2018

Rebased on top of the latest.

@spew
Contributor Author

spew commented Apr 30, 2018

/retest

@spew
Contributor Author

spew commented Apr 30, 2018

/test all

@spew spew force-pushed the kube-deploy-638 branch from 0b6aaf3 to a383c56 Compare April 30, 2018 17:44
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: spew
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: jessicaochen

Assign the PR to them by writing /assign @jessicaochen in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

for idx, disk := range config.Disks {
	diskSizeGb := disk.InitializeParams.DiskSizeGb
	if diskSizeGb < minDiskSizeGb {
		diskSizeGb = minDiskSizeGb
Contributor

Might want to spit out a warning log that the configuration is bad and you are defaulting to the min.

Contributor Author

Yes, will do.
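
A minimal sketch of the behavior suggested here: clamp to the minimum and log a warning that the configured value was too small. The function name `clampDiskSizeGb` is hypothetical, and the standard `log` package is used in place of the project's logger to keep the example self-contained.

```go
package main

import "log"

// clampDiskSizeGb returns requestedGb unless it is below minimumGb, in which
// case it logs a warning and returns minimumGb. The name is illustrative and
// is not taken from the PR.
func clampDiskSizeGb(requestedGb, minimumGb int64) int64 {
	if requestedGb < minimumGb {
		log.Printf("warning: configured disk size of %d GB is below the minimum; using %d GB instead",
			requestedGb, minimumGb)
		return minimumGb
	}
	return requestedGb
}

func main() {
	log.Println(clampDiskSizeGb(10, 30)) // too small: warning is logged, minimum is used
	log.Println(clampDiskSizeGb(45, 30)) // large enough: value is kept
}
```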

}
return c.mockZoneOperationsGet(project, zone, operation)
}

Contributor

I would suggest creating the cluster and machine in the code instead of parsing yaml files. Parsing is the CLI tool's job and not the machine actuator's job. Plus, it is easier to test more dynamic scenarios when the objects are not hardcoded in a file somewhere.

Contributor Author

There's some complexity here in that "machineactuator.go" expects to decode() the raw bytes of machine.Spec.ProviderConfig. For that reason I was trying to use YAML to set up various scenarios, as it requires fewer changes.

I'm experimenting with some other ideas where the decoding() is decoupled from machineactuator.go. I will likely update the PR with a different approach.
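
One way the suggestion could look, as a rough sketch: build the machine fixture in code and serialize its provider config to raw bytes, so the actuator's decode step still has something to consume. All types and helpers below (`machine`, `providerConfig`, `newTestMachine`, `decodeProviderConfig`) are simplified, hypothetical stand-ins for the real cluster-api types, and JSON is used instead of YAML so that only the standard library is needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical stand-ins for the GCE provider config types.
type diskInitializeParams struct {
	DiskSizeGb int64  `json:"diskSizeGb"`
	DiskType   string `json:"diskType"`
}

type disk struct {
	InitializeParams diskInitializeParams `json:"initializeParams"`
}

type providerConfig struct {
	Disks []disk `json:"disks"`
}

// machine is a hypothetical stand-in for the cluster-api Machine object.
type machine struct {
	Name              string
	RawProviderConfig []byte // the bytes an actuator would decode
}

// newDisk builds one disk entry for a fixture.
func newDisk(sizeGb int64, diskType string) disk {
	return disk{InitializeParams: diskInitializeParams{DiskSizeGb: sizeGb, DiskType: diskType}}
}

// newTestMachine builds a fixture in code and serializes its provider config.
func newTestMachine(name string, disks ...disk) (*machine, error) {
	raw, err := json.Marshal(providerConfig{Disks: disks})
	if err != nil {
		return nil, err
	}
	return &machine{Name: name, RawProviderConfig: raw}, nil
}

// decodeProviderConfig mimics the decode step performed on the raw bytes.
func decodeProviderConfig(raw []byte) (*providerConfig, error) {
	var cfg providerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	m, err := newTestMachine("two-disks", newDisk(30, "pd-standard"), newDisk(45, "pd-ssd"))
	if err != nil {
		panic(err)
	}
	cfg, err := decodeProviderConfig(m.RawProviderConfig)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s has %d disks\n", m.Name, len(cfg.Disks))
}
```

Because each scenario is a function call rather than a file, a test can vary disk counts and sizes without adding new fixtures on disk.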

return c.mockZoneOperationsGet(project, zone, operation)
}

func TestOneDisk(t *testing.T) {
Contributor

I may have just not seen it but is there a test for your logic which will set the min disk size if the given disk size is too small?

Contributor Author

Yeah, we should add this test. I will do so. Note that this is pre-existing behavior as of a few days ago where another commit effectively set the disk size to 30GB at all times.

@@ -30,4 +30,15 @@ type GCEProviderConfig struct {

// The name of the OS to be installed on the machine.
OS string `json:"os"`
Image string `json:"image"`
Contributor

The OS is what eventually maps to the OS Image. Why is this field being introduced in this PR?

Contributor Author

This was added by mistake while resolving a merge conflict. Removing.

Disks []Disk `json:"disks"`
}

type Disk struct {
Contributor

Since you are modifying the GCE provider config, remember to bump all relevant images after merge, e.g. the gce-machine-actuator image.

Contributor Author

Thanks for the reminder -- I was not aware of this.

@@ -183,3 +185,16 @@ func (MachineSchemeFns) DefaultingFunction(o interface{}) {
// set default field values here
log.Printf("Defaulting fields for Machine %s\n", obj.Name)
}

func ParseMachineYaml(file string) (*Machine, error) {
Contributor

This seems awfully specific to the machine actuator tests.

It is assuming that the file contains only one machine object and no other objects. Perhaps it is better to co-locate such logic with the tests and then promote to a common location if other code actually uses it?

Contributor Author

This will be removed in a future revision of this PR.

@spew
Contributor Author

spew commented May 3, 2018

After revisiting this to rebase on top of the latest changes, and attempting to use test fixtures / objects for machine and cluster instead of YAML, I reworked things quite a bit. I have a prerequisite commit / PR open here:

#130

Until that is through, please ignore this PR.

@spew spew force-pushed the kube-deploy-638 branch from a383c56 to 9afe08f Compare May 9, 2018 20:35
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 9, 2018
@spew
Contributor Author

spew commented May 9, 2018

PR 130 is through and committed, so I rebased this commit on top of the latest and it is ready for review again.

@jessicaochen
@krousey

@spew spew force-pushed the kube-deploy-638 branch 2 times, most recently from aae9762 to 0b4e29e Compare May 9, 2018 22:28
Contributor

@jessicaochen jessicaochen left a comment

lgtm (allowing krousey to provide backslash)

for idx, disk := range config.Disks {
	diskSizeGb := disk.InitializeParams.DiskSizeGb
	if diskSizeGb < minDiskSizeGb {
		glog.Info("increasing disk size to %$v gb, the supplied disk size of %v gb is below the minimum", minDiskSizeGb, diskSizeGb)
Contributor

Is the %$v intentional?

Contributor

(nit)

Contributor Author

Not a nit -- definitely a mistake. Will correct as well as rebase on top of the latest changes (they look significant).
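
For reference, a small sketch of the corrected message, using `fmt.Sprintf` from the standard library so it runs on its own. Two things are worth noting about the line under review: `%$v` is not a valid format verb, and glog's `Info` formats its arguments in the manner of `fmt.Print`, so a format string needs the `Infof` variant to be interpreted. The function name `diskSizeMessage` is hypothetical.

```go
package main

import "fmt"

// diskSizeMessage builds the log message with valid %v verbs. With glog, the
// same format string and arguments would be passed to glog.Infof rather than
// glog.Info, since only the f-variant interprets format verbs.
func diskSizeMessage(minDiskSizeGb, diskSizeGb int64) string {
	return fmt.Sprintf(
		"increasing disk size to %v gb, the supplied disk size of %v gb is below the minimum",
		minDiskSizeGb, diskSizeGb)
}

func main() {
	fmt.Println(diskSizeMessage(30, 10))
}
```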

Contributor

Ah. In that case, I will withdraw my approval so the PR dashboard will tell me to take a look after the rebase.

Contributor

@jessicaochen jessicaochen left a comment

Ah. In that case, I will withdraw my approval so the PR dashboard will tell me to take a look after the rebase.

@spew spew force-pushed the kube-deploy-638 branch from 0b4e29e to 6349bd2 Compare May 10, 2018 19:53
@spew
Contributor Author

spew commented May 10, 2018

PR is rebased onto the tip of master.

Contributor

@jessicaochen jessicaochen left a comment

lgtm (allowing krousey to provide backslash)

@spew
Contributor Author

spew commented May 15, 2018

@krousey Have you had a chance to look at this before I click merge?

@krousey
Contributor

krousey commented May 15, 2018

@spew I've been out for a bit. Looking now. Do not click merge.

}

type Disk struct {
InitializeParams DiskInitializeParams `json:"initializeParams"`
Contributor

Why add this level of indirection? Wouldn't

disks:
  - diskSizeGb: 30
    diskType: "pd-standard"
  - diskSizeGb: 50
    diskType: "pd-standard"

work?

Contributor Author

It would, but then it would not match the GCE API structure, which is what I thought we wanted to do. The GCE API has the InitializeParams structure.

https://cloud.google.com/compute/docs/reference/rest/beta/instances/insert

I do agree that what you are proposing is fine too -- let me know if you want to move to that structure.
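
To make the trade-off concrete, here is a hedged sketch of how the nested shape decodes. The `DiskSizeGb` field and the `initializeParams` and `disks` tags come from the diff; the `DiskType` field and its tag are assumed from the YAML above. JSON and `encoding/json` stand in for the project's YAML handling so the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiskInitializeParams mirrors the nested structure of the GCE API.
// The DiskType field is an assumption based on the YAML in this thread.
type DiskInitializeParams struct {
	DiskSizeGb int64  `json:"diskSizeGb"`
	DiskType   string `json:"diskType"`
}

type Disk struct {
	InitializeParams DiskInitializeParams `json:"initializeParams"`
}

type GCEProviderConfig struct {
	Disks []Disk `json:"disks"`
}

// nestedConfig matches the types above; flatConfig is the proposed shape.
const nestedConfig = `{"disks":[
  {"initializeParams":{"diskSizeGb":30,"diskType":"pd-standard"}},
  {"initializeParams":{"diskSizeGb":50,"diskType":"pd-standard"}}]}`

const flatConfig = `{"disks":[{"diskSizeGb":30,"diskType":"pd-standard"}]}`

// parseConfig decodes a provider config document into the nested types.
func parseConfig(raw string) (*GCEProviderConfig, error) {
	var cfg GCEProviderConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	for _, raw := range []string{nestedConfig, flatConfig} {
		cfg, err := parseConfig(raw)
		if err != nil {
			panic(err)
		}
		for i, d := range cfg.Disks {
			fmt.Printf("disk %d: %d GB %q\n", i, d.InitializeParams.DiskSizeGb, d.InitializeParams.DiskType)
		}
	}
}
```

Note that the flat document decodes without error into the nested types but leaves the size and type at their zero values, because `encoding/json` ignores unknown fields by default. Whichever shape is chosen, the config files and the Go types have to change together.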

AutoDelete: true,
Boot: idx == 0,
InitializeParams: &compute.AttachedDiskInitializeParams{
SourceImage: imagePath,
Contributor

I think image path is only for the boot disk.

Contributor Author

While all disks can have an image, this is a good catch in that the intention of this change was not to put the image on disks other than the boot disk! I'll change this and add a test for this scenario.
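
A rough sketch of the intended behavior, in which only the first (boot) disk receives the source image and the minimum size. The `attachedDisk` and `diskSpec` types are trimmed, hypothetical stand-ins for `compute.AttachedDisk` and the provider config, and this `newDisks` signature is simplified from the one in the diff.

```go
package main

import "fmt"

// attachedDisk is a flattened, hypothetical stand-in for compute.AttachedDisk.
type attachedDisk struct {
	AutoDelete  bool
	Boot        bool
	SourceImage string
	DiskSizeGb  int64
	DiskType    string
}

// diskSpec is a hypothetical stand-in for one configured disk.
type diskSpec struct {
	SizeGb int64
	Type   string
}

// newDisks treats the first configured disk as the boot disk. Only the boot
// disk gets the image and has the minimum size applied; other disks are
// created as configured.
func newDisks(specs []diskSpec, imagePath string, minBootDiskSizeGb int64) []attachedDisk {
	disks := make([]attachedDisk, 0, len(specs))
	for idx, spec := range specs {
		d := attachedDisk{
			AutoDelete: true,
			Boot:       idx == 0,
			DiskSizeGb: spec.SizeGb,
			DiskType:   spec.Type,
		}
		if d.Boot {
			d.SourceImage = imagePath
			if d.DiskSizeGb < minBootDiskSizeGb {
				d.DiskSizeGb = minBootDiskSizeGb
			}
		}
		disks = append(disks, d)
	}
	return disks
}

func main() {
	specs := []diskSpec{{SizeGb: 10, Type: "pd-standard"}, {SizeGb: 45, Type: "pd-ssd"}}
	for _, d := range newDisks(specs, "example-image-path", 30) {
		fmt.Printf("%+v\n", d)
	}
}
```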

return &codec, nil
}

func (codec *GCEProviderConfigCodec) DecodeFromProviderConfig(providerConfig clusterv1.ProviderConfig) (*GCEProviderConfig, error) {
Contributor

cc @mkjelland
I think we're going to move to where we have separate config for clusters and machines, and also we're going to have provider status for both of them too. Would it be possible for this API to take a runtime.Object to decode into instead of returning a specific type?

Also, could EncodeToProviderConfig take a runtime.Object as well?

This probably doesn't need to be done in this PR.

Contributor Author

Happy to take that up in a follow-up, more targeted PR.

Contributor

I can take care of this in my PR to create the separate config types. I was thinking it might be nicer to have 2 methods, that way they can return the correct type. Otherwise the caller has to cast the return value to the type they want, right? Or did you have something else in mind @krousey?

Contributor

I was thinking of just having an output parameter of type runtime.Object. That way you could do something like

func (codec *GCEProviderConfigCodec) DecodeProviderConfigInto(providerConfig runtime.RawExtension, out runtime.Object) error {...}

var blah GCEProviderConfig
err := c.DecodeProviderConfigInto(m.Spec.ProviderConfig.Value, &blah)
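
A self-contained sketch of the output-parameter idea, under loud assumptions: `rawExtension` stands in for `runtime.RawExtension`, `interface{}` stands in for `runtime.Object`, `encoding/json` replaces the real codec, and both config types are invented for illustration. It shows a single decode function serving several config types without the caller casting a return value.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// rawExtension is a hypothetical stand-in for runtime.RawExtension.
type rawExtension struct {
	Raw []byte
}

// Two illustrative config types that share one decoder.
type machineConfig struct {
	OS string `json:"os"`
}

type clusterConfig struct {
	Project string `json:"project"`
}

// decodeProviderConfigInto decodes the raw bytes into out, which must be a
// pointer to the concrete type the caller wants.
func decodeProviderConfigInto(providerConfig rawExtension, out interface{}) error {
	if len(providerConfig.Raw) == 0 {
		return errors.New("provider config is empty")
	}
	return json.Unmarshal(providerConfig.Raw, out)
}

func main() {
	var mc machineConfig
	if err := decodeProviderConfigInto(rawExtension{Raw: []byte(`{"os":"ubuntu-1604-lts"}`)}, &mc); err != nil {
		panic(err)
	}
	var cc clusterConfig
	if err := decodeProviderConfigInto(rawExtension{Raw: []byte(`{"project":"example-project"}`)}, &cc); err != nil {
		panic(err)
	}
	fmt.Println(mc.OS, cc.Project)
}
```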

@@ -719,6 +705,28 @@ func (gce *GCEClient) getImagePath(img string) (imagePath string) {
return defaultImg
}

func newDisks(config *gceconfigv1.GCEProviderConfig, zone string, imagePath string, minDiskSizeGb int64) []*compute.AttachedDisk {
Contributor

This min size should eventually be a validation error

Contributor Author

Yeah, the min size was brought in recently when we bumped up the minimum to 30 for custom images, and now for any image (as of a PR a few weeks ago). I was just trying to keep it as it was. I wonder if it should just be removed altogether?
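
If the minimum were turned into a validation error as suggested, the check might look roughly like the sketch below. The function name `validateDiskSize` is hypothetical, and where such validation would actually run is not decided in this thread.

```go
package main

import "fmt"

// validateDiskSize rejects a too-small disk instead of silently resizing it.
// The name and signature are illustrative only.
func validateDiskSize(sizeGb, minimumGb int64) error {
	if sizeGb < minimumGb {
		return fmt.Errorf("disk size %d GB is below the minimum of %d GB", sizeGb, minimumGb)
	}
	return nil
}

func main() {
	fmt.Println(validateDiskSize(10, 30)) // reports the validation error
	fmt.Println(validateDiskSize(45, 30)) // prints <nil>
}
```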

checkDiskValues(t, receivedInstance.Disks[1], false, 45, "pd-standard")
}

func checkInstanceValues(t *testing.T, instance *compute.Instance, diskCount int) {
Contributor

Contributor Author

Nice, that is really useful.

This change adds the ability for GCP clusters to have custom sized disks
and types.
@spew
Contributor Author

spew commented May 15, 2018

Updated following @krousey's feedback. Notably:

  1. Reworked the "is boot disk" logic. Now only set the image and minimum size if it is the boot disk.
  2. Updated the tests to ensure that the image is only set for the boot disk.
  3. Started using the Helper() method on Testing.
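
As an illustration of the third point, the sketch below shows a check helper that calls `Helper()` so that failures are attributed to the calling test line rather than to the helper. To keep it runnable outside `go test`, the helper accepts a small hypothetical interface (`testReporter`) that `*testing.T` satisfies, and a fake `recorder` stands in for the test. The `disk` type is a simplified stand-in for `compute.AttachedDisk`, and the argument order follows the `checkDiskValues` call in the diff.

```go
package main

import "fmt"

// testReporter is the subset of *testing.T that the helper needs.
type testReporter interface {
	Helper()
	Errorf(format string, args ...interface{})
}

// disk is a simplified, hypothetical stand-in for compute.AttachedDisk.
type disk struct {
	Boot   bool
	SizeGb int64
	Type   string
}

// checkDiskValues reports a failure for every field that differs. Calling
// t.Helper() first makes a real *testing.T report the caller's line number.
func checkDiskValues(t testReporter, d disk, boot bool, sizeGb int64, diskType string) {
	t.Helper()
	if d.Boot != boot {
		t.Errorf("boot: got %v, want %v", d.Boot, boot)
	}
	if d.SizeGb != sizeGb {
		t.Errorf("size: got %v, want %v", d.SizeGb, sizeGb)
	}
	if d.Type != diskType {
		t.Errorf("type: got %q, want %q", d.Type, diskType)
	}
}

// recorder is a fake testReporter that collects failure messages.
type recorder struct {
	failures []string
}

func (r *recorder) Helper() {}

func (r *recorder) Errorf(format string, args ...interface{}) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

func main() {
	r := &recorder{}
	d := disk{Boot: false, SizeGb: 45, Type: "pd-standard"}
	checkDiskValues(r, d, false, 45, "pd-standard") // matches: no failures
	checkDiskValues(r, d, true, 45, "pd-standard")  // boot flag differs: one failure
	fmt.Println(len(r.failures), "failure(s):", r.failures)
}
```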

@spew spew force-pushed the kube-deploy-638 branch from 6349bd2 to 30eb6d1 Compare May 15, 2018 20:48
@krousey
Contributor

krousey commented May 16, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 16, 2018
@spew spew merged commit a81a73f into kubernetes-sigs:master May 17, 2018
@spew spew deleted the kube-deploy-638 branch May 17, 2018 03:45
chuckha pushed a commit to chuckha/cluster-api that referenced this pull request Oct 2, 2019
…ker-build

Add verify-docker-build to pre-submit tests
chuckha pushed a commit to chuckha/cluster-api that referenced this pull request Oct 2, 2019
jayunit100 pushed a commit to jayunit100/cluster-api that referenced this pull request Jan 31, 2020
Clusterctl create cluster was not able to properly create the target cluster.  It would sit
in an infinite loop, attempting to clone the VM template.  The fix was to not use annotations
for task ref and vm ref during create.  Instead, use provider status.  Once this fix was in,
the Target cluster would then sit in an infinite loop, attempting to clone the VM template.
The fix was to check for nodeRef in the actuator's Exists() function.

In addition, moved utils from provisioner/common to provisioner/govmomi as it's only ever
used by the govmomi actuator.

Fixes kubernetes-sigs#70
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

6 participants