
add persistent volume scale tests #657

Merged
merged 1 commit into kubernetes:master from hantaowang:pv-storage-tests on Jul 30, 2019

Conversation

hantaowang

@hantaowang hantaowang commented Jul 19, 2019

Experimental storage tests that measure the startup latency of pods booting with PVs. I tried to stay as close to @msau42's KEP as possible.

This PR depends on functionality from #656. I have it open so we can parallelize the review for both.

1_node_scale_by_pod
This test runs on 1 node, 100 pods per node, and 1 PV per pod. This stresses # of PVs per node and # of pods per node.

1_node_scale_by_pvcs_per_pod
This test runs on 1 node, 1 pod per node, and 100 PVs per pod. This stresses # of PVs per node and # of PVs per pod.

cluster_load_scale_by_nodes
This test runs on X nodes (X from 1 to 5000+). It has a fixed 10 pods per node and 1 PV per pod. This test simulates a realistic large cluster with a normal amount of PVs per node.
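
For illustration, each variant would be driven by a small overrides file layered on a shared config; a hypothetical sketch for 1_node_scale_by_pod, using the DefaultParam variable names quoted in the review below (PODS_PER_NODE is an assumed key name, not taken from the merged file):

# 1_node_scale_by_pod overrides (sketch, not the merged file)
PODS_PER_NODE: 100      # assumed key name
PVC_PER_POD: 1
PVC_SIZE: "8Gi"
POD_STARTUP_SLO: 300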

TODO (In future PR):

  • Implement the ability to boot the GCE PD CSI Driver to use as the storage backend.
  • Write a test that boots a large cluster with a large number of PVs, and then boots a few more pods with PVs in this saturated cluster.

/assign @wojtek-t @davidz627
cc @msau42 @jingxu97 @verult

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 19, 2019
@k8s-ci-robot
Contributor

Hi @hantaowang. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 19, 2019
@k8s-ci-robot k8s-ci-robot requested review from oxddr and wojtek-t July 19, 2019 00:15
@davidz627

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 19, 2019
@hantaowang
Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 19, 2019
@hantaowang
Author

Which tuning set do you recommend? And for the parallelism-limited tuning set, what parallelism limit should I set? I wonder if setting it equal to the number of namespaces is reasonable.
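
For context, a parallelism-limited tuning set in clusterloader2 is declared roughly as below; this is only a sketch of the idea floated above (a limit equal to the number of namespaces), and the $namespaces variable name is assumed:

tuningSets:
- name: ParallelismLimited
  parallelismLimitedLoad:
    parallelismLimit: {{$namespaces}}  # set equal to the number of namespaces, per the question above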

@davidz627

/assign @msau42

{{$PVC_PER_POD := DefaultParam .PVC_PER_POD 2}}
{{$ENABLE_CHAOSMONKEY := DefaultParam .ENABLE_CHAOSMONKEY false}}
{{$TEST_NAME := DefaultParam .TEST_NAME "generic-test"}}
{{$PVC_SIZE := DefaultParam .PVC_SIZE "8Gi"}}


Why did we pick 8Gi?

Author

It's a nice factor of 2 :)

Honestly, there is no particular reason behind it other than it seeming like a reasonable default for a single mid-sized PV.

{{$TEST_NAME := DefaultParam .TEST_NAME "generic-test"}}
{{$PVC_SIZE := DefaultParam .PVC_SIZE "8Gi"}}
{{$PER_PVC_BOUND_SLO := DefaultParam .PVC_BOUND_SLO 2}}
{{$POD_STARTUP_SLO := DefaultParam .POD_STARTUP_SLO 100}}


Why default to 100 and then set it to 300 in every test?

Author

I changed it, but this number is not final until I figure out a reasonable SLO for each test. I override it to 300 because that seemed like a reasonable upper bound for testing purposes.

{{$ENABLE_CHAOSMONKEY := DefaultParam .ENABLE_CHAOSMONKEY false}}
{{$TEST_NAME := DefaultParam .TEST_NAME "generic-test"}}
{{$PVC_SIZE := DefaultParam .PVC_SIZE "8Gi"}}
{{$PER_PVC_BOUND_SLO := DefaultParam .PVC_BOUND_SLO 2}}


Is this in seconds? 2 seconds for binding seems low. I'm not sure whether this includes provisioning time, though.

Author
@hantaowang hantaowang Jul 23, 2019

Provisioning is done in parallel to a degree. 2 seconds per PVC is low if you are only provisioning a few, which is why I use a PVCBoundTime of at least 60 seconds.


Sorry, I'm still confused about the meaning of this parameter and why it is set to 2.

Author

It just means that, because the backend can only provision a certain number of PVs at a time and is therefore partly serial, I set the per-PV provision SLO at 2 seconds. So 100 PVs have a total SLO of 2 * 100 = 200 seconds.
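
That arithmetic could be expressed directly in the config; a minimal sketch, assuming clusterloader2's MultiplyInt and MaxInt template helpers are available, and using $PER_PVC_BOUND_SLO and $PVC_PER_POD from the snippets quoted in this review ($PODS_PER_NODE, $totalPVCs, and $pvcBoundTimeout are assumed names):

{{$totalPVCs := MultiplyInt $PODS_PER_NODE $PVC_PER_POD}}
# 60 second floor for small counts, otherwise per-PVC budget times the PVC count (e.g. 2 * 100 = 200s)
{{$pvcBoundTimeout := MaxInt 60 (MultiplyInt $totalPVCs $PER_PVC_BOUND_SLO)}}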

Member

I think 2 seconds is too fast, especially while we're in an experimental phase and are testing our limits. Let's do something like 30 seconds?

We're also limited by the QPS of the k8s API server and the GCE API.

templateFillMap:
PVCsPerPod: {{$PVC_PER_POD}}
- measurements:
- Identifier: WaitForRunningPVCOwningPods


naming nit: s/PVCOwningPods/PodsWithPVC?

Author

ack

Params:
action: start
nodeMode: {{$NODE_MODE}}
# Create PVCs
Member

I'm wondering if we really need this file.
This test is extremely similar to: https://github.com/kubernetes/perf-tests/blob/master/clusterloader2/testing/experimental/storage/pod-startup/ephemeral-volumes/cluster-wide/config.yaml

The only difference is that instead of creating ephemeral storage we create PVCs and wait for them.
But it should be possible to unify them (if you wait for PVCs in the ephemeral-storage scenario, it will be a no-op anyway).

So instead of having two separate files to maintain, I would much rather unify the two - I don't think the differences justify maintaining a separate one.

@mucahitkurt - as the one who added the test I mentioned above.

Author

I would be happy to merge the two configs (in a separate PR). There definitely are similarities, although I think the PV config will be a little more involved due to WaitForBoundPVCs and such.

Member

I think the WaitForBoundPVCs is the only difference. And just adding it won't break ephemeral volumes, because there simply won't be anything to wait for.
So unless I'm missing something, I'm against having two separate configs and merging them later - we should just do the right thing from the beginning.
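
A rough sketch of that unification as a guarded step in the shared config; WAIT_FOR_PVC and WaitForBoundPVCs are names that appear elsewhere in this review, while the Params shown and the $PVC_BOUND_TIMEOUT variable are assumptions, not the merged file:

{{if $WAIT_FOR_PVC}}
- measurements:
  - Identifier: WaitForBoundPVCs
    Method: WaitForBoundPVCs
    Params:
      labelSelector: group = volume      # assumed selector
      timeout: {{$PVC_BOUND_TIMEOUT}}s   # assumed variable name
{{end}}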

Author

I would much prefer the base config be the one I wrote. There is a difference in that my config has an option to scale automatically based on # of nodes and overrides is used for different test variations (defined by the KEP), whereas @mucahitkurt’s work operates similar to configs for load and density where overrides is used to define the size of the cluster.

It's OK IMO to not merge them yet, but it would take some time to work together and refactor how the configs work. Ideally, it should be the same base, with maybe a flag for PV vs ephemeral.

Contributor
@mucahitkurt mucahitkurt Jul 24, 2019

@mucahitkurt’s work operates similar to configs for load and density where overrides is used to define the size of the cluster.

Sorry if I got it wrong, but I actually don't use overrides for the cluster size. For the 1_node and cluster-wide cases I use different configs under 1_node and cluster-wide; I started with one config but got feedback on the related PR to separate the configs for those cases for the sake of simplicity over duplication. But of course we can consider that the cluster-wide config already covers the 1-node case, as you already did. I think it would be better to use the same convention: one config or two configs.

Contributor

I'm not sure whether it will cause some limitations, but maybe it would be better if we could get rid of the root ephemeral-volumes directory and use the PVC as a different volume type.

Member

I'm not sure whether it will cause some limitations, but maybe it would be better if we could get rid of the root ephemeral-volumes directory and use the PVC as a different volume type.

Yes - we probably should.

I would much prefer the base config be the one I wrote. There is a difference in that my config has an option to scale automatically based on # of nodes and overrides is used for different test variations (defined by the KEP), whereas @mucahitkurt’s work operates similar to configs for load and density where overrides is used to define the size of the cluster.

I'm not trying to say you can't modify @mucahitkurt's config - we definitely need to, and I'm completely fine with modifying things there.
I'm also fine with replacing the one added by @mucahitkurt with this one (though I would need to review it more carefully at that point). I just don't see any reason to have both at this point.

Member

With #688 merged, I think this file is no longer needed - just that one requires changes. Can you please update this PR?

Author

ack

{{$PVC_SIZE := DefaultParam .PVC_SIZE "8Gi"}}
{{$PER_PVC_BOUND_SLO := DefaultParam .PVC_BOUND_SLO 2}}
{{$POD_STARTUP_SLO := DefaultParam .POD_STARTUP_SLO 300}}
{{$OBJ_CREATION_QPS := DefaultParam .OBJ_CREATION_QPS 10}}


What is this for?

Author

OBJ_CREATION_QPS is a configurable variable that controls how fast API objects are created. The default value of 10 QPS means objects are created at 10 queries per second. It is used to configure the UniformQPS tuning set (the component responsible for pacing the creation of k8s objects).

@msau42 @wojtek-t any thoughts on what the most appropriate tuning set to test with is? I am actually not sure whether the QPS applies within a single namespace or across all namespaces.


So it limits how many objects in total are created per second, including PVCs/PVs, Pods, etc.?

Author

Yes, that's right.
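
Concretely, this knob feeds the tuning set definition; a minimal sketch of a UniformQPS tuning set wired to OBJ_CREATION_QPS (the qpsLoad form is my understanding of clusterloader2's config, so treat this as a sketch rather than the merged file):

tuningSets:
- name: UniformQPS
  qpsLoad:
    qps: {{$OBJ_CREATION_QPS}}  # total object-creation rate across PVCs, Pods, etc.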

@hantaowang hantaowang force-pushed the pv-storage-tests branch 2 times, most recently from ac6bbab to e47a113 Compare July 25, 2019 00:58
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 25, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 25, 2019
@hantaowang
Author

Once #656 is merged, please look at #688 so I can rebase this PR to use the same config as the ephemeral tests (although I expect there will be one more PR after #688 to change that config into something that is more of a middle ground between the current one and mine).

@hantaowang
Author

hantaowang commented Jul 25, 2019

rebased and ready for review
/hold cancel
I incorporated some aspects of my original config that I felt were improvements.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2019
Member
@wojtek-t wojtek-t left a comment

High-level comment: we don't want to expose too many knobs - it's just much harder to follow, and if no one will be using them it simply doesn't make sense - we can always extend the test.

{{$ENABLE_CHAOSMONKEY := DefaultParam .ENABLE_CHAOSMONKEY false}}
{{$TEST_NAME := DefaultParam .TEST_NAME "generic-test"}}
{{$GROUP := "pod-with-volumes-startup-latency"}}
{{$TUNING_SET := DefaultParam .TUNING_SET "UniformQPS"}}
Member

Let's remove this - I don't think we want to allow such significant configuration

Author

Most are removed; I left test-name, but I can remove that too.

{{$VOLUMES_PER_POD := .VOLUMES_PER_POD}}
{{$VOLUME_TEMPLATE_PATH := .VOLUME_TEMPLATE_PATH}}
{{$PROVISION_VOLUME := DefaultParam .PROVISION_VOLUME false}}
{{$VOL_SIZE := DefaultParam .VOL_SIZE "8Gi"}}
Member

Do we really want to expose an option for it? I vote for being opinionated and setting it just to 8G.

Author

Spoke with the team, and we want to keep this knob because different storage options have different minimum sizes (provisioning will fail if you try to provision less). Also, different sizes have different provision times.

Zonal PV: min 5G
Regional PV: min 200G
Filestore: min 1T
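
Given those minimums, the knob would be set per backend via overrides; a hypothetical sketch (values taken from the minimums listed above, file split assumed):

# zonal PD overrides (sketch)
VOL_SIZE: "8Gi"
# regional PD overrides (sketch)
VOL_SIZE: "200Gi"
# Filestore overrides (sketch)
VOL_SIZE: "1Ti"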

Member

This discussion also brought up an interesting point. We are not measuring provisioning time because we said that is bootstrapping behavior that only happens the first time.

We also format the filesystem the first time we mount the volume on any node and that is something this test will currently measure during pod startup. Is it something we want to include? If not, then we would need to bring up pods twice and only measure the second time.

Member

Great point - I opened #706 to not block this PR on it.

{{$GROUP := "pod-with-ephemeral-volume-startup-latency"}}
# SLO Variables
{{$POD_STARTUP_SLO := DefaultParam .POD_STARTUP_SLO 300}}
{{$PER_PVC_BOUND_SLO := DefaultParam .PVC_BOUND_SLO 2}}
Member

We don't have any SLO for PVC binding - let's not call it that.
PVC_BINDING_TIME maybe?

The other comment - do we really need it? Can't we hardcode it?

Author

Depending on the PV storage option, hard coding will break the tests. Filestore takes forever to provision.

{{$PARALLELISM_LIMITED := DefaultParam .PARALLELISM_LIMITED 1}}
{{$ENABLE_CHAOSMONKEY := DefaultParam .ENABLE_CHAOSMONKEY false}}
{{$TEST_NAME := DefaultParam .TEST_NAME "generic-test"}}
{{$GROUP := "pod-with-volumes-startup-latency"}}
Member

Let's not introduce unnecessary things.
Please use a shorter name - group = volume seems good enough - and hardcode it where necessary.

Author

ack

{{$appName := .AppName}}
{{$volumesPerPod := .VolumesPerPod}}
apiVersion: v1
kind: Pod
Contributor

I think the same concern about using raw volumes instead of a controller is valid for this one.

Author

can you wait on that PR until this one is merged?

Member

I agree - let's have this one merged and then migrate to deployments all in one shot.


name: pod-with-ephemeral-volume-startup-latency

name: pv-storage-{{$TEST_NAME}}-{{$VOLUME_TYPE}}
Contributor

Since pv doesn't cover ephemeral, it might be better to just start with storage- or volume-.

Author

ack, set to storage-

@hantaowang
Author

will squash later

{{$WAIT_FOR_PVC := DefaultParam .WAIT_FOR_PVC false}}
{{$POD_STARTUP_TIMEOUT := DefaultParam .POD_STARTUP_TIMEOUT "15m"}}
{{$POD_STARTUP_SLO := DefaultParam .POD_STARTUP_SLO 300}}
{{$PVC_BINDING_TIME := DefaultParam .PVC_BOUND_SLO 2}}
Member

Do we really need to expose this?

Author

I can remove this knob for now, but we may need it later if we do other types of PD.

VolumesPerPod: {{$VOLUMES_PER_POD}}
AppName: {{$APP_NAME}}
AppName: Sequence
Member

What is this AppName used for?

Member

Can't we just use Name instead of it?

Author

ack, this shouldn't even be here

{{$appName := .AppName}}
{{$volumesPerPod := .VolumesPerPod}}
apiVersion: v1
kind: Pod
Member

I agree - let's have this one merged and then migrate to deployments all in one shot.

requests:
storage: {{.VolSize}}
persistentVolumeReclaimPolicy: Delete
volumeBindingMode: Immediate
Member

nit: add a trailing newline

Member

Reclaim policy and binding mode are fields on the storageclass, not PVC.

Until we have a way to measure and subtract out provisioning time, a prerequisite of running this test using default storageclass is that the default storageclass needs to have binding mode = immediate.
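
For reference, a minimal StorageClass satisfying that prerequisite could look like the following (the name is hypothetical; the provisioner shown is GCE PD, since that is the backend discussed in this PR):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: immediate-binding-pd   # hypothetical name
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: Immediate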

Author

I removed those two lines and added an assumption to the top.

@hantaowang
Author

@wojtek-t I've squashed the commits because I think this is in a good place to merge now. Please give your final comments or approval.

@hantaowang
Author

/retest

@wojtek-t
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 30, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hantaowang, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 30, 2019
@k8s-ci-robot k8s-ci-robot merged commit c7ed37a into kubernetes:master Jul 30, 2019