Storage perf tests decrease volumes and pod number #686
Conversation
Hi @mucahitkurt. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
IIUC:
I'm very surprised that the 2nd case is faster than the multiple pods case.
I just looked at the results at http://perf-dash.k8s.io/#/?jobname=storage-max-emptydir-vol-per-pod&metriccategoryname=E2E&metricname=PodStartup&Metric=pod_startup
So those numbers above are for configmap and secret volume types. Downward API and emptydir are closer to:
@@ -1,2 +1,2 @@
-TOTAL_PODS: 100
+TOTAL_PODS: 25
should we have different configs for different numbers of volumes/pods, so we can compare side by side?
Yes, that would be better; it needs additional override files and job definitions. I can do it if it won't cause any side effects for the test infra. cc @wojtek-t
Actually I think total pods = 100 should stay, because it's roughly how many pods per node we support.
This test case is fairly useful in my opinion.
If we want to test 25 too, this should be a separate config.
Agree, I mainly want to test lower numbers for debugging purposes and find the threshold where pod startup latency starts to fall off the cliff.
@msau42 Sorry, I read it wrong for the downward API and emptyDir; they are under 5 seconds for the 100-volumes-per-pod case and the tests are green. I edited the description.
@@ -1,2 +1,2 @@
-TOTAL_PODS: 100
+TOTAL_PODS: 25
Actually I think total pods = 100 should stay, because it's roughly how many pods per node we support.
This test case is fairly useful in my opinion.
If we want to test 25 too, this should be a separate config.
@@ -1,2 +1,2 @@
 TOTAL_PODS: 1
-VOLUMES_PER_POD: 100
+VOLUMES_PER_POD: 10
I don't have a strong opinion about this one. I would like to avoid having dozens of jobs and growing this matrix of scenarios, but I don't know if we should have 10 or 100 here.
@msau42 - WDYT?
Realistically I don't think a single pod will have 100 volumes. 5-10 seems more realistic, but the initial purpose of this experiment is to find our current scaling thresholds.
I think emptydir and downward API volumes performance is acceptable. Unsure about secrets and configmaps. Again I want to find the threshold where performance starts to fall off.
> I think emptydir and downward API volumes performance is acceptable. Unsure about secrets and configmaps. Again I want to find the threshold where performance starts to fall off.
I think at least part of the problem is in QPS limits - we need to GET each secret/configmap from the apiserver before we even start the pod. So with a 5 qps limit, 100 volumes alone means 20s at start.
I think what we should seriously consider (at least as an experiment) is to bump it to something large (say 100+) and see what happens. I would even say that it's probably the first thing to do.
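(For illustration, a quick back-of-the-envelope sketch of the qps arithmetic above; the 5/10/100 qps values are the ones discussed in this thread, and the helper function is purely hypothetical.)

```python
# Rough lower bound on pod startup delay caused solely by the kubelet's
# kube-api-qps limit: each secret/configmap volume needs one GET from the
# apiserver before the pod can start.
def startup_floor_seconds(volumes_per_pod: int, kube_api_qps: float) -> float:
    return volumes_per_pod / kube_api_qps

for qps in (5, 10, 100):
    print(f"100 volumes at {qps} qps -> at least {startup_floor_seconds(100, qps):g}s")
# 5 qps   -> at least 20s (consistent with the ~22s configmap/secret numbers above)
# 100 qps -> at least 1s
```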
Is it meaningful to include the wait time for configmap and secret volumes to become available in the pod startup time? Should we add some barrier to wait until they are available before starting pod creation?
@mucahitkurt this initial data is very useful! As we start to change/add new thresholds, can we keep a spreadsheet that summarizes the results?
+100
I'm looking into how I can do this
Force-pushed from 2b06b07 to 672bfd7
For the test results data, I wrote a script like this and fetched the test results for ephemeral volume pod startup latencies. I created a JSON file for the test results and added it to the PR; I thought we can use it easily for further analysis. Its size is 1MB, I'm not sure whether this is a problem or not. I created a different override file for the 25_pods_per_node case. I'm not sure about the final decision on VOLUMES_PER_POD.
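(The script itself isn't shown above; below is a minimal sketch of that kind of fetch, assuming the public kubernetes-jenkins GCS bucket layout visible in the artifact links in this thread. The measurement file name is an assumption, not taken from the PR.)

```python
# Sketch: download one build's pod-startup measurement from the public
# kubernetes-jenkins GCS bucket and dump it. JOB and BUILD come from the
# artifact link quoted in this thread; ARTIFACT is an assumed file name.
import json
import urllib.request

JOB = "ci-kubernetes-storage-scalability-max-configmap-vol-per-pod"
BUILD = "1156085407868260353"
ARTIFACT = "PodStartupLatency_PodStartupLatency.json"  # assumption: adjust to the real name

url = ("https://storage.googleapis.com/kubernetes-jenkins/logs/"
       f"{JOB}/{BUILD}/artifacts/{ARTIFACT}")

with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

# clusterloader2 measurements generally report perc50/perc90/perc99 values;
# print the raw document and pick them out as needed.
print(json.dumps(data, indent=2))
```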
Force-pushed from 672bfd7 to fd6d4fd
number-of-node: 1
volume-types:
volume-type: configmap
before-qps-limit-bump: true
instead of "before-qps-limit-bump" I would add specific value for qps-limit (default is 5, I bumped to 100)
The reason is that we may want to experiment with other values too.
100 or 10?
perc50: 22240.349315
perc90: 22240.349315
perc99: 22240.349315
test-artifacts: https://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-storage-scalability-max-configmap-vol-per-pod/1156085407868260353/artifacts/
This one is tricky - because this dir (depending on the time) contains results from before the qps bump, from after it, and (once new changes appear) from additional configurations.
So technically it would be more correct to move it to the level of "volume-type".
So what I have in mind is this structure:
volume-types:
- volume type: configmap
  test-artifacts: ...
  results:
    param1: value1 (e.g. qps)
    param2: value2
    pod-startup:
Do you think the test-artifacts link should point to all test run artifacts for the volume type? Actually, I take a specific test run for the percentiles, because it seems that most of them have almost similar results, and the artifact link points to that build, i.e. a specific test run.
So if it's not correct to take one test run instead of an average of many, can it be like below?
- volume type: configmap
  results:
    test-artifacts: ...
    param1: value1 (e.g. qps)
    param2: value2
    pod-startup:
I think I got it wrong. I'll take the volume type as root and add different test scenario results under it, like below:
pod-with-volumes-startup-latency-scenarios:
  volume-type: configmap
  test-artifacts: https://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-storage-scalability-max-configmap-vol-per-pod/
  results:
    scenario-1:
      number-of-volumes-per-pod: 100
      number-of-pods-per-node: 1
      number-of-node: 1
      qps-limit: 5
      pod-startup:
        perc50: 22240.349315
        perc90: 22240.349315
        perc99: 22240.349315
    scenario-2:
      number-of-volumes-per-pod: 100
      number-of-pods-per-node: 1
      number-of-node: 1
      qps-limit: 10
      pod-startup:
        perc50: 2205.065107
        perc90: 2205.065107
        perc99: 2205.065107
Yes - that's exactly what I had in mind - sorry for not being clear
Force-pushed from fd6d4fd to 753a6a6
number-of-volumes-per-pod: 100
number-of-pods-per-node: 1
number-of-node: 1
qps-limit: 10
100
Let me clarify, because in fact I was changing two different things separately:
- first I bumped the kube-api-qps limit in kubelet from 5 to 100: Bump qps limits for single-node tests test-infra#13679
- then I reduced the test's pod creation rate from ~100 pods/s to 10 pods/s: Create pods slower in storage tests #720
These two things are mixed together in this file.
@@ -0,0 +1,170 @@
+pod-with-volumes-startup-latency-scenarios:
+volume-type: configmap
nit: is this valid yaml?
I think that for slices/arrays you have to have "-" in yaml. I mean:
- volume-type
[not that this is particularly important]
yes, that's not valid, I'll try to create a valid one, thanks for the heads up!
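(A quick way to sanity-check the results file is to run it through a YAML parser; a minimal sketch, assuming PyYAML is installed and using a placeholder file name:)

```python
# Sketch: parse the results file with PyYAML to confirm it is valid YAML.
# "results.yaml" is a placeholder path, not the actual file name in this PR.
import sys
import yaml

path = sys.argv[1] if len(sys.argv) > 1 else "results.yaml"
try:
    with open(path) as f:
        data = yaml.safe_load(f)
except yaml.YAMLError as err:
    sys.exit(f"invalid YAML in {path}: {err}")

print(f"{path} parsed OK; top-level type: {type(data).__name__}")
```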
Force-pushed from 753a6a6 to 9754eef
- number-of-volumes-per-pod: 1
  number-of-pods-per-node: 100
  number-of-node: 1
  kube-api-qps-limit: 100
We also switched from generating pods at ~100 pods/s to 10 pods/s.
I think this should also be captured in this file.
agree, I added that info as qps-pod-throughput
Force-pushed from 9754eef to 34aa066
@@ -0,0 +1,3 @@
+PODS_PER_NODE: 25
please also add TEST_NAME (as it is here):
https://github.com/kubernetes/perf-tests/blob/master/clusterloader2/testing/experimental/storage/pod-startup/max_volumes_per_node/override.yaml
Added
One last comment - lgtm once applied, thanks!
…s, add test results json Signed-off-by: Mucahit Kurt <[email protected]>
Force-pushed from 34aa066 to 8f4d242
@wojtek-t Should I wait for the test-suite implementation to add this test case to the periodic jobs? Otherwise I'd need to define 4 additional jobs (I think the PVC side is not activated yet).
Let's wait - I think we're pretty close.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mucahitkurt, wojtek-t
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
We want to run our experimental tests with the new volume and pod numbers, and compare with the current test cases' pod_startup time results.
Current p90 pod_startup time ranges between 70 and 90 seconds for the max-volumes-per-node case.
Current p90 pod_startup time ranges between 22 and 24 seconds for configMap and secret, and between 2.5 and 3.75 seconds for emptyDir and Downward API, for the max-volumes-per-pod case.
/sig storage
cc @wojtek-t
cc @msau42