Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd pod spec should include requests for memory, cpu, etc #2195

Closed
bboreham opened this issue Jun 23, 2020 · 11 comments · Fixed by kubernetes/kubernetes#94479
Closed

etcd pod spec should include requests for memory, cpu, etc #2195

bboreham opened this issue Jun 23, 2020 · 11 comments · Fixed by kubernetes/kubernetes#94479
Assignees
Labels
area/etcd help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.
Milestone

Comments

@bboreham
Copy link

bboreham commented Jun 23, 2020

FEATURE REQUEST

The pod spec for etcd does not include any requests:

https://github.com/kubernetes/kubernetes/blob/db8a88721e587cbc54e7081cb88374e253cde4ed/cmd/kubeadm/app/phases/etcd/local.go#L222-L232

This has the following consequences:

  • etcd is not included in kubelet's bookkeeping of how much should run on this node
  • when machine CPU is at 100%, etcd will be throttled to maximum 0.2%
  • when machine is low on memory, etcd will be a candidate to terminate (Not in 1.19+)

kubeadm should add requests for cpu, memory and ephemeral_storage.

It is impossible to pick "correct" values for a component that needs more resources as your cluster gets bigger, but I submit that we can do better by default than zero.

@luxas
Copy link
Member

luxas commented Jun 23, 2020

Thanks for noticing. I think we could set it to something sensible by default, that would fit on a default single-master install of 2 (v)CPUs. If we want to get it into the v1.19 release, I could help doing that code-wise.

WDYT @fabriziopandini @neolit123?

@luxas luxas added area/etcd kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Jun 23, 2020
@neolit123
Copy link
Member

neolit123 commented Jun 23, 2020

kubeadm should add requests for cpu, memory and ephemeral_storage.

what values are you proposing?

when machine is low on memory, etcd will be a candidate to terminate

even with a system-node-critical priority class which is the default in 1.181.19?

@bboreham
Copy link
Author

even with a system-node-critical priority class which is the default in 1.18?

I confess I had missed that (still can't see it in the code, but I'm sure you're right).

And you're also right that critical pods bypass the effect of requests:
https://github.com/kubernetes/kubernetes/blob/da37fcd02d802cd348e68a84e16e254e7d9f3f12/pkg/kubelet/qos/policy.go#L41-L44

Edited description.

@neolit123
Copy link
Member

I confess I had missed that (still can't see it in the code, but I'm sure you're right)

actually, i'm mistaken. it is only in master (1.19):
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/staticpod/utils.go#L67
it was agreed that we can backport this to older versions if we see user requests.
so let me know if you'd like to see it in older versions.

etcd is not included in kubelet's bookkeeping of how much should run on this node
when machine CPU is at 100%, etcd will be throttled to maximum 0.2%

these depend on the cluster size, but i think we can at least set the CPU request for the pod to something like 200m.
i can see our GCE e2e tests are doing that, but they are not setting memory and storage.

@RA489
Copy link
Contributor

RA489 commented Jul 6, 2020

what values are you proposing?

I would like to propose CPU request for the pod to atleast 200m and memory to 100Mi and storage to 1Gi.
@neolit123 wdyt?

@neolit123
Copy link
Member

for CPU 200m seems OK, but for memory and storage i think we should leave it to the user to customize.

@bboreham
Copy link
Author

bboreham commented Jul 6, 2020

Fine that the user can change the setting, but it's sensible to start with some non-zero value.

I would go with 100Mi memory and 100Mi disk.
20% of a CPU seems a lot - in a couple of active clusters I own, etcd CPU is around 3%. But it will vary with what you do.
Say 100m cpu?

@neolit123
Copy link
Member

neolit123 commented Jul 6, 2020

one option is to gather more comments either from a SIG mailing list discussion and / or during the next SIG CL meeting.
(given other projects consume kubeadm too).

@kubernetes/sig-cluster-lifecycle

@neolit123
Copy link
Member

ok, it doesn't seem we had any more replies here so let's proceed with @bboreham 's suggestion:

mini-guide if a contributor wants to pick this up:

modify this function to support cpu, memory, disk *string string pointer arguments:
https://github.com/kubernetes/kubernetes/blob/8da7c92a2f046d5700a0a3dec56f133156c29afa/cmd/kubeadm/app/util/staticpod/utils.go#L77
if a string pointer is nil the function should not apply the resource.

update usage of ComponentResources() in the code base to pass utilptr.StringPtr(...)...nil, nil where needed.

make this function:
https://github.com/kubernetes/kubernetes/blob/8da7c92a2f046d5700a0a3dec56f133156c29afa/cmd/kubeadm/app/phases/etcd/local.go#L251

include:

Resources: staticpodutil.ComponentResources(utilptr.StringPtr("100m"), utilptr.StringPtr("100Mi"), utilptr.StringPtr("100Mi")),`

where utilptr is imported as utilptr "k8s.io/utils/pointer"

update unit tests:
e.g. etcd/local_test.go

run:

./hack/update_bazel.sh
./hack/update_gofmt.sh

send the PR.
thank you

/help

@k8s-ci-robot
Copy link
Contributor

@neolit123:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

ok, it doesn't seem we had any more replies here so let's proceed with @bboreham 's suggestion:

mini-guide if a contributor wants to pick this up:

modify this function to support cpu, memory, disk *string string pointer arguments:
https://github.com/kubernetes/kubernetes/blob/8da7c92a2f046d5700a0a3dec56f133156c29afa/cmd/kubeadm/app/util/staticpod/utils.go#L77
if a string pointer is nil the function should not apply the resource.

update usage of ComponentResources() in the code base to pass utilptr.StringPtr(...)...nil, nil where needed.

make this function:
https://github.com/kubernetes/kubernetes/blob/8da7c92a2f046d5700a0a3dec56f133156c29afa/cmd/kubeadm/app/phases/etcd/local.go#L251

include:

Resources: staticpodutil.ComponentResources(utilptr.StringPtr("100m"), utilptr.StringPtr("100Mi"), utilptr.StringPtr("100Mi")),`

where utilptr is imported as utilptr "k8s.io/utils/pointer"

update unit tests:
e.g. etcd/local_test.go

run:

./hack/update_bazel.sh
./hack/update_gofmt.sh

send the PR.
thank you

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Sep 3, 2020
@neolit123 neolit123 added this to the v1.20 milestone Sep 3, 2020
@knight42
Copy link
Member

knight42 commented Sep 3, 2020

/assign

I would like to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/etcd help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants